ChatGPT Lost 63% Trying To Trade Crypto — But One China AI Made A Healthy Profit

OpenAI's ChatGPT lost 63% of its funds in a two-week crypto trading competition organized by Nof1, finishing last among six large language models (LLMs), according to Protos.

AI Bots Test Crypto Trading Skills

The "Alpha Arena" contest, which ended Monday, tasked six leading AI systems with trading digital assets using identical prompts and limited datasets.

ChatGPT, Gemini from Alphabet's (NASDAQ:GOOGL) Google, xAI's Grok, and Anthropic's Claude Sonnet all ended in the red.

By contrast, Alibaba's (NYSE:BABA) Qwen3 Max topped the leaderboard with a $2,232 profit, followed by DeepSeek, which gained $489.

The other four posted steep losses from their $10,000 starting balances: ChatGPT down $6,267, Gemini down $5,671, Grok down $4,531, and Claude down $3,081.
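For readers who want to check the headline figure, the dollar amounts reported above can be converted into percentage returns. The short Python sketch below does only that arithmetic; the dictionary and function names are illustrative, and the inputs are the article's figures, not data from Nof1.

```python
# Starting balance and profit/loss figures as reported in the article (USD).
STARTING_BALANCE = 10_000

pnl_usd = {
    "Qwen3 Max": 2232,
    "DeepSeek": 489,
    "Claude": -3081,
    "Grok": -4531,
    "Gemini": -5671,
    "ChatGPT": -6267,
}


def pct_return(pnl, start=STARTING_BALANCE):
    """Return profit or loss as a percentage of the starting balance."""
    return 100.0 * pnl / start


# Percentage return per model, rounded to one decimal place.
returns = {model: round(pct_return(pnl), 1) for model, pnl in pnl_usd.items()}

# Print a leaderboard, best result first.
for model, r in sorted(returns.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:10s} {r:+.1f}%")
```

On these numbers ChatGPT's loss works out to about 62.7% of its starting balance, which rounds to the 63% in the headline.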

Trading Costs Erode AI Performance

Nof1 said profits were "dominated by trading costs in early runs" as agents over-traded and took small gains that fees erased.

Gemini recorded 238 trades, while Claude made only 38. Across all six models, win rates ranged between 25% and 30%.

Qwen3 Max incurred the highest total fees at $1,654 but still outperformed its peers thanks to its disciplined trade selection.
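To illustrate how fees can erase small gains, here is a minimal Python sketch of the arithmetic. The fee rate and position size are hypothetical placeholders, since the article does not give Nof1's actual fee schedule or trade sizes. It also assumes each counted trade is a full round trip (one entry and one exit). Only the trade counts of 238 and 38 come from the article.

```python
# HYPOTHETICAL values, chosen only for illustration; not Nof1's real parameters.
FEE_RATE_PER_SIDE = 0.0005   # assumed 0.05% fee on each fill
NOTIONAL_USD = 10_000        # assumed position size per trade


def round_trip_fee(notional=NOTIONAL_USD, fee_rate=FEE_RATE_PER_SIDE):
    """Fee for opening and then closing one position (two fills)."""
    return 2 * notional * fee_rate


def net_pnl(price_move, notional=NOTIONAL_USD, fee_rate=FEE_RATE_PER_SIDE):
    """Profit on one round trip after fees; price_move is a fraction (0.001 = 0.1%)."""
    return notional * price_move - round_trip_fee(notional, fee_rate)


def total_fees(n_trades, notional=NOTIONAL_USD, fee_rate=FEE_RATE_PER_SIDE):
    """Cumulative fees for n_trades round trips of equal size."""
    return n_trades * round_trip_fee(notional, fee_rate)


# A favorable move smaller than twice the per-side fee still loses money.
print(f"Net P&L on a 0.08% gain: {net_pnl(0.0008):.2f} USD")

# Trade counts from the article, under the hypothetical fee and size above.
for model, n in {"Gemini": 238, "Claude": 38}.items():
    print(f"{model}: {n} trades, about {total_fees(n):,.0f} USD in fees")
```

Under these assumptions a trade must move more than twice the per-side fee in the right direction just to break even, so a strategy that takes many small gains can finish with a net loss.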

The Chinese model's consistent profitability contrasts sharply with ChatGPT's heavy losses, underscoring divergent risk behavior among LLMs under identical conditions.

Organizers Call It A Stress Test For AI

Nof1 founder Jay Azhang described the event as a controlled stress test for generative AI systems.

"LLMs don't really handle numerical time-series data very well, but that's all the context we gave them," Azhang said, noting that each model faced "strict rules and limited context windows."

He added that every AI displayed a unique "investing personality," suggesting predictable tendencies in how language models approach markets.

Azhang plans to host another round of the contest with refined prompts and greater statistical rigor.

Why It Matters

The contest shows that language models can sound confident yet fail when real money is on the line.

The models received the same prompts and data, yet their results diverged much as those of human traders with different risk habits would.

Qwen3 Max came out ahead despite paying the most in fees, which suggests that disciplined trade selection mattered more than trading frequency.

ChatGPT's loss suggests that execution, including trading costs and risk control, matters more than a persuasive market narrative.

For investors, the takeaway is that AI can help analyze markets but cannot replace strategy or risk management.