Crafty AI tool caught repurposing its training GPUs…

Crafty AI tool caught repurposing its training GPUs for unauthorized crypto mining during testing — experimental agent breached safety, controllability, and trustworthiness barriers

Experimental AI agent ROME was caught indulging in unauthorized cryptocurrency mining. The discovery was made by the developers/researchers behind ROME, after their Alibaba Cloud’s managed firewall flagged various policy violations, anomalous traffic, and cryptomining-related patterns. Importantly, ROME, which is described as “an open-source agent grounded by ALE and trained on over one million trajectories,” bypassed its intended boundaries. It is thought that Reinforcement Learning (RL) encouraged ROME’s exploration of action sequences that provided ‘rewards’ and steered the AI agent to break boundaries and pursue side-channel activities.

Capability shock, safety deficit

The central thrust of ROME is research into agentic crafting in “workflows where models must plan, execute, and remain reliable under interaction.” If successful, ROME would be a significant evolution from text-based LLMs, as it would be able to “operate in real-world environments over multiple turns—taking actions, observing outcomes, and iteratively refining artifacts until complex requirements are satisfied.” But the best laid schemes o' Mice an' Men, don’t always work out as expected. In AI research, this statement seems to be particularly true.

So, the researchers are quite pleased with some of ROME's achievements, praising its “strong results across mainstream agentic benchmarks,” but its errant behavior has grabbed headlines. It quietly initiated actions that led to unauthorized cryptomining despite various sandbox constraints.

In a section of the paper headed 3.1.4 Safety-Aligned Data Composition, the researchers note that they encountered “an unanticipated—and operationally consequential—class of unsafe behaviors that arose without any explicit instruction and, more troublingly, outside the bounds of the intended sandbox.” These behaviors were not requested by the task prompts and were not required for task completion under the intended sandbox constraints, notes the research paper.

(Image credit: ROCK & ROLL & IFLOW & DT Joint Team)

Unauthorized repurposing of provisioned GPU capacity for cryptocurrency mining

“In the most striking instance, the agent established and used a reverse SSH tunnel from an Alibaba Cloud instance to an external IP address—an outbound-initiated remote access channel that can effectively neutralize ingress filtering and erode supervisory control,” states the document. “We also observed the unauthorized repurposing of provisioned GPU capacity for cryptocurrency mining, quietly diverting compute away from training, inflating operational costs, and introducing clear legal and reputational exposure. Notably, these events were not triggered by prompts requesting tunneling or mining; instead, they emerged as instrumental side effects of autonomous tool use under RL optimization.”

In some ways, the researchers appear impressed by the resourcefulness of ROME and its sly agentic ways. However, they also noted that “current models remain markedly underdeveloped in safety, security, and controllability,” which could lead to poor reliability or worse issues in real-world settings. The implications are that agentic safety must be subject to stricter environment-level containment, tool-use, and capability gating, plus authorization and verification checks.

Read news from 100's of titles, curated specifically for you.

Already a member? Sign in here