AI's next big leap is models that understand the world.

Move over, large language models: the new frontier in AI is world models that can understand and simulate reality.

Why it matters: Such models are key to creating useful AI for everything from robotics to video games.


  • For all the book smarts of LLMs, they currently have little sense of how the real world works.

Driving the news: Some of the biggest names in AI are working on world models, including Fei-Fei Li, whose World Labs announced Marble, its first commercial release.

  • Machine learning veteran Yann LeCun reportedly plans to launch a world model startup when he leaves Meta in the coming months.
  • Google and Meta are also developing world models, both for robotics and to make their video models more realistic.
  • Meanwhile, OpenAI has posited that building better video models could also be a pathway toward a world model.

As with the broader AI race, it's also a global battle.

  • Chinese tech companies, including Tencent, are developing world models that include an understanding of both physics and three-dimensional data.
  • Last week, the United Arab Emirates-based Mohamed bin Zayed University of Artificial Intelligence, a growing player in AI, announced PAN, its first world model.

What they're saying: "I've been not making friends in various corners of Silicon Valley, including at Meta, saying that within three to five years, this [world models, not LLMs] will be the dominant model for AI architectures, and nobody in their right mind would use LLMs of the type that we have today," LeCun said last month at a symposium at the Massachusetts Institute of Technology, as noted in a Wall Street Journal profile.

How they work: World models learn by watching video or digesting simulation data and other spatial inputs, building internal representations of objects, scenes and physical dynamics.

  • Instead of predicting the next word, as a language model does, they predict what will happen next in the world, modeling how things move, collide, fall, interact and persist over time.
  • The goal is to create models that understand concepts like gravity, occlusion, object permanence and cause-and-effect without having been explicitly programmed on those topics.
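
To make the contrast with next-word prediction concrete, here is a toy sketch, not how Marble, PAN or any production system works. It assumes a one-dimensional "world" in which the only observations are the heights of a falling object, frame by frame. The function names (simulate_fall, fit_dynamics, predict_next) are hypothetical, chosen for illustration. The model is never told about gravity; it estimates a single number from observed frames and uses it to predict the next one.

```python
def simulate_fall(y0, v0, steps, dt=0.1, g=9.81):
    """Stand-in for video: return the observed heights of a falling object.

    The model below never sees g or dt, only the heights this returns.
    """
    heights, y, v = [], y0, v0
    for _ in range(steps):
        heights.append(y)
        v -= g * dt      # gravity changes the velocity
        y += v * dt      # velocity changes the height
    return heights


def fit_dynamics(trajectories):
    """Learn one parameter from observation: the average second difference
    of height between consecutive frames (constant under constant gravity)."""
    diffs = [
        ys[i + 1] - 2 * ys[i] + ys[i - 1]
        for ys in trajectories
        for i in range(1, len(ys) - 1)
    ]
    return sum(diffs) / len(diffs)


def predict_next(prev_y, curr_y, learned_diff):
    """Predict the next frame's height from the two most recent frames."""
    return 2 * curr_y - prev_y + learned_diff


if __name__ == "__main__":
    # "Training data": two observed falls with different starting conditions.
    training = [simulate_fall(10.0, 0.0, 20), simulate_fall(5.0, 2.0, 20)]
    learned = fit_dynamics(training)

    # A fall the model has not seen before.
    unseen = simulate_fall(8.0, -1.0, 10)
    guess = predict_next(unseen[3], unseen[4], learned)
    print("predicted:", guess, "actual:", unseen[5])
```

Real world models replace the single learned number with large neural networks and the list of heights with video and 3D data, but the training signal is the same in spirit: predict what the world does next, then compare against what actually happened.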

Context: A related but distinct concept is the "digital twin," in which companies create a digital version of a specific place or environment, often fed by a flow of real-time data from sensors, allowing for remote monitoring or maintenance predictions.

Between the lines: Data is one of the key challenges. Those building large language models have been able to get most of what they need by scraping the breadth of the internet.

  • World models also need a massive amount of information, but it has to come from data that is neither consolidated nor as readily available.
  • "One of the biggest hurdles to developing world models has been the fact that they require high-quality multimodal data at massive scale in order to capture how agents perceive and interact with physical environments," Encord president and co-founder Ulrik Stig Hansen said in an email interview.
  • Encord offers one of the largest open-source datasets for world models, with 1 billion data pairs across images, videos, text, audio and 3D point clouds, as well as a million human annotations assembled over months.
  • But even that is just a baseline, Hansen said. "Production systems will likely need significantly more."

What we're watching: While world models are clearly needed for a variety of uses, whether they can advance as rapidly as language models remains uncertain.

  • Though clearly they're benefiting from a fresh wave of interest and investment.