Fortune
Jeremy Kahn

Is another 'AI winter' coming? Here's what past AI slumps can tell us about what the future might hold

A person with a red umbrella standing by a black car that is stuck in snow in the middle of a blizzard. (Credit: Photo illustration by Getty Images)

As summer fades into fall, many in the tech world are worried about winter. Late last month, a Bloomberg columnist asked “is the AI winter finally upon us?” British newspaper The Telegraph was more definitive. “The next AI winter is coming,” it declared. Meanwhile, social media platform X was filled with chatter about a possible AI winter.

An “AI winter” is what folks in artificial intelligence call a period in which enthusiasm for the idea of machines that can learn and think like people wanes—and investment for AI products, companies, and research dries up. There’s a reason this phrase comes so naturally to the lips of AI pundits: We’ve already lived through several AI winters over the 70-year history of artificial intelligence as a research field. If we’re about to enter another one, as some suspect, it’ll be at least the fourth.

The most recent talk of a looming winter has been triggered by growing concerns among investors that AI technology may not live up to the hype surrounding it—and that the valuations of many AI-related companies are far too high. In a worst-case scenario, this AI winter could be accompanied by the popping of an AI-inflated stock market bubble, with reverberations across the entire economy. While there have been AI hype cycles before, they’ve never involved anything close to the hundreds of billions of dollars that investors have sunk into the generative AI boom. And so if there is another AI winter, it could involve polar vortex levels of pain.

The markets have been spooked recently by comments from OpenAI CEO Sam Altman, who told reporters he thought some venture-backed AI startups were grossly overvalued (although not OpenAI, of course, which is one of the most highly-valued venture-backed startups of all time). Hot on the heels of Altman’s remarks came a study from MIT that concluded that 95% of AI pilot projects fail.

A look at past AI winters, and what caused them, may give us some indication of whether that chill in the air is just a passing breeze or the first hints of an impending Ice Age. Sometimes those AI winters have been brought on by academic research highlighting the limitations of particular AI techniques. Sometimes they have been caused by frustrations in getting AI tech to work well in real-world applications. Sometimes both factors have been at play. But what previous AI winters all had in common was disillusionment among those footing the bill after promising new advances failed to deliver on the ensuing hype.

The first AI hype cycle

The U.S. and allied governments lavishly funded artificial intelligence research throughout the early days of the Cold War. Then, as now, Washington saw the technology as potentially conferring a strategic and military advantage, and much of the funding for AI research came from the Pentagon.

During this period, there were two competing approaches to AI. One was based on hard-coding logical rules for categorizing inputs into symbols and then for manipulating those symbols to arrive at outputs. This was the method that yielded the first great leaps forward in computers that could play checkers and chess, and also led to the world’s first chatbots. 


The rival AI method was based on something called a perceptron, which was the forerunner of today’s neural networks, a kind of AI loosely built on a caricature of how the brain works. Rather than starting with rules and logic, a perceptron learned a rule for accomplishing some task from data. The U.S. Office of Naval Research funded much of the early work on perceptrons, which were pioneered by Cornell University neuroscientist and psychologist Frank Rosenblatt. Both the Navy and the CIA tested perceptrons to see if they could classify things like the silhouettes of enemy ships or potential targets in aerial reconnaissance photos.
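To make that contrast with rule-based symbolic AI concrete, here is a minimal sketch of the classic perceptron learning rule, written in modern Python rather than anything Rosenblatt actually ran. The toy data, learning rate, and function name are illustrative placeholders; the point is simply that the decision rule is adjusted from examples rather than hand-coded.

```python
# A minimal sketch of the classic single-layer perceptron learning rule.
# The toy data, learning rate, and epoch count are illustrative placeholders.
import random

def train_perceptron(examples, epochs=20, lr=0.1):
    """Learn weights for a yes/no classifier from (features, label) pairs."""
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        random.shuffle(examples)
        for features, label in examples:  # label is 0 or 1
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            prediction = 1 if activation > 0 else 0
            error = label - prediction  # -1, 0, or +1
            # Nudge the weights toward the correct answer on this example.
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias

# Toy task, learned from data alone: "is the sum of the two inputs large?"
data = [([0.1, 0.2], 0), ([0.9, 0.8], 1), ([0.2, 0.1], 0), ([0.7, 0.9], 1)]
print(train_perceptron(data))
```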

The two competing camps both made hyperbolic claims that their technology would soon deliver computers that equaled or exceeded human intelligence. Rosenblatt told The New York Times in 1958 that his perceptrons would soon be able to recognize individuals and call out their names, that it was “only one more step of development” before they could instantly translate languages, and that eventually the AI systems would self-replicate and become conscious. Meanwhile, Marvin Minsky, cofounder of MIT’s AI Lab and a leading figure in the symbolic AI camp, told Life magazine in 1970 that “in three to eight years we will have a machine with the general intelligence of an average human being.”

That’s the first prerequisite for an AI winter: hype. And there are clear parallels today in statements made by a number of prominent AI figures. Back in January, Altman wrote on his personal blog that “we are now confident we know how to build [human-level artificial general intelligence] as we have traditionally understood it” and that OpenAI was turning increasingly toward building super-human “superintelligence.” He wrote that this year “we may see the first AI agents ‘join the workforce’ and materially change the output of companies.” Dario Amodei, the cofounder and CEO of Anthropic, has said that human-level AI could arrive in 2026. Meanwhile, Demis Hassabis, the cofounder and CEO of Google DeepMind, has said that AI matching humans across all cognitive domains would arrive in the next “five to 10 years.”

Government loses faith

But what precipitates an AI winter is some definitive evidence this hype cannot be met. For the first AI winter, that evidence came in a succession of blows. In 1966, a committee commissioned by the National Research Council issued a damning report on the state of natural language processing and machine translation. It concluded that computer-based translation was more expensive, slower and less accurate than human translation. The research council, which had provided $20 million towards this early kind of language AI (at least $200 million in today’s dollars), cut off all funding.

Then, in 1969, Minsky delivered a second punch. That year, he and Seymour Papert, a fellow AI researcher, published a book-length takedown of perceptrons. In the book, Minsky and Papert proved mathematically that a single-layer perceptron, like the kind Rosenblatt had shown off to great fanfare in 1958, could only ever make accurate binary classifications—in other words, it could identify whether something was black or white, or a circle or a square. But it could not categorize things into more than two buckets.

It turned out there was a big problem with Minsky’s and Papert’s critique. While most interpreted the book as definitive proof that neural network-based AI would never come close to human-level intelligence, their proofs applied only to a simple perceptron that had just a single layer: an input layer consisting of several neurons that took in data, all linked to a single output neuron. They had ignored, likely deliberately, that some researchers in the 1960s had already begun experimenting with multilayer perceptrons, which had a middle “hidden” layer of neurons that sat between the input neurons and output neuron. True forerunners of today’s “deep learning,” these multilayer perceptrons could, in fact, classify data into more than two categories. But at the time, training such a multilayer neural network was fiendishly difficult. And it didn’t matter. The damage was done. After the publication of Minsky’s and Papert’s book, U.S. government funding for neural network-based approaches to AI largely ended.
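To illustrate the structural difference at the heart of that dispute, here is a sketch, in modern Python with numpy, of a forward pass through a multilayer perceptron with one hidden layer whose output covers more than two categories. The layer sizes, random weights, and activation choices are arbitrary modern stand-ins, not a reconstruction of any 1960s system.

```python
# Illustrative forward pass through a multilayer perceptron with one hidden
# layer. Layer sizes, random weights, and activations are modern stand-ins.
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_hidden, n_classes = 4, 8, 3      # three output "buckets," not two
W1 = rng.normal(size=(n_inputs, n_hidden))   # input layer -> hidden layer
W2 = rng.normal(size=(n_hidden, n_classes))  # hidden layer -> output layer

def forward(x):
    hidden = np.maximum(0, x @ W1)           # the "hidden" middle layer
    logits = hidden @ W2
    exp = np.exp(logits - logits.max())      # softmax over the categories
    return exp / exp.sum()

print(forward(rng.normal(size=n_inputs)))    # probabilities over 3 classes
```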

Minsky’s and Papert’s attack didn’t just persuade Pentagon funding bodies. It also convinced many computer scientists that neural networks were a dead end. Some neural network researchers came to blame Minsky for setting back the field by decades. In 2006, Terry Sejnowski, a researcher who helped revive interest in neural networks, stood up at a conference and confronted Minsky, asking him if he were the devil. Minsky ignored the question and began detailing what he saw as the failings of neural networks. Sejnowski persisted, asking Minsky again if he were the devil. Eventually an angry Minsky shouted back: “Yes, I am!”

But Minsky’s symbolic AI soon faced a funding drought too. Also in 1969, Congress forced the Defense Advanced Research Projects Agency (DARPA), which had been a major funder of both AI approaches, to change its approach to issuing grants. The agency was told to fund research that had clear, applied military uses, instead of more blue-sky research. And while some symbolic AI research fit this rubric, a lot of it did not.

The final punch came in 1973, when the U.K. Parliament commissioned Cambridge University mathematician James Lighthill to investigate the state of AI research in Britain. His conclusion was that AI had failed to show any promise of fulfilling its grand claims of equaling human intelligence and that many of its favored algorithms, while they might work for toy problems, could never deal with the real world’s complexity. Based on Lighthill’s conclusions, the U.K. government curtailed all funding for AI research.

Lighthill had only looked at U.K. AI efforts, but DARPA and other U.S. funders of AI research took note of its conclusions, which reinforced their own growing skepticism of AI. By 1974, U.S. funding for AI projects was a fraction of what it had been in the 1960s. Winter had set in—and it would last until the early 1980s.

Today, too, there are parallels with this first AI winter when it comes to studies suggesting AI isn’t meeting expectations. Two recent research papers from researchers at Apple and Arizona State University have cast doubt on whether cutting-edge AI models, which are supposed to use a “chain of thought” to reason about how to answer a prompt, are actually engaging in reasoning at all. Both papers conclude that rather than learning to apply generalizable logical rules and problem-solving techniques to new problems—which is what humans would consider reasoning—the models simply try to match a problem to one seen in their training data. These studies could turn out to be the equivalent of Minsky’s and Papert’s attack on perceptrons.

Meanwhile, there are also a growing number of studies on the real-world impact of today’s AI models that parallel the Lighthill and NRC reports. For instance, there’s that MIT study which concluded 95% of AI pilots are failing to boost corporate revenues. There’s a recent study from researchers at Salesforce that concluded most of today’s large language models (LLMs) cannot accurately perform customer relationship management (CRM) tasks—a particularly ironic conclusion since Salesforce itself has been pushing AI agents to automate CRM processes. Anthropic research showed that its Claude model could not successfully run a vending machine business—a relatively simple business compared to many of those that tech boosters say are poised to be “utterly transformed” by AI agents. There’s also a study from the AI research group METR that showed software developers using an AI coding assistant were actually 19% slower at completing tasks than they were without it.

But there are some key differences. Most significantly, today’s AI boom is not dependent on public funding. Although government entities, including the U.S. military, are becoming important customers for AI companies, the money fueling the current boom is almost entirely private. Venture capitalists have invested at least $250 billion into AI startups since ChatGPT debuted in November 2022. And that doesn’t include the vast amount being spent by large, publicly-traded tech companies like Microsoft, Alphabet, Amazon, and Meta on their own AI efforts. An estimated $350 billion is being spent to build out AI data centers this year alone, with even more expected next year.

What’s more, unlike in that first AI winter, when AI systems were mostly just research experiments, today AI is being widely deployed across businesses. AI has also become a massive consumer technology—ChatGPT alone is thought to have 700 million weekly users—which was never the case previously. While today’s AI still seems to lack some key aspects of human intelligence, it is a lot better than earlier systems, and it is hard to argue that people are not finding the technology useful for a good number of tasks.

Winter No. 2: Business loses patience

That first AI winter thawed in the early 1980s thanks largely to increases in computing power and some improved algorithmic techniques. This time, much of the hype in AI was around “expert systems.” These were computer programs designed to encode the knowledge of human experts in a particular domain into a set of logical rules, which the software would then apply to accomplish some specific task.

Business was enthusiastic, believing expert systems would lead to a productivity boom. At the height of this AI hype cycle, nearly two-thirds of the Fortune 500 said they had deployed expert systems. By 1985, U.S. corporations were collectively spending more than $1 billion on expert systems, and an entire industry, much of it backed by venture capital, sprouted up around the technology. Much of that industry focused on building specialized computer hardware, called LISP machines, optimized to run expert systems, many of which were coded in the programming language LISP. What’s more, starting in 1983, DARPA returned to funding AI research through the new Strategic Computing Initiative, eventually offering over $100 million to more than 90 different AI projects at universities throughout the U.S.

Although expert systems drew on many of the methods symbolic AI researchers had pioneered, many academic computer scientists were wary that inflated expectations would once again precipitate a boom-and-bust cycle that would hurt the field. Among them were Minsky and fellow AI researcher Roger Schank, who coined the term “AI winter” at an AI conference in 1984. The pair chose the neologism to echo the term “nuclear winter”—the devastating and bleak period without sunlight that would likely follow a major nuclear war.

Three things then happened to bring about the next winter. In 1987, a new kind of computer workstation debuted from Sun Microsystems. These workstations, as well as increasingly powerful desktop computers from IBM and Apple, obviated the need for specialized LISP machines. Within a year, the market for LISP machines evaporated. Many venture capitalists lost their shirts—and became wary of ever backing AI-related startups again. That same year, New York University computer scientist Jack Schwartz became head of DARPA’s computing research. He was no fan of AI in general or expert systems in particular, and slashed funding for both. 

Meanwhile, businesses gradually discovered that expert systems were difficult and expensive to build and maintain. They were also “brittle”—while they could handle highly routinized tasks well, when they encountered slightly unusual cases, they struggled to apply the logical rules they had been given. In such cases, they often produced bizarre and inaccurate outputs, or simply broke down completely. Delineating rules that would apply to every edge case proved an impossible task. As a result, by the early 1990s, companies were starting to abandon expert systems. Unlike in the first AI boom, when scientists and government funders came to question the technology, this second winter was driven much more by business frustration.

Again there are some clear echoes in what’s happening with AI today. For instance, hundreds of billions of dollars are being invested in AI data centers being constructed by Microsoft, Alphabet, Amazon’s AWS, Elon Musk’s xAI, and Meta. OpenAI is working on its $500 billion Project Stargate data center plan with SoftBank, Oracle, and other investors. Nvidia has become the world’s most valuable company, with a $4.3 trillion market cap, largely by catering to this demand for AI chips for data centers. One of the key suppositions behind the data center boom is that the most cutting-edge AI models will be at least as large as, if not larger than, the leading models that exist today. Training and running models of this size requires extremely large data centers.

But, at the same time, a number of startups have found clever ways to create much smaller models that mimic many of the capabilities of the giant models. These smaller models require far less computing power—and in some cases don’t even require the kinds of specialized AI chips that Nvidia makes. Some might be small enough to run on a smartphone. If this trend continues, it is possible that those massive data centers won’t be required—just as it turned out LISP machines weren’t necessary. That could mean that hundreds of billions of dollars in AI infrastructure investment winds up stranded.

Today’s AI systems are in many ways more capable—and flexible—than the expert systems of the 1980s. But businesses are still finding them complicated and expensive to deploy and their return on investment too often elusive. While more general purpose and less brittle than the expert systems were, today’s AI models remain unreliable, especially when it comes to addressing unusual cases that might not have been well-represented in their training data. They are prone to hallucinations, confidently spewing inaccurate information, and can sometimes make mistakes no human ever would. This means companies and governments cannot use AI to automate mission critical processes. Whether this means companies will lose patience with generative AI and large language models, just as they did with expert systems, remains to be seen. But it could happen.

Winter No. 3: The rise and fall (and rise) of neural networks

The 1980s also saw renewed interest in the other AI method, neural networks, due in part to the work of David Rumelhart, Geoffrey Hinton, and Ronald Williams, who in 1986 figured out a way to overcome a key challenge that had bedeviled multilayer perceptrons since the 1960s. Their innovation was something called backpropagation, or backprop for short, a method for working out how much each connection in the network, including those feeding the middle, hidden layer of neurons, contributed to its errors on each training pass, so that all of the weights could be adjusted and the network as a whole could learn efficiently.
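As a rough, modern illustration of the idea, rather than Rumelhart, Hinton, and Williams’s original code, the sketch below trains a one-hidden-layer network with backprop: the output error is propagated backward to work out how much the hidden layer’s weights contributed to it, and every weight is then nudged accordingly. The toy data, network sizes, and learning rate are placeholders.

```python
# A minimal backpropagation sketch for a one-hidden-layer network.
# The toy data, layer sizes, and learning rate are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))                   # toy inputs
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0   # toy binary targets

W1 = rng.normal(size=(4, 8)) * 0.1             # input -> hidden weights
W2 = rng.normal(size=(8, 1)) * 0.1             # hidden -> output weights
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(500):
    # Forward pass: input -> hidden layer -> output.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)

    # Backward pass: propagate the output error back through the hidden layer.
    out_err = (out - y) * out * (1 - out)      # gradient at the output
    hid_err = (out_err @ W2.T) * h * (1 - h)   # gradient at the hidden layer

    W2 -= lr * (h.T @ out_err)                 # adjust hidden-to-output weights
    W1 -= lr * (X.T @ hid_err)                 # adjust input-to-hidden weights

print(float(np.mean((out > 0.5) == (y > 0.5))))  # accuracy on the toy data
```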

Backprop, along with more powerful computers, helped spur a renaissance in neural networks. Soon researchers were building multilayered neural networks that could decipher handwritten letters on envelopes and checks, learn the relationships between people in a family tree, recognize typed characters and read them aloud through a voice synthesizer, and even steer an early self-driving car, keeping it between the lanes of a highway.

This led to a short-lived boom in neural networks in the late 1980s. But neural networks had some big drawbacks too. Training them required a lot of data, and for many tasks, the amount of data required just didn’t exist. They also were extremely slow to train and sometimes slow to run on the computer hardware that existed at the time.

This meant that there were many things neural networks could still not do. Businesses did not rush to adopt neural networks as they had expert systems because their uses seemed highly circumscribed. Meanwhile, there were other statistical machine learning techniques that used less data and required less computing power that seemed to be making rapid progress. Once again, many AI researchers and engineers wrote off neural networks. Another decade-long AI winter set in.

Two things thawed this third winter. First, the internet created vast amounts of digital data and made accessing it relatively easy, which helped break the data bottleneck that had held neural networks back in the 1980s. Then, starting in 2004, researchers at the University of Maryland and then Microsoft began experimenting with using a new kind of computer chip that had been invented for video games, called a graphics processing unit, to train and run neural networks. GPUs could perform many identical operations in parallel, which is exactly what training and running neural networks required. Soon, Geoffrey Hinton and his graduate students began demonstrating that neural networks, trained on large datasets and run on GPUs, could do things—like classify images into a thousand different categories—that would have been impossible in the late 1980s. The modern “deep learning” revolution was taking off.

That boom has largely continued through today. At first, neural networks were largely trained to do one particular task well—to play Go, or to recognize faces. But the AI summer deepened in 2017, when researchers at Google designed a particular kind of neural network called a Transformer that was good at figuring out language sequences. It was given another boost in 2019 when OpenAI figured out that Transformers trained on large amounts of text could not only write text well, but master many other language tasks, from translation to summarization. Three years later, an updated version of OpenAI’s transformer-based neural network, GPT-3.5, would be used to power the viral chatbot ChatGPT.

Now, three years after ChatGPT’s debut, the hype around AI has never been greater. There are certainly a few autumnal signs, a falling leaf carried on the breeze here and there, if past AI winters are any guide. But only time will tell if it is the prelude to another Arctic bomb that will freeze AI investment for a generation, or merely a momentary cold snap before the sun appears again.
