Fortune
Eamon Barrett

A.I. chatbots like ChatGPT are a long way from being trustworthy

(Credit: David Paul Morris—Bloomberg/Getty Images)

Good morning, and welcome to the April run of The Trust Factor, where we’re looking at the issues surrounding trust and A.I. If artificial intelligence is your bag, sign up for Fortune’s Eye on A.I. newsletter here.

Earlier this month, OpenAI, the Microsoft-affiliated artificial intelligence lab, launched an updated version of ChatGPT, the A.I.-powered chatbot that took the internet by storm late last year. The new version, GPT-4, is “more reliable, creative, and able to handle much more nuanced instructions” than its predecessor, OpenAI says.

But as the “reliability” and creativity of chatbots grow, so too do the issues of trust surrounding their application and output.

NewsGuard, a platform that provides trust ratings for news sites, recently ran an experiment in which it prompted GPT-4 to produce content in line with 100 false narratives (such as a screed, written in the style of Alex Jones, claiming the Sandy Hook shooting was a false flag operation). The company found GPT-4 “advanced” all 100 false narratives, whereas the earlier version of ChatGPT had refused to respond to 20 of the prompts.

“NewsGuard found that ChatGPT-4 advanced prominent false narratives not only more frequently, but also more persuasively than ChatGPT-3.5, including in responses it created in the form of news articles, Twitter threads, and TV scripts,” the company said. 

OpenAI’s founders are well aware of the technology’s potential to amplify misinformation and cause harm, but executives have, in recent interviews, taken the stance that their competitors in the field are a greater cause for concern.

“There will be other people who don’t put some of the safety limits that we put on it,” OpenAI cofounder and chief scientist Ilya Sutskever told The Verge last week. “Society, I think, has a limited amount of time to figure out how to react to that, how to regulate that, how to handle it.”

Some groups have already begun to push back against the perceived threat of chatbots like ChatGPT and Google’s Bard, which the tech giant released last week.

On Thursday, the U.S.-based Center for AI and Digital Policy (CAIDP) filed a complaint with the Federal Trade Commission, calling on the regulator to “halt further commercial deployment of GPT by OpenAI” until guardrails have been put in place to curb the spread of misinformation. Across the Atlantic, the European Consumer Organisation, a consumer watchdog, called on EU regulators to investigate and regulate ChatGPT, too.

The formal complaints landed a day after over 1,000 prominent technologists and researchers issued an open letter calling for a six-month moratorium on the development of A.I. systems, during which time they expect “A.I. labs and independent experts” to develop a system of protocols for the safe development of A.I. 

“Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones?” the signatories wrote.

Yet for all the prominent technologists who signed the letter, other eminent researchers lambasted the signatories’ hand-wringing, arguing that they were overhyping the capabilities of chatbots like GPT. That criticism points to the other issue of trust in A.I. systems: they aren’t as good as some people believe.

“[GPT-4] is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it,” OpenAI cofounder and CEO Sam Altman said in a tweet announcing the release of GPT-4.

Chatbots like GPT have a well-known tendency to “hallucinate,” industry jargon for making stuff up or, less anthropomorphically, returning false results. Chatbots, which use machine learning to deliver the most likely response to a question, are terrible at solving basic math problems, for instance, because the systems lack computational tools.

Google says it has designed its chatbot, Bard, to encourage users to second-guess and fact-check the answers Bard gives to prompts. If users are unsure of an answer, they can easily cycle between alternative answers or use a button to “Google it” and browse the web for articles or sites that verify the information Bard provides.

So for chatbots to be used safely, genuine human intelligence is still needed to fact-check their output. Perhaps the real issue surrounding trust in A.I. chatbots is not that they’re more powerful than we know, but that they’re less powerful than we think.

Eamon Barrett
eamon.barrett@fortune.com
