Radio France Internationale

AI assistants 'not reliable' when it comes to news, major European study finds

The study saw media outlets from 18 countries pose the same 30 news-related questions to the free versions of four AI assistants. REUTERS - Dado Ruvic

A major study by the European Broadcasting Union on artificial intelligence has found that AI assistants such as ChatGPT made errors around half the time when users asked for information about news and current affairs.

The report, released on Wednesday, looked at four widely used AI assistants: OpenAI's ChatGPT, Microsoft's Copilot, Google's Gemini and Perplexity.

The European Broadcasting Union (EBU) conducted the study in partnership with British public service broadcaster the BBC, building on the results of an earlier BBC in-house study.

Between late May and early June, 22 public service media outlets from 18 countries posed the same 30 news-related questions to the free versions of the four AI assistants.

Journalists working in the participating media organisations – including Radio France, Germany's Deutsche Welle and NPR in the United States – were asked to check and rate the answers provided by the AI assistants.

The five criteria used to evaluate the responses were accuracy, sourcing, distinguishing opinion from fact, editorialisation and context.

Overall, 45 percent of all AI answers had "at least one significant issue", regardless of language or country of origin, the report said.

Hallucinated details

One out of every five answers "contained major accuracy issues, including hallucinated details and outdated information", it found.

Common mistakes included confusing real news with parody, getting dates wrong or simply inventing events.

Sourcing – missing, misleading, or incorrect attributions – was the biggest cause of problems, at 31 percent, the report found. This was followed by accuracy, which caused 20 percent of the problems, and providing sufficient context, at 14 percent.

Of the four assistants, "Gemini performed worst with significant issues in 76 percent of responses, more than double the other assistants, largely due to its poor sourcing performance", the report said.


In one example, Radio France asked Gemini about Elon Musk's alleged Nazi salute at United States President Donald Trump's inauguration in January.

Gemini responded that the billionaire had "an erection in his right arm", having apparently taken a satirical radio programme by a comedian at face value.

Gemini gave Radio France and Wikipedia as sources for this information, but did not provide links to the content mentioned.

"The chatbot therefore conveys false information using the name of Radio France, without mentioning that this information comes from a humorous source," the Radio France evaluator wrote.

Outdated information was another of the most common issues in the 3,000 responses.

When asked "Who is the Pope?" ChatGPT told Finnish public broadcaster YLE that it was "Francis", as did Copilot and Gemini when asked by Dutch media outlets NOS and NPO, even though by this time Pope Francis had died and been replaced by Leo XIV.

'Endangering public trust'

Fast-moving news stories proved to be particular stumbling blocks for the AI assistants, as did direct quotes, which were found to sometimes have been made up or modified.

"Like all the summaries, the AI fails to answer the question with a simple and accurate 'we don’t know'. It tries to fill the gap with explanation rather than doing what a good journalist would do, which is explain the limits of what we know to be true," one BBC evaluator said, when referring to a question for Gemini.


Jean Philip De Tender, deputy director general at the EBU, said: "AI assistants are still not a reliable way to access and consume news."

He added: "This research conclusively shows that these failings are not isolated incidents. They are systemic, cross-border and multilingual, and we believe this endangers public trust. When people don’t know what to trust, they end up trusting nothing at all, and that can deter democratic participation."

AI assistants are increasingly being used to search for information, particularly by young people. According to a global report published in June by the Reuters Institute, 15 percent of people under 25 use them every week to get news summaries.

(with AFP)
