TechRadar
Craig Hale

'When intelligence and trust move together, AI stops being an experiment and starts becoming how work gets done': Microsoft and OpenAI are making AI research tools smarter to help answer even your trickiest questions

Microsoft 365 Copilot Cowork.

  • Multi-model agents will check each other before sharing research with you to ensure maximum quality
  • Researcher mode with Critique enabled scores highly on the DRACO benchmark
  • Copilot Cowork is now available to Frontier program customers

Microsoft has announced plans to upgrade its M365 Copilot Researcher agent with a clear focus on using multiple models across AI workflows to combine the power of various systems.

Under this shift away from single-model systems, multiple AI agents will collaborate and hand off different parts of the task to each other.

To begin with, the Researcher agent will use GPT models to generate the initial response, with Claude stepping in to review it for accuracy, completeness and quality.

M365 Copilot's Researcher agent will pass responses through other agents

The update follows the success of Anthropic's Claude Cowork, which has since been integrated into M365 Copilot. The aptly named Copilot Cowork, now available in the Frontier program ahead of a broader rollout, allows humans to delegate work to AI.

Microsoft AI at Work Chief Marketing Officer Jared Spataro explained how Copilot Cowork moves AI's usefulness from single, basic prompts to end-to-end task execution, ideal for long-running and multi-step workflows.

As for Researcher mode with the new Claude-based Critique function, it has already outperformed single-model systems in early testing, with the second model reviewing the first's output to improve quality. It scores 13.8% higher on the DRACO benchmark (Deep Research Accuracy, Completeness and Objectivity), deemed the industry standard.

Achieving 57.4% thanks to the multi-model setup, it's more than twice as reliable as Deep Research with OpenAI's o4-mini model. It's also better than o3-based Deep Research, Gemini Deep Research, Claude Opus 4.6 and Perplexity's Deep Research when using Opus 4.5 and 4.6. Microsoft didn't compare it to newer flagship models like GPT-5.4 acting on their own.

"When intelligence and trust move together, AI stops being an experiment and starts becoming how work gets done," Spataro wrote, speaking about Microsoft's progress towards Wave 3 of M365 Copilot – an intelligence it defines as "understand[ing] the context of work."

