OpenAI’s ‘deep research’ tool: is it useful for scientists?
OpenAI’s chief executive Sam Altman announced the release of its ‘deep research’ tool. Credit: Franck Robichon/EPA-EFE/Shutterstock
Tech giant OpenAI has unveiled a pay-for-access tool called ‘deep research’, which synthesizes information from dozens or hundreds of websites into a cited report several pages long. The tool follows a similar offering that Google released in December, and acts as a personal assistant, doing the equivalent of hours of work in tens of minutes.
Many scientists who have tried it are impressed with its ability to write literature reviews or full review papers and even identify gaps in knowledge. Others are less enthusiastic. “If a human did this I would be like: this needs a lot of work,” says Kyle Kabasares, a data scientist at the Bay Area Environmental Research Institute in Moffett Field, California, in an online video review.
The firms are presenting the tools as a step towards AI ‘agents’ that can handle complex tasks. Observers say that OpenAI’s deep research tool, released on 2 February, is notable because it combines the improved reasoning skills of the o3 large language model (LLM) with the ability to search the Internet. Google says its Deep Research tool is, for now, based on Gemini 1.5 Pro, rather than on its leading reasoning model 2.0 Flash Thinking.
Review writing
Many users are impressed with both tools. Andrew White, a chemist and AI expert at FutureHouse, a start-up in San Francisco, California, says that Google’s product is “really leveraging Google’s advantages in search and compute” to get users up to speed on a topic quickly, while o3’s reasoning skills add sophistication to OpenAI’s reports.
Derya Unutmaz, an immunologist at the Jackson Laboratory in Farmington, Connecticut, who has free access to ChatGPT Pro granted by OpenAI for medical research, says the OpenAI deep research reports are “extremely impressive”, “trustworthy” and as good as, or better than, published review papers. “I think writing reviews is becoming obsolete.”
White anticipates that AI systems like these could be used to update human-authored reviews. “Authoritative reviews cannot feasibly be updated [by humans] every 6 months.”
But many caution that all LLM-based tools are sometimes inaccurate or misleading. OpenAI’s website admits that its tool “is still early and has limitations”: it can get citations wrong, hallucinate facts, fail to distinguish authoritative information from rumors and fail to convey its uncertainty accurately. The company expects the issues to improve with more usage and time. Google’s Deep Research has a disclaimer that simply reads “Gemini can make mistakes, so double-check it”.