Artificial-intelligence (AI) tools are transforming the way we work. Many products attempt to make scientific research more efficient by helping researchers to sort through large volumes of literature.
These scientific search engines are based on large language models (LLMs) and are designed to sift through existing research papers and summarize their key findings. AI firms are constantly updating their models' features, and new tools are regularly released.
Nature spoke to developers of these tools and researchers who use them to garner tips on how to apply them — and pitfalls to watch out for.
What tools are available?
Some of the most popular LLM-based tools are Elicit, Consensus and You, which offer various ways to speed up literature review.
When users input a research question into Elicit, it returns lists of relevant papers and summaries of their key findings. Users can ask further questions about specific papers, or filter by journal or study type.
Consensus helps researchers to understand the variety of scientific information on a topic. Users can input questions such as ‘Can ketamine treat depression?’, and the tool provides a ‘consensus meter’ showing where scientific agreement lies. Researchers can read summaries of papers that agree, disagree or are unsure about the hypothesis. Eric Olson, the chief executive of Consensus in Boston, Massachusetts, says that the AI tool doesn’t replace in-depth interrogation of papers, but it is useful for high-level scanning of studies.
The developers of You, a software company in Palo Alto, California, say it was the first search engine to combine AI-powered search with up-to-date citation data for studies. The tool offers users different ways to explore research questions; for instance, its 'genius mode' presents answers in charts. Last month, the company launched a 'multiplayer tool' that allows colleagues to collaborate and share custom AI chats that can automate specific tasks, such as fact checking.
Clarivate, a research analytics company headquartered in London, released its AI-powered research assistant in September, which allows users to quickly search through the Web of Science database. Scientists can input a research question and view relevant abstract summaries, related topics and citation maps, which show the papers that each study cites and can help researchers to identify key literature, Clarivate says.
And although papers in the Web of Science are in English, Clarivate’s AI tool can also summarize paper abstracts in different languages. “Language translation baked into large language models has huge potential to even out scientific literature around the world,” says Francesca Buckland, vice-president of product at Clarivate, who is based in London.
BioloGPT is one of a growing number of subject-specific AI tools; it produces both summary and in-depth answers to biological questions.
Which tools suit which tasks?
“I always say that it depends on what you really want to do,” says Razia Aliani, an epidemiologist in Calgary, Canada, when asked about the best AI search-engine tools to use.
When she needs to understand the consensus or diversity of opinion on a topic, Aliani gravitates towards Consensus.
Aliani, who also works at the systematic-review company Covidence, uses other AI tools when reviewing large databases. For example, she has used Elicit to fine-tune her research interests. After inputting an initial research question, Aliani uses Elicit to exclude irrelevant papers and delve deeper into more-relevant ones.
Aliani says that AI search tools don’t just save time, they can help with “enhancing the quality of work, sparking creativity and even finding ways to make tasks less stressful”.
Anna Mills teaches introductory writing classes at the College of Marin in San Francisco, California, including lessons on the research process. She says that it’s tempting to introduce her students to these tools, but she’s concerned that they could hinder the students’ understanding of scholarly research. Instead, she’s keen to teach students how AI search tools make mistakes, so they can develop the skills to “critically assess what these AI systems are giving them”.
“Part of being a good scientist is being sceptical about everything, including your own methods,” says Conner Lambden, BioloGPT’s founder, who is based in Golden, Colorado.
What about inaccurate answers and misinformation?
Concerns abound about the accuracy of the outputs of major AI chatbots, such as ChatGPT, which can ‘hallucinate’ false information and invent references.
That’s led to some scepticism about science search engines — and researchers should be cautious, say users. Common problems with AI research tools include fabricated statistics, misrepresentation of cited papers and biases inherited from the tools’ training data.
The issues that sports scientist Alec Thomas has experienced when using AI tools have led him to abandon their use. Thomas, who is at the University of Lausanne in Switzerland, previously appreciated AI search tools, but stopped using them after finding “some very serious, basic errors”. For instance, when researching how people with eating disorders are affected if they take up a sport, an AI tool summarized a paper that it said was relevant, but in reality “it had nothing to do with the original query”, he says. “We wouldn’t trust a human that is known to hallucinate, so why would we trust an AI?” he says.
How are developers addressing inaccurate answers?
The developers that Nature spoke to say they have implemented safeguards to improve accuracy. James Brady, head of engineering at Elicit in Oakland, California, says that the firm takes accuracy seriously, and it is using several safety systems to check for errors in answers.
Buckland says that the Web of Science AI tool has “robust safeguards” to prevent the inclusion of fraudulent and problematic content. During beta-testing, the team worked with around 12,000 researchers to incorporate feedback, she says.
Although such feedback improves the user experience, Olson says that it could also contribute to hallucinations. Because AI search tools are “trained on human feedback, they want to provide a good answer to humans”, says Olson. So “they’ll fill in gaps of things that aren’t there”.
Andrew Hoblitzell, a generative-AI researcher in Indianapolis, Indiana, who lectures at universities through a programme called AI4All, thinks that AI search tools can support the research process, provided that scientists verify the information generated. “Right now, these tools should be used in a hybrid manner rather than as a definitive source.”