This story is part of Nature’s 10, an annual list compiled by Nature’s editors exploring key developments in science and the individuals who contributed to them.
It co-wrote scientific papers — sometimes surreptitiously. It drafted outlines for presentations, grant proposals and classes, churned out computer code, and served as a sounding board for research ideas. It also invented references, made up facts and regurgitated hate speech. Most of all, it captured people’s imaginations: by turns obedient, engaging, entertaining, even terrifying, ChatGPT took on whatever role its interlocutors desired — and some they didn’t.
Why include a computer program in a list of people who have shaped science in 2023? ChatGPT is not a person. Yet in many ways, this program has had a profound and wide-ranging effect on science in the past year.
ChatGPT’s sole objective is to plausibly continue dialogues in the style of its training data. But in doing so, it and other generative artificial-intelligence (AI) programs are changing how scientists work (see go.nature.com/413hjnp). They have also rekindled debates about the limits of AI, the nature of human intelligence and how best to regulate the interaction between the two. That’s why this year’s Nature’s 10 has a non-human addition.
Some scientists have long been aware of the potential of large language models (LLMs). But for many, it was ChatGPT’s release as a free-to-use dialogue agent in November 2022 that quickly revealed this technology’s power and pitfalls. The program was created by researchers at OpenAI in San Francisco, California; among them was Ilya Sutskever, also profiled in this year’s Nature’s 10. It is built on a neural network with hundreds of billions of parameters, which was trained, at a cost estimated at tens of millions of dollars, on a giant online corpus of books and documents. Large teams of workers were also hired to edit or rate its responses, further shaping the bot’s output. This year, OpenAI upgraded ChatGPT’s underlying LLM and connected it to other programs so that the tool can take in and create images, and can use mathematical and coding software for help. Other firms have rushed out competitors.
For some researchers, these apps have already become invaluable lab assistants — helping to summarize or write manuscripts, polish applications and write code (see Nature 621, 672–675; 2023). ChatGPT and related software can help to brainstorm ideas, enhance scientific search engines and identify research gaps in the literature, says Marinka Zitnik, who works on AI for medical research at Harvard Medical School in Boston, Massachusetts. Models trained in similar ways on scientific data could help to build AI systems that can guide research, perhaps by designing new molecules or simulating cell behaviour, Zitnik adds.
But the technology is also dangerous. Automated conversational agents can aid cheats and plagiarists; left unchecked, they could irreversibly foul the well of scientific knowledge. Undisclosed AI-made content has begun to percolate through the Internet, and some scientists have admitted using ChatGPT to generate articles without declaring it.
Then there are the problems of error and bias, which are baked into how generative AI works. LLMs build up a model of the world by mapping language’s interconnections, and then spit back plausible samplings of this distribution with no concept of evaluating truth or falsehood. This leads to the programs reproducing historical prejudices or inaccuracies in their training data, and making up information, including non-existent scientific references (see W. H. Walters & E. I. Wilder Sci. Rep. 13, 14045; 2023).
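As a rough illustration of sampling plausible text with no check on truth, here is a minimal sketch in Python. It is a toy word-pair (bigram) model, not how ChatGPT is implemented: the corpus, function names and parameters are all hypothetical, and a real LLM uses a neural network trained on vastly more text. The principle of picking each next word by how often it followed the previous one is the same in spirit.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy corpus (an assumption for illustration only).
CORPUS = (
    "the study was published in nature . "
    "the study was published in science . "
    "the study was retracted in nature ."
).split()


def build_bigram_counts(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def sample_continuation(counts, start, max_words, rng):
    """Extend `start` by sampling each next word in proportion to its count.

    Nothing here checks whether the resulting sentence is true; the only
    criterion is that each word pair was seen in the training text.
    """
    words = [start]
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break
        choices, weights = zip(*followers.items())
        words.append(rng.choices(choices, weights=weights, k=1)[0])
        if words[-1] == ".":
            break
    return " ".join(words)


counts = build_bigram_counts(CORPUS)
rng = random.Random(0)  # fixed seed so repeated runs give the same sample
sentence = sample_continuation(counts, "the", 10, rng)
print(sentence)
```

Because each step looks only at local word statistics, this sketch can produce a sentence such as "the study was retracted in science .", which appears nowhere in the toy corpus. That is a small-scale analogue of a fluent but invented claim.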
Emily Bender, a computational linguist at the University of Washington, Seattle, sees few appropriate ways to use what she terms synthetic text-extruding machines. ChatGPT has a large environmental impact and problematic biases, and can mislead its users into thinking that its output comes from a person, she says. On top of that, OpenAI is being sued for stealing data and has been accused of exploitative labour practices (by hiring freelancers at low wages).
The size and complexity of LLMs mean that they are intrinsically ‘black boxes’, and understanding why they produce what they do is harder still when their code and training materials aren’t public, as in ChatGPT’s case. The open-source LLM movement is growing, but so far, these models are less capable than the large proprietary programs.
Some countries are developing national AI-research resources to enable scientists outside large companies to build and study big generative AIs (see Nature 623, 229–230; 2023). But it remains unclear how far regulation will compel LLM developers to disclose proprietary information or build in safety features.
No one knows how much more there is to squeeze out of ChatGPT-like systems. Their capabilities might yet be limited by the availability of computing power or new training data. But the generative AI revolution has started. And there’s no turning back.