On 30 November 2022, the technology company OpenAI released ChatGPT — a chatbot built to respond to prompts in a human-like manner. It has taken the scientific community and the public by storm, attracting one million users in the first 5 days alone; that number now totals more than 180 million. Seven researchers told Nature how it has changed their approach.
MARZYEH GHASSEMI: Fix, don’t amplify, biases in health care
There is no denying the technical accomplishments of the generative language and image models that have emerged in artificial intelligence (AI). In my case, I use ChatGPT mostly to help rewrite content in a different style — for example, to make a scientific abstract more suitable for a general audience, or to summarize my research for a financial officer. I’ve also used it to suggest introductory language at the start of an article, e-mail or paper.
I do have concerns about these generative AI tools being used for content creation, whether by students, academics, companies or the public. Generative models are known, for example, to ‘hallucinate’ content (that is, give incorrect or fictional outputs).
More pressingly, text and image generation are prone to societal biases that cannot be easily fixed. In health care, this was illustrated by Tessa, a rule-based chatbot designed to help people with eating disorders, run by a US non-profit organization. After it was augmented with generative AI, the now-suspended bot gave detrimental advice. In some US hospitals, generative models are being used to manage and generate portions of electronic medical records. However, the large language models (LLMs) that underpin these systems are not giving medical advice and so do not require clearance by the US Food and Drug Administration. This means that it’s effectively up to the hospitals to ensure that LLM use is fair and accurate. This is a huge concern.
The use of generative AI tools, in general and in health settings, needs more research with an eye towards social responsibility rather than efficiency or profit. The tools are flexible and powerful enough to make billing and messaging faster, but a naive deployment will entrench existing equity issues in these areas. Chatbots have been found, for example, to recommend different treatments depending on a patient’s gender, race, ethnicity and socioeconomic status (see J. Kim et al. JAMA Netw. Open 6, e2338050; 2023).
Ultimately, it is important to recognize that generative models echo and extend the data they have been trained on. Making generative AI work to improve health equity, for instance by using empathy training or suggesting edits that decrease biases, is especially important given how susceptible humans are to convincing, and human-like, generated texts. Rather than taking the health-care system we have now and simply speeding it up — with the risk of exacerbating inequalities and throwing in hallucinations — AI needs to target improvement and transformation.
ABEBA BIRHANE: Think about whether it should be used at all
Now that LLMs have entered the mainstream, many academics are feeling the pressure to jump on the bandwagon or be seen as missing out. They might not know exactly how they can use this AI technology, but they feel that this increasingly powerful state-of-the-art technology must find a use somehow — like a hammer looking for a nail. So far, presumptions abound on how generative AI tools will transform society, but there is no clear or undisputed use for this technology.
From academic research to journalism and policy reports, in my view, the potential benefits of generative AI are routinely overinflated; their failures, limitations and drawbacks are either omitted or merely mentioned in passing. The critical discussions that exist are limited to a narrow range of topics, such as accuracy, reliability, performance and whether the data the model is trained on, and the model’s weights, are closed or open source. These are important issues. But a basic question is rarely asked: should the technology be used at all — especially as a ‘solution’ to complex, multifaceted challenges such as health care?
For example, LLM-based solutions have been proposed for health care in low- and middle-income countries, despite the fact that AI systems are known to exacerbate social biases (see J. Shaffer et al. BMJ Glob. Health 8, e013874; 2023). Understanding and solving global health inequalities requires getting to the root causes of ugly social realities. This includes confronting legacies of colonialism and slavery, and the preservation of asymmetries in power and wealth between the global north and south, among them the notion that the health or illness of some groups matters less than that of others.
It is much easier to put forward a simplistic ‘technical solution’ — under the pretence of doing something — than it is to confront the roots of these enormous challenges. What is needed is political willpower and just distribution of power and resources, not LLMs.
MUSHTAQ BILAL: Use it for structure, not content
When ChatGPT launched, I was too busy moving from Pakistan to Denmark for a postdoctoral position to pay serious attention to it. But I was closely following the conversation about it on social media. In January, my contact Rob Lennon, who had been experimenting with prompting, wrote a thread on X (formerly Twitter; see go.nature.com/3teexb1) on how best to use the chatbot for business purposes. It proved popular. I used Rob’s prompts as a model to explore uses in academic writing and posted a thread that went viral, too (go.nature.com/3swrztn). Many researchers wanted to learn about ChatGPT. Since then, I have developed ways of using it for academic purposes and shared them on X and the jobs-focused platform LinkedIn.
The most important thing I have emphasized is that generative AI is well suited to creating structure, but not content. LLMs are trained to predict the next word in a sentence. That means the content a chatbot generates is typically predictable — whereas original research is anything but.
Instead, ChatGPT can serve as a brainstorming partner. It will not give you any groundbreaking ideas, but through careful prompting, it can certainly help you to think in the right direction. It can also propose an outline for a research paper, which can serve as a starting point.
OpenAI recently launched custom versions of ChatGPT tailored for specific purposes, including teaching and research. For example, a custom ChatGPT can be created for a course and instructed to always base its answers on the course materials provided. This should prevent the chatbot from hallucinating, making it a useful resource for students.
SIDDHARTH KANKARIA: A tool for tailored teaching
At first, I was excited about ChatGPT’s promise for communicating science. It seemed that it could write clear, crisp and accessible summaries of scientific papers and help to simplify jargon. However, I soon realized that many of these purported applications merited a healthy dose of scrutiny and proofreading. I found it prudent to use ChatGPT in more targeted ways, being mindful of its pros and cons.
I had an opportunity to do so earlier this year, while teaching science communication and public engagement to secondary-school students. This is an area in which both creativity and critical thinking are crucial. As a firm proponent of ‘learning by doing’, I designed my sessions to be participatory and interactive. I used improvisation games, performances, debates and discussions to expose students to concepts in science communication — from storytelling and audience framing to ethical and social-justice considerations.
I used ChatGPT to brainstorm prompts, questions and content for classroom activities. For example, it quickly collated 50 scientific metaphors, such as for DNA (‘the blueprint of life’) and for gravity (‘spheres on a bed sheet’).
I was aware that many students would probably turn to ChatGPT for these activities and group projects. Rather than restrict their access to it — which would have felt disingenuous when I was relying on the chatbot myself — I encouraged them to use AI tools freely, but to reflect on their limitations. For a class on science writing, we collectively critiqued an anonymized mix of summaries of research papers written by students and by ChatGPT. This sparked interesting discussions on what constitutes a good opening sentence, the limitations of using AI tools and tips for improving one’s own writing.
This approach of embracing new technology gradually while still being acutely critical of its biases and pitfalls felt most prudent to me, at least in the context of teaching and communicating science.
CLAIRE MALONE: Not always reliable, but it does spark joy
A year ago, I was sceptical about how useful ChatGPT would be in my day-to-day work as a science communicator, which at its core involves presenting complex scientific ideas in an accessible manner. So far, many of my reservations have been well founded. For example, when I instructed the chatbot to rewrite the abstract of my PhD thesis in simpler terms, I was not impressed. It retained much of the jargon and failed to make key ideas accessible to a broad audience.
Yet it does have useful aspects. The crucial first step is knowing how to frame a question to avoid getting irrelevant results. I find ChatGPT an efficient means of getting a rough overview of a topic, which I can then drill into. And it is likely that each user’s experience will become increasingly personalized as ChatGPT gains in power and accuracy in the next few years.
In my view, it has an important part to play in sparking curiosity about a huge range of topics. It’s an immediate, interactive source of information — albeit not always guaranteed to be accurate. And its role is distinct from that of a journalist who, beyond fact-checking, considers wider implications and often shines a light on topics readers might not have thought to explore.
ETHAN MOLLICK: Embrace AI in teaching
Rather than skirting around AI in the classroom, I incorporated it into every assignment and lesson. On the basis of that experiment, and on the early research available on generative AI tools, this is what I think the future holds.
AI cheating will remain undetectable and widespread. AI detectors have repeatedly been shown to generate lots of false negatives and false positives — especially for students whose first language is not English (see go.nature.com/47am62d). Teachers will need to consider approaches other than homework to assess student learning.
AI technology will remain ubiquitous. Right now, ChatGPT-3.5 is free. Microsoft’s Bing and Google’s Bard are free. All of these LLM-powered systems give unprecedented writing and analytical power to everyone. Even if the technology did not develop further (and it will), I assume its use will become widespread and costs will remain reasonable.
AI will transform teaching. Students are already using LLMs as tutors and references. As one told me, “Why raise your hand in class when you can ask ChatGPT a question?” We are going to need to think deeply about how to incorporate these tools, with all their flaws and benefits, into class. We can do this in ways that benefit teachers, students and education as a whole. As models become more accurate and more powerful, they will probably take on a direct instructional role, but direct instruction is only a small part of what teachers do. Classrooms provide so much more — opportunities to practise learnt skills, collaborations on problem-solving, support from instructors and socializing.
Learning environments will continue to add value, even with excellent AI tutors, but this will require adopting approaches such as active learning and flipped classrooms (in which students are given course material before the class, during which the teacher instead facilitates group discussion). These are known to work well, but have been hard to implement because of the constraints that instructors face. AI could well kick-start this change.
We have dealt with such transformations before. When calculators were introduced in the 1970s, they completely altered how mathematics was taught. Now, education faces a much larger challenge, but one that provides opportunities as well as risks. Experimenting with AI in an ethical, appropriate way can lead to discovering the best ways to apply pedagogical principles to boost student learning.
FRANCISCO TUSTUMI: Transparency is needed
The abilities of ChatGPT and other generative AI systems have led some people to think that the tools might eventually take over human roles in reviewing and writing scientific articles. Indeed, these systems can, and probably will, serve in manuscript preparation and reviewing, including for data search. However, they have important limitations.
Firstly, ChatGPT is not a search engine — and it has been shown to provide wrong answers (S. Fergus et al. J. Chem. Educ. 100, 1672–1675; 2023). Another issue is that it is not transparent about how it constructs its texts. Scientific papers must have a clear and reproducible methodology. The source of information, search, selection, data extraction and reporting strategies should be detailed, empowering readers to critically evaluate not only the data in a manuscript but also its text.
Hopefully, future AI-based programs will be more amenable to such critical appraisal. Until then, they cannot be reliably used in manuscript writing and reviewing.