Summary: A new study has tested the diagnostic prowess of generative AI, specifically the GPT-4 chatbot, with promising results.
The study assessed the AI’s diagnostic accuracy on complex medical cases: GPT-4 correctly identified the main diagnosis nearly 40% of the time and included the correct diagnosis in its list of potential diagnoses in 64% of difficult cases.
The success of artificial intelligence in this study could provide new insights into its potential clinical applications. However, more research is needed to address the benefits, optimal use and limitations of this technology.
- In a study involving 70 complex case reports, GPT-4 correctly matched the final diagnosis 39% of the time.
- GPT-4 included the correct diagnosis in its differential list (a list of potential conditions based on patients’ symptoms, medical history, and clinical findings) in 64% of cases.
- Despite the promising results, the researchers underline the importance of further investigations to understand the optimal use, benefits and limitations of AI in the clinical setting.
In a recent experiment published in JAMA, physician-researchers from Beth Israel Deaconess Medical Center (BIDMC) tested the ability of a well-known publicly available chatbot to make accurate diagnoses in difficult medical cases.
The team found that the generative AI, GPT-4, selected the correct diagnosis as its top diagnosis nearly 40% of the time and included the correct diagnosis in its list of potential diagnoses in nearly two-thirds of difficult cases.
Generative AI refers to a type of AI that uses models and information it has been trained on to create new content, rather than simply processing and analyzing existing data.
Some of the best known examples of generative artificial intelligence are so-called chatbots, which use a branch of artificial intelligence called natural language processing (NLP) that allows computers to understand, interpret and generate human-like language. Generative AI chatbots are powerful tools poised to revolutionize the creative industries, education, customer service and more.
However, little is known about their potential benefits in clinical settings, such as complex diagnostic reasoning.
“Recent advances in AI have led to generative AI models that can provide detailed text-based answers that score highly on standardized medical exams,” said Adam Rodman, MD, MPH, co-director of the Innovations in Media and Education Delivery (iMED) Initiative at BIDMC and Instructor of Medicine at Harvard Medical School.
“We wanted to know if such a generative model could think like a doctor, so we asked one to solve standardized complex diagnostic cases used for educational purposes. It worked really, really well.”
To evaluate the chatbot’s diagnostic capabilities, Rodman and colleagues used clinicopathological case conferences (CPCs), a series of complex and challenging patient cases, including relevant clinical and laboratory data, imaging studies, and histopathological findings, published in the New England Journal of Medicine for educational purposes.
Across the 70 CPC cases evaluated, the AI exactly matched the final CPC diagnosis in 27 (39%) cases. In 64% of cases, the final CPC diagnosis was included in the AI’s differential, a list of possible conditions that could explain the patient’s symptoms, medical history, clinical findings, and laboratory or imaging findings.
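As a quick check of the arithmetic behind these figures, here is a minimal Python sketch. The helper `percent` and the variable names are illustrative and do not come from the study. The article reports only the 64% figure for the differential, not a case count, so the sketch infers which count out of 70 rounds to 64%. That inferred count is an assumption, not a number stated in the article.

```python
def percent(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


TOTAL_CASES = 70    # CPC cases evaluated (from the article)
EXACT_MATCHES = 27  # cases where the top diagnosis matched (from the article)

# Share of cases in which the AI's lead diagnosis matched the final diagnosis.
top_diagnosis_rate = percent(EXACT_MATCHES, TOTAL_CASES)

# Assumption: the article gives only the rounded 64% figure, so we look for
# every case count out of 70 that would round to that percentage.
DIFFERENTIAL_RATE = 64
candidate_counts = [
    n for n in range(TOTAL_CASES + 1)
    if percent(n, TOTAL_CASES) == DIFFERENTIAL_RATE
]

print(f"Top diagnosis correct: {top_diagnosis_rate}% of cases")
print(f"Case counts consistent with {DIFFERENTIAL_RATE}%: {candidate_counts}")
```

Under these assumptions, 27 of 70 rounds to 39%, and the only case count consistent with 64% is 45 of 70.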
“While chatbots are no substitute for the experience and knowledge of a trained medical professional, generative AI is a promising potential complement to human cognition in diagnosis,” said first author Zahir Kanjee, MD, MPH, a hospitalist at BIDMC and assistant professor of medicine at Harvard Medical School.
“It has the potential to help doctors make sense of complex medical data and broaden or refine our diagnostic thinking. We need more research into the optimal uses, benefits and limitations of this technology, and many privacy issues need to be addressed, but these are exciting findings for the future of patient diagnosis and care.”
“Our study adds to a growing body of literature demonstrating the promising capabilities of AI technology,” said co-author Byron Crowe, MD, an internal medicine physician at BIDMC and instructor of medicine at Harvard Medical School.
“Further investigation will help us better understand how these new AI models could transform healthcare delivery.”
This work did not receive separate funding or sponsorship. Kanjee reports royalties for edited books and paid advisory board membership for non-AI-related medical education products from Wolters Kluwer, as well as CME fees from Oakstone Publishing. Crowe reports employment by Solera Health outside the submitted work. Rodman discloses no conflicts of interest.
About this ChatGPT and AI research news
Author: Chloe Meck
Contact: Chloe Meck – BIDMC
Image: The image is credited to Neuroscience News
Original research: Closed access.
“Accuracy of a Generative Artificial Intelligence Model in a Complex Diagnostic Challenge” by Adam Rodman et al. JAMA
Accuracy of a generative AI model in a complex diagnostic challenge
Recent advances in artificial intelligence (AI) have led to generative models capable of providing accurate and detailed text responses to written requests (chats). These models score highly on standardized medical exams.
Less is known about their performance in clinical applications such as complex diagnostic reasoning. We evaluated the accuracy of one of these models (Generative Pre-trained Transformer 4 [GPT-4]) in a series of diagnostically difficult cases.