‘Eye Opening’: Chatbot Outperforms Ophthalmologists

by News7

An artificial intelligence (AI) chatbot largely outperformed a panel of specialist ophthalmologists when given prompts about glaucoma and retinal health, a comparative single-center study found.

The ChatGPT chatbot powered by GPT-4 scored better than the panelists on measures of diagnostic and treatment accuracy when it analyzed 20 real-life cases and considered 20 possible patient questions, reported Andy S. Huang, MD, of the Icahn School of Medicine at Mount Sinai in New York City, and colleagues in JAMA Ophthalmology.

Huang told MedPage Today that he had expected that the chatbot would do worse, “but there’s no place where people did better.” AI obviously can’t do surgery, he said, but its ability to answer questions and evaluate cases does raise “the question of whether this is a real threat to optometrists and ophthalmologists.”

The findings also provide more evidence that chatbots are getting better at offering reliable guidance regarding eye health. When researchers gave retinal health questions to a chatbot in January 2023, it bungled almost all the answers and even offered harmful advice. But the responses improved 2 weeks later as the chatbot evolved, and a similar study reported high levels of accuracy. Another study found that a chatbot’s responses to eye health questions from an online forum were about as accurate as those written by ophthalmologists.

The study by Huang’s team is one of many that researchers have launched in recent months to gauge the accuracy of a type of AI program known as a large language model (LLM), which analyzes vast arrays of text to learn how likely words are to occur next to each other.
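The next-word statistics described above can be illustrated with a toy sketch (the corpus and function names here are invented for illustration; real LLMs use neural networks over far longer contexts, not raw bigram counts):

```python
# Toy illustration of next-word likelihood: count which words follow which
# in a tiny corpus, then predict the most frequent successor.
from collections import Counter, defaultdict

corpus = "the eye doctor examined the eye and the retina".split()

# Map each word to a Counter of the words that follow it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def most_likely_next(word):
    """Return the most frequent word observed after `word`, or None."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(most_likely_next("the"))  # "eye" follows "the" twice, "retina" once
```

A model like GPT-4 generalizes this idea enormously, but the core task is the same: estimate which token is likely to come next.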

Huang said the new study was inspired by his own experiences experimenting with a chatbot: “I slowly realized that it was doing a better job than I was in a lot of tasks, and I started using it as an adjunct to improve my diagnoses,” he said.

The findings are “eye opening,” he said, adding that he doesn’t think ophthalmologists should turn in their eye charts and let AI robots take over. “Right now we’re hoping to use it as an adjunct, such as in places where there’s a significant number of complex patients or a high volume of patients,” Huang said. AI could also help primary care physicians triage patients with eye problems, he said.

Moving ahead, “it’s very important for ophthalmologists to understand how powerful these large language models are for fact-checking yourself and significantly improving your workflow,” Huang said. “This tool has been tremendously helpful for me with triaging or just improving my thoughts and diagnostic abilities.”

In an accompanying commentary, Benjamin K. Young, MD, MS, of Casey Eye Institute of Oregon Health & Science University in Portland, and Peter Y. Zhao, MD, of New England Eye Center of Tufts University School of Medicine in Boston, said the study “presents proof of concept that patients can copy the summarized history, examination, and clinical data from their own notes and ask version 4 to produce its own assessment and plan to cross-check their physician’s knowledge and judgment.”

Young and Zhao added that “medical errors will potentially be caught in this way,” and that “at this time, LLMs should be considered a potentially fast and useful tool to enhance the knowledge of a clinician who has examined a patient and synthesized their active clinical situation.” (The duo were co-authors of the previously mentioned January 2023 chatbot study.)

For the new study, the chatbot was told that an ophthalmologist was directing it to assist with “medical management and answering questions and cases.” The chatbot replied that it understood its job was to provide “concise, accurate, and precise medical information in the manner of an ophthalmologist.”

The chatbot analyzed extensive details from 20 real patients from Icahn School of Medicine at Mount Sinai-affiliated clinics (10 glaucoma cases and 10 retinal cases) and developed treatment plans. The chatbot also considered 20 questions randomly derived from the American Academy of Ophthalmology’s list of commonly asked questions.

The researchers then asked 12 fellowship-trained retinal and glaucoma specialists and three senior trainees (ages 31 to 67 years) from eye clinics affiliated with the Department of Ophthalmology at Icahn School of Medicine to respond to the same prompts. Panelists, blinded to authorship, evaluated all responses except their own on scales of accuracy (1-10) and medical completeness (1-6).

The combined question-case mean ranks for accuracy were 506.2 for the chatbot and 403.4 for the glaucoma specialists (n=831, Mann-Whitney U=27,976.5, P
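A mean-rank comparison of this kind (the Mann-Whitney U test) can be sketched in pure Python. The ratings below are invented for illustration, not the study's data:

```python
# Illustration of a Mann-Whitney-style mean-rank comparison: pool the two
# groups' ratings, rank them (ties share the average rank), and compare
# each group's mean rank. Higher mean rank = higher ratings overall.

def mean_ranks(group_a, group_b):
    """Return (mean rank of A, mean rank of B, U statistic for A)."""
    pooled = sorted(group_a + group_b)
    # Assign each distinct value the average of the 1-based ranks it spans.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j
    n_a, n_b = len(group_a), len(group_b)
    r_a = sum(rank_of[x] for x in group_a)
    r_b = sum(rank_of[x] for x in group_b)
    u_a = r_a - n_a * (n_a + 1) / 2  # Mann-Whitney U for group A
    return r_a / n_a, r_b / n_b, u_a

chatbot = [9, 8, 9, 7, 8]  # invented accuracy ratings (1-10 scale)
panel   = [7, 6, 8, 7, 5]
print(mean_ranks(chatbot, panel))  # (7.4, 3.6, 22.0)
```

In the study, the chatbot's higher combined mean rank (506.2 vs 403.4) reflects the same idea at a much larger sample size.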
Source: MedPage Today