ChatGPT fails top medical exam in the US: What it could do and what it couldn't – ET HealthWorld


New Delhi: ChatGPT has failed another top US exam. OpenAI's acclaimed chatbot failed a urology exam in the US, according to a study published in the journal Urology Practice, which showed that ChatGPT achieved less than 30 percent correct responses on the widely used American Urological Association Self-Assessment Study Program (SASP) for urology.

“ChatGPT not only has a low rate of correct answers to clinical questions in urology practice, but also makes certain types of errors that pose a risk of spreading medical misinformation,” said Christopher M Deibert, of the University of Nebraska Medical Center, in the report.

What is the Self-Assessment Study Program for urology?
The AUA Self-Assessment Study Program (SASP) is a 150-question practice exam covering the core curriculum of medical knowledge in urology. The researchers excluded 15 questions that contained visual information such as pictures or graphics.

How ChatGPT performed in the test
Overall, ChatGPT reportedly gave correct answers to less than 30 percent of the SASP questions: 28.2 percent of multiple-choice questions and 26.7 percent of open-ended questions. The chatbot is also said to have given “indeterminate” responses to several questions, and its accuracy decreased when the model was asked to regenerate its answers to those questions.

The report said that for most open-ended questions, ChatGPT provided an explanation only for the answer it selected. ChatGPT's responses were longer than those provided by SASP, but “frequently redundant and cyclical in nature,” according to the authors.


“In general, ChatGPT often gave vague justifications with broad statements and rarely commented on details,” Dr Deibert said. Even when given feedback, “ChatGPT continually reiterated the original explanation despite being inaccurate,” the report said.

What doesn’t work for ChatGPT
The researchers suggest that while ChatGPT may perform well on tests that require recall of facts, it appears to fall short on questions related to clinical medicine, which require “simultaneously weighing multiple overlapping facts, situations, and outcomes.”

“Since LLMs are limited by their human training, more research is needed to understand their limitations and capabilities across multiple disciplines before they are made available for general use,” Dr Deibert said.

Published on Jun 8, 2023 at 1:50 PM IST
