According to a new study published in Radiology, ChatGPT nearly passed a radiology board-style examination. The study highlighted the great potential that ChatGPT carries across the field, but it also highlighted its limitations.

 

The team tested ChatGPT, based on the GPT-3.5 model, to assess its performance on radiology board-style examination questions. They designed and utilised a series of 150 multiple-choice questions matching the style, content, and difficulty of the Canadian Royal College and American Board of Radiology examinations.

 

The questions, which excluded images, were grouped by type and topic to gain deeper insight into performance. Question types included lower-order thinking (knowledge recall, basic understanding) and higher-order thinking (apply, analyse, synthesise); topics were physics and clinical. The higher-order thinking questions were further subclassified by type: description of imaging findings, clinical management, calculation and classification, and disease associations.

 

ChatGPT’s performance was evaluated overall as well as by question type and topic.

 

GPT-3.5 answered 69% of the questions correctly (104 of 150), just below the passing grade of 70% used by the Royal College in Canada. The model performed well on lower-order thinking questions, scoring 84% (51 of 61), but struggled with questions involving higher-order thinking, scoring 60% (53 of 89).

 

Among the higher-order questions, the model struggled with those involving description of imaging findings (61%), calculation and classification (25%) and application of concepts (30%). However, ChatGPT performed well on higher-order clinical management questions, scoring 89% (16 of 18).

 

Overall, the study demonstrated a marked improvement in ChatGPT’s performance on healthcare-related questions, illustrating the expanding role of large language models in radiology.

 

However, radiologists must be aware of ChatGPT’s limitations. The model has a dangerous tendency to phrase inaccurate responses confidently, and therefore radiologists cannot rely on it in clinical practice at present.

 

Source: Radiology

Image Credit: iStock


References:

Bhayana R et al. (2023) Performance of ChatGPT on a Radiology Board-style Examination: Insights into Current Strengths and Limitations. Radiology.


