Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy
Standard
Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy. / Gräf, Markus; Knitza, Johannes; Leipe, Jan; Krusche, Martin; Welcker, Martin; Kuhn, Sebastian; Mucke, Johanna; Hueber, Axel J.; Hornig, Johannes; Klemm, Philipp; Kleinert, Stefan; Aries, Peer; Vuillerme, Nicolas; Simon, David; Kleyer, Arnd; Schett, Georg; Callhoff, Johanna.
in: RHEUMATOL INT, Vol. 42, No. 12, 12.2022, pp. 2167-2176.
Research output: SCORING: Contribution to journal/newspaper › SCORING: Journal article › Research › Peer-reviewed
RIS
TY - JOUR
T1 - Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy
AU - Gräf, Markus
AU - Knitza, Johannes
AU - Leipe, Jan
AU - Krusche, Martin
AU - Welcker, Martin
AU - Kuhn, Sebastian
AU - Mucke, Johanna
AU - Hueber, Axel J.
AU - Hornig, Johannes
AU - Klemm, Philipp
AU - Kleinert, Stefan
AU - Aries, Peer
AU - Vuillerme, Nicolas
AU - Simon, David
AU - Kleyer, Arnd
AU - Schett, Georg
AU - Callhoff, Johanna
PY - 2022/12
Y1 - 2022/12
N2 - Symptom checkers are increasingly used to assess new symptoms and navigate the health care system. The aim of this study was to compare the accuracy of an artificial intelligence (AI)-based symptom checker (Ada) and physicians regarding the presence/absence of an inflammatory rheumatic disease (IRD). In this survey study, German-speaking physicians with prior rheumatology working experience were asked to determine IRD presence/absence and suggest diagnoses for 20 different real-world patient vignettes, which included only basic health and symptom-related medical history. IRD detection rate and suggested diagnoses of participants and Ada were compared to the gold standard, the final rheumatologists' diagnosis, reported on the discharge summary report. A total of 132 vignettes were completed by 33 physicians (mean rheumatology working experience 8.8 (SD 7.1) years). Ada's diagnostic accuracy (IRD) was significantly higher compared to physicians (70 vs 54%, p = 0.002) according to top diagnosis. Ada listed the correct diagnosis more often compared to physicians (54 vs 32%, p < 0.001) as top diagnosis as well as among the top 3 diagnoses (59 vs 42%, p < 0.001). Work experience was not related to suggesting the correct diagnosis or IRD status. Confined to basic health and symptom-related medical history, the diagnostic accuracy of physicians was lower compared to an AI-based symptom checker. These results highlight the potential of using symptom checkers early during the patient journey and importance of access to complete and sufficient patient information to establish a correct diagnosis.
AB - Symptom checkers are increasingly used to assess new symptoms and navigate the health care system. The aim of this study was to compare the accuracy of an artificial intelligence (AI)-based symptom checker (Ada) and physicians regarding the presence/absence of an inflammatory rheumatic disease (IRD). In this survey study, German-speaking physicians with prior rheumatology working experience were asked to determine IRD presence/absence and suggest diagnoses for 20 different real-world patient vignettes, which included only basic health and symptom-related medical history. IRD detection rate and suggested diagnoses of participants and Ada were compared to the gold standard, the final rheumatologists' diagnosis, reported on the discharge summary report. A total of 132 vignettes were completed by 33 physicians (mean rheumatology working experience 8.8 (SD 7.1) years). Ada's diagnostic accuracy (IRD) was significantly higher compared to physicians (70 vs 54%, p = 0.002) according to top diagnosis. Ada listed the correct diagnosis more often compared to physicians (54 vs 32%, p < 0.001) as top diagnosis as well as among the top 3 diagnoses (59 vs 42%, p < 0.001). Work experience was not related to suggesting the correct diagnosis or IRD status. Confined to basic health and symptom-related medical history, the diagnostic accuracy of physicians was lower compared to an AI-based symptom checker. These results highlight the potential of using symptom checkers early during the patient journey and importance of access to complete and sufficient patient information to establish a correct diagnosis.
U2 - 10.1007/s00296-022-05202-4
DO - 10.1007/s00296-022-05202-4
M3 - SCORING: Journal article
VL - 42
SP - 2167
EP - 2176
JO - RHEUMATOL INT
JF - RHEUMATOL INT
SN - 0172-8172
IS - 12
ER -