Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy

Standard

Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy. / Gräf, Markus; Knitza, Johannes; Leipe, Jan; Krusche, Martin; Welcker, Martin; Kuhn, Sebastian; Mucke, Johanna; Hueber, Axel J.; Hornig, Johannes; Klemm, Philipp; Kleinert, Stefan; Aries, Peer; Vuillerme, Nicolas; Simon, David; Kleyer, Arnd; Schett, Georg; Callhoff, Johanna.

In: RHEUMATOL INT, Vol. 42, No. 12, 12.2022, pp. 2167-2176.

Publications: SCORING: Contribution to journal/newspaper › SCORING: Journal article › Research › Peer review

Harvard

Gräf, M, Knitza, J, Leipe, J, Krusche, M, Welcker, M, Kuhn, S, Mucke, J, Hueber, AJ, Hornig, J, Klemm, P, Kleinert, S, Aries, P, Vuillerme, N, Simon, D, Kleyer, A, Schett, G & Callhoff, J 2022, 'Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy', RHEUMATOL INT, vol. 42, no. 12, pp. 2167-2176. https://doi.org/10.1007/s00296-022-05202-4

APA

Gräf, M., Knitza, J., Leipe, J., Krusche, M., Welcker, M., Kuhn, S., Mucke, J., Hueber, A. J., Hornig, J., Klemm, P., Kleinert, S., Aries, P., Vuillerme, N., Simon, D., Kleyer, A., Schett, G., & Callhoff, J. (2022). Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy. RHEUMATOL INT, 42(12), 2167-2176. https://doi.org/10.1007/s00296-022-05202-4

Vancouver

Gräf M, Knitza J, Leipe J, Krusche M, Welcker M, Kuhn S, Mucke J, Hueber AJ, Hornig J, Klemm P, Kleinert S, Aries P, Vuillerme N, Simon D, Kleyer A, Schett G, Callhoff J. Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy. RHEUMATOL INT. 2022 Dec;42(12):2167-2176. https://doi.org/10.1007/s00296-022-05202-4

Bibtex

@article{9de6b214629148868434537f05d28759,
title = "Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy",
abstract = "Symptom checkers are increasingly used to assess new symptoms and navigate the health care system. The aim of this study was to compare the accuracy of an artificial intelligence (AI)-based symptom checker (Ada) and physicians regarding the presence/absence of an inflammatory rheumatic disease (IRD). In this survey study, German-speaking physicians with prior rheumatology working experience were asked to determine IRD presence/absence and suggest diagnoses for 20 different real-world patient vignettes, which included only basic health and symptom-related medical history. IRD detection rate and suggested diagnoses of participants and Ada were compared to the gold standard, the final rheumatologists' diagnosis, reported on the discharge summary report. A total of 132 vignettes were completed by 33 physicians (mean rheumatology working experience 8.8 (SD 7.1) years). Ada's diagnostic accuracy (IRD) was significantly higher compared to physicians (70 vs 54%, p = 0.002) according to top diagnosis. Ada listed the correct diagnosis more often compared to physicians (54 vs 32%, p < 0.001) as top diagnosis as well as among the top 3 diagnoses (59 vs 42%, p < 0.001). Work experience was not related to suggesting the correct diagnosis or IRD status. Confined to basic health and symptom-related medical history, the diagnostic accuracy of physicians was lower compared to an AI-based symptom checker. These results highlight the potential of using symptom checkers early during the patient journey and importance of access to complete and sufficient patient information to establish a correct diagnosis.",
author = "Markus Gr{\"a}f and Johannes Knitza and Jan Leipe and Martin Krusche and Martin Welcker and Sebastian Kuhn and Johanna Mucke and Hueber, {Axel J.} and Johannes Hornig and Philipp Klemm and Stefan Kleinert and Peer Aries and Nicolas Vuillerme and David Simon and Arnd Kleyer and Georg Schett and Johanna Callhoff",
year = "2022",
month = dec,
doi = "10.1007/s00296-022-05202-4",
language = "English",
volume = "42",
pages = "2167--2176",
journal = "RHEUMATOL INT",
issn = "0172-8172",
publisher = "Springer",
number = "12",

}

RIS

TY - JOUR

T1 - Comparison of physician and artificial intelligence-based symptom checker diagnostic accuracy

AU - Gräf, Markus

AU - Knitza, Johannes

AU - Leipe, Jan

AU - Krusche, Martin

AU - Welcker, Martin

AU - Kuhn, Sebastian

AU - Mucke, Johanna

AU - Hueber, Axel J.

AU - Hornig, Johannes

AU - Klemm, Philipp

AU - Kleinert, Stefan

AU - Aries, Peer

AU - Vuillerme, Nicolas

AU - Simon, David

AU - Kleyer, Arnd

AU - Schett, Georg

AU - Callhoff, Johanna

PY - 2022/12

Y1 - 2022/12

N2 - Symptom checkers are increasingly used to assess new symptoms and navigate the health care system. The aim of this study was to compare the accuracy of an artificial intelligence (AI)-based symptom checker (Ada) and physicians regarding the presence/absence of an inflammatory rheumatic disease (IRD). In this survey study, German-speaking physicians with prior rheumatology working experience were asked to determine IRD presence/absence and suggest diagnoses for 20 different real-world patient vignettes, which included only basic health and symptom-related medical history. The IRD detection rate and suggested diagnoses of participants and Ada were compared to the gold standard, the final rheumatologists' diagnosis reported on the discharge summary report. A total of 132 vignettes were completed by 33 physicians (mean rheumatology working experience 8.8 (SD 7.1) years). Ada's diagnostic accuracy regarding IRD presence was significantly higher than that of physicians (70 vs 54%, p = 0.002), based on the top diagnosis. Ada also listed the correct diagnosis more often than physicians, both as the top diagnosis (54 vs 32%, p < 0.001) and among the top 3 diagnoses (59 vs 42%, p < 0.001). Work experience was not related to suggesting the correct diagnosis or IRD status. When confined to basic health and symptom-related medical history, physicians showed lower diagnostic accuracy than an AI-based symptom checker. These results highlight the potential of using symptom checkers early in the patient journey and the importance of access to complete and sufficient patient information to establish a correct diagnosis.

AB - Symptom checkers are increasingly used to assess new symptoms and navigate the health care system. The aim of this study was to compare the accuracy of an artificial intelligence (AI)-based symptom checker (Ada) and physicians regarding the presence/absence of an inflammatory rheumatic disease (IRD). In this survey study, German-speaking physicians with prior rheumatology working experience were asked to determine IRD presence/absence and suggest diagnoses for 20 different real-world patient vignettes, which included only basic health and symptom-related medical history. The IRD detection rate and suggested diagnoses of participants and Ada were compared to the gold standard, the final rheumatologists' diagnosis reported on the discharge summary report. A total of 132 vignettes were completed by 33 physicians (mean rheumatology working experience 8.8 (SD 7.1) years). Ada's diagnostic accuracy regarding IRD presence was significantly higher than that of physicians (70 vs 54%, p = 0.002), based on the top diagnosis. Ada also listed the correct diagnosis more often than physicians, both as the top diagnosis (54 vs 32%, p < 0.001) and among the top 3 diagnoses (59 vs 42%, p < 0.001). Work experience was not related to suggesting the correct diagnosis or IRD status. When confined to basic health and symptom-related medical history, physicians showed lower diagnostic accuracy than an AI-based symptom checker. These results highlight the potential of using symptom checkers early in the patient journey and the importance of access to complete and sufficient patient information to establish a correct diagnosis.

U2 - 10.1007/s00296-022-05202-4

DO - 10.1007/s00296-022-05202-4

M3 - SCORING: Journal article

VL - 42

SP - 2167

EP - 2176

JO - RHEUMATOL INT

JF - RHEUMATOL INT

SN - 0172-8172

IS - 12

ER -