Explainable AI to improve acceptance of convolutional neural networks for automatic classification of dopamine transporter SPECT in the diagnosis of clinically uncertain parkinsonian syndromes

Standard

Explainable AI to improve acceptance of convolutional neural networks for automatic classification of dopamine transporter SPECT in the diagnosis of clinically uncertain parkinsonian syndromes. / Nazari, Mahmood; Kluge, Andreas; Apostolova, Ivayla; Klutmann, Susanne; Kimiaei, Sharok; Schroeder, Michael; Buchert, Ralph.

In: EUR J NUCL MED MOL I, Vol. 49, No. 4, 03.2022, p. 1176-1186.

Research output: Contribution to journal › Journal article › Research › peer-review

Bibtex

@article{567c6b3c568c49ca814e91a281b95eb5,
title = "Explainable AI to improve acceptance of convolutional neural networks for automatic classification of dopamine transporter SPECT in the diagnosis of clinically uncertain parkinsonian syndromes",
abstract = "PURPOSE: Deep convolutional neural networks (CNN) provide high accuracy for automatic classification of dopamine transporter (DAT) SPECT images. However, CNN are inherently black-box in nature lacking any kind of explanation for their decisions. This limits their acceptance for clinical use. This study tested layer-wise relevance propagation (LRP) to explain CNN-based classification of DAT-SPECT in patients with clinically uncertain parkinsonian syndromes. METHODS: The study retrospectively included 1296 clinical DAT-SPECT with visual binary interpretation as {"}normal{"} or {"}reduced{"} by two experienced readers as standard-of-truth. A custom-made CNN was trained with 1008 randomly selected DAT-SPECT. The remaining 288 DAT-SPECT were used to assess classification performance of the CNN and to test LRP for explanation of the CNN-based classification. RESULTS: Overall accuracy, sensitivity, and specificity of the CNN were 95.8%, 92.8%, and 98.7%, respectively. LRP provided relevance maps that were easy to interpret in each individual DAT-SPECT. In particular, the putamen in the hemisphere most affected by nigrostriatal degeneration was the most relevant brain region for CNN-based classification in all reduced DAT-SPECT. Some misclassified DAT-SPECT showed an {"}inconsistent{"} relevance map more typical for the true class label. CONCLUSION: LRP is useful to provide explanation of CNN-based decisions in individual DAT-SPECT and, therefore, can be recommended to support CNN-based classification of DAT-SPECT in clinical routine. Total computation time of 3 s is compatible with busy clinical workflow. The utility of {"}inconsistent{"} relevance maps to identify misclassified cases requires further investigation.",
author = "Mahmood Nazari and Andreas Kluge and Ivayla Apostolova and Susanne Klutmann and Sharok Kimiaei and Michael Schroeder and Ralph Buchert",
note = "{\textcopyright} 2021. The Author(s).",
year = "2022",
month = mar,
doi = "10.1007/s00259-021-05569-9",
language = "English",
volume = "49",
pages = "1176--1186",
journal = "EUR J NUCL MED MOL I",
issn = "1619-7070",
publisher = "Springer",
number = "4",

}

RIS

TY - JOUR

T1 - Explainable AI to improve acceptance of convolutional neural networks for automatic classification of dopamine transporter SPECT in the diagnosis of clinically uncertain parkinsonian syndromes

AU - Nazari, Mahmood

AU - Kluge, Andreas

AU - Apostolova, Ivayla

AU - Klutmann, Susanne

AU - Kimiaei, Sharok

AU - Schroeder, Michael

AU - Buchert, Ralph

N1 - © 2021. The Author(s).

PY - 2022/3

Y1 - 2022/3

N2 - PURPOSE: Deep convolutional neural networks (CNN) provide high accuracy for automatic classification of dopamine transporter (DAT) SPECT images. However, CNN are inherently black-box in nature lacking any kind of explanation for their decisions. This limits their acceptance for clinical use. This study tested layer-wise relevance propagation (LRP) to explain CNN-based classification of DAT-SPECT in patients with clinically uncertain parkinsonian syndromes. METHODS: The study retrospectively included 1296 clinical DAT-SPECT with visual binary interpretation as "normal" or "reduced" by two experienced readers as standard-of-truth. A custom-made CNN was trained with 1008 randomly selected DAT-SPECT. The remaining 288 DAT-SPECT were used to assess classification performance of the CNN and to test LRP for explanation of the CNN-based classification. RESULTS: Overall accuracy, sensitivity, and specificity of the CNN were 95.8%, 92.8%, and 98.7%, respectively. LRP provided relevance maps that were easy to interpret in each individual DAT-SPECT. In particular, the putamen in the hemisphere most affected by nigrostriatal degeneration was the most relevant brain region for CNN-based classification in all reduced DAT-SPECT. Some misclassified DAT-SPECT showed an "inconsistent" relevance map more typical for the true class label. CONCLUSION: LRP is useful to provide explanation of CNN-based decisions in individual DAT-SPECT and, therefore, can be recommended to support CNN-based classification of DAT-SPECT in clinical routine. Total computation time of 3 s is compatible with busy clinical workflow. The utility of "inconsistent" relevance maps to identify misclassified cases requires further investigation.

AB - PURPOSE: Deep convolutional neural networks (CNN) provide high accuracy for automatic classification of dopamine transporter (DAT) SPECT images. However, CNN are inherently black-box in nature lacking any kind of explanation for their decisions. This limits their acceptance for clinical use. This study tested layer-wise relevance propagation (LRP) to explain CNN-based classification of DAT-SPECT in patients with clinically uncertain parkinsonian syndromes. METHODS: The study retrospectively included 1296 clinical DAT-SPECT with visual binary interpretation as "normal" or "reduced" by two experienced readers as standard-of-truth. A custom-made CNN was trained with 1008 randomly selected DAT-SPECT. The remaining 288 DAT-SPECT were used to assess classification performance of the CNN and to test LRP for explanation of the CNN-based classification. RESULTS: Overall accuracy, sensitivity, and specificity of the CNN were 95.8%, 92.8%, and 98.7%, respectively. LRP provided relevance maps that were easy to interpret in each individual DAT-SPECT. In particular, the putamen in the hemisphere most affected by nigrostriatal degeneration was the most relevant brain region for CNN-based classification in all reduced DAT-SPECT. Some misclassified DAT-SPECT showed an "inconsistent" relevance map more typical for the true class label. CONCLUSION: LRP is useful to provide explanation of CNN-based decisions in individual DAT-SPECT and, therefore, can be recommended to support CNN-based classification of DAT-SPECT in clinical routine. Total computation time of 3 s is compatible with busy clinical workflow. The utility of "inconsistent" relevance maps to identify misclassified cases requires further investigation.

U2 - 10.1007/s00259-021-05569-9

DO - 10.1007/s00259-021-05569-9

M3 - Journal article

C2 - 34651223

VL - 49

SP - 1176

EP - 1186

JO - EUR J NUCL MED MOL I

JF - EUR J NUCL MED MOL I

SN - 1619-7070

IS - 4

ER -