Machine learning-based identification of risk-factor signatures for undiagnosed atrial fibrillation in primary prevention and post-stroke in clinical practice

Standard

Machine learning-based identification of risk-factor signatures for undiagnosed atrial fibrillation in primary prevention and post-stroke in clinical practice. / Schnabel, Renate B; Witt, Henning; Walker, Jochen; Ludwig, Marion; Geelhoed, Bastian; Kossack, Nils; Schild, Marie; Miller, Robert; Kirchhof, Paulus.

In: EUR HEART J-QUAL CAR, Vol. 9, No. 1, 13.12.2022, p. 16-23.

Research output: Contribution to journal › Journal article › Research › peer-review

Bibtex

@article{e64992db12d64827acb1d8f017f456b2,
title = "Machine learning-based identification of risk-factor signatures for undiagnosed atrial fibrillation in primary prevention and post-stroke in clinical practice",
abstract = "AIMS: Atrial fibrillation (AF) carries a substantial risk of ischemic stroke and other complications, and estimates suggest that over a third of cases remain undiagnosed. AF detection is particularly pressing in stroke survivors. To tailor AF screening efforts, we explored German health claims data for routinely available predictors of incident AF in primary care and post-stroke using machine learning methods. METHODS AND RESULTS: We combined AF predictors in patients over 45 years of age using claims data in the InGef database (n = 1 476 391) for (i) incident AF and (ii) AF post-stroke, using machine learning techniques. Between 2013 and 2016, new-onset AF was diagnosed in 98 958 patients (6.7%). Published risk factors for AF including male sex, hypertension, heart failure, valvular heart disease, and chronic kidney disease were confirmed. Component-wise gradient boosting identified additional predictors for AF from ICD-codes available in ambulatory care. The area under the curve (AUC) of the final, condensed model consisting of 13 predictors, was 0.829 (95% confidence interval (CI) 0.826-0.833) in the internal validation, and 0.755 (95% CI 0.603-0.890) in a prospective validation cohort (n = 661). The AUC for post-stroke AF was 0.67 (95% CI 0.651-0.689) in the internal validation data set, and 0.766 (95% CI 0.731-0.800) in the prospective clinical cohort. CONCLUSION: ICD-coded clinical variables selected by machine learning can improve the identification of patients at risk of newly diagnosed AF. Using this readily available, automatically coded information can target AF screening efforts to identify high-risk populations in primary care and stroke survivors.",
keywords = "Humans, Male, Atrial Fibrillation/complications, Risk Assessment, Stroke/diagnosis, Risk Factors, Machine Learning, Primary Prevention",
author = "Schnabel, {Renate B} and Henning Witt and Jochen Walker and Marion Ludwig and Bastian Geelhoed and Nils Kossack and Marie Schild and Robert Miller and Paulus Kirchhof",
note = "{\textcopyright} The Author(s) 2022. Published by Oxford University Press on behalf of the European Society of Cardiology.",
year = "2022",
month = dec,
day = "13",
doi = "10.1093/ehjqcco/qcac013",
language = "English",
volume = "9",
pages = "16--23",
journal = "EUR HEART J-QUAL CAR",
issn = "2058-5225",
publisher = "Oxford University Press",
number = "1",

}

RIS

TY - JOUR

T1 - Machine learning-based identification of risk-factor signatures for undiagnosed atrial fibrillation in primary prevention and post-stroke in clinical practice

AU - Schnabel, Renate B

AU - Witt, Henning

AU - Walker, Jochen

AU - Ludwig, Marion

AU - Geelhoed, Bastian

AU - Kossack, Nils

AU - Schild, Marie

AU - Miller, Robert

AU - Kirchhof, Paulus

N1 - © The Author(s) 2022. Published by Oxford University Press on behalf of the European Society of Cardiology.

PY - 2022/12/13

Y1 - 2022/12/13

N2 - AIMS: Atrial fibrillation (AF) carries a substantial risk of ischemic stroke and other complications, and estimates suggest that over a third of cases remain undiagnosed. AF detection is particularly pressing in stroke survivors. To tailor AF screening efforts, we explored German health claims data for routinely available predictors of incident AF in primary care and post-stroke using machine learning methods. METHODS AND RESULTS: We combined AF predictors in patients over 45 years of age using claims data in the InGef database (n = 1 476 391) for (i) incident AF and (ii) AF post-stroke, using machine learning techniques. Between 2013 and 2016, new-onset AF was diagnosed in 98 958 patients (6.7%). Published risk factors for AF including male sex, hypertension, heart failure, valvular heart disease, and chronic kidney disease were confirmed. Component-wise gradient boosting identified additional predictors for AF from ICD-codes available in ambulatory care. The area under the curve (AUC) of the final, condensed model consisting of 13 predictors, was 0.829 (95% confidence interval (CI) 0.826-0.833) in the internal validation, and 0.755 (95% CI 0.603-0.890) in a prospective validation cohort (n = 661). The AUC for post-stroke AF was 0.67 (95% CI 0.651-0.689) in the internal validation data set, and 0.766 (95% CI 0.731-0.800) in the prospective clinical cohort. CONCLUSION: ICD-coded clinical variables selected by machine learning can improve the identification of patients at risk of newly diagnosed AF. Using this readily available, automatically coded information can target AF screening efforts to identify high-risk populations in primary care and stroke survivors.

AB - AIMS: Atrial fibrillation (AF) carries a substantial risk of ischemic stroke and other complications, and estimates suggest that over a third of cases remain undiagnosed. AF detection is particularly pressing in stroke survivors. To tailor AF screening efforts, we explored German health claims data for routinely available predictors of incident AF in primary care and post-stroke using machine learning methods. METHODS AND RESULTS: We combined AF predictors in patients over 45 years of age using claims data in the InGef database (n = 1 476 391) for (i) incident AF and (ii) AF post-stroke, using machine learning techniques. Between 2013 and 2016, new-onset AF was diagnosed in 98 958 patients (6.7%). Published risk factors for AF including male sex, hypertension, heart failure, valvular heart disease, and chronic kidney disease were confirmed. Component-wise gradient boosting identified additional predictors for AF from ICD-codes available in ambulatory care. The area under the curve (AUC) of the final, condensed model consisting of 13 predictors, was 0.829 (95% confidence interval (CI) 0.826-0.833) in the internal validation, and 0.755 (95% CI 0.603-0.890) in a prospective validation cohort (n = 661). The AUC for post-stroke AF was 0.67 (95% CI 0.651-0.689) in the internal validation data set, and 0.766 (95% CI 0.731-0.800) in the prospective clinical cohort. CONCLUSION: ICD-coded clinical variables selected by machine learning can improve the identification of patients at risk of newly diagnosed AF. Using this readily available, automatically coded information can target AF screening efforts to identify high-risk populations in primary care and stroke survivors.

KW - Humans

KW - Male

KW - Atrial Fibrillation/complications

KW - Risk Assessment

KW - Stroke/diagnosis

KW - Risk Factors

KW - Machine Learning

KW - Primary Prevention

U2 - 10.1093/ehjqcco/qcac013

DO - 10.1093/ehjqcco/qcac013

M3 - Journal article

C2 - 35436783

VL - 9

SP - 16

EP - 23

JO - EUR HEART J-QUAL CAR

JF - EUR HEART J-QUAL CAR

SN - 2058-5225

IS - 1

ER -