Semi-Automated Mapping of German Study Data Concepts to an English Common Data Model

Standard

Semi-Automated Mapping of German Study Data Concepts to an English Common Data Model. / Chechulina, Anna; Carus, Jasmin; Breitfeld, Philipp; Gundler, Christopher; Hees, Hanna; Twerenbold, Raphael; Blankenberg, Stefan; Ückert, Frank; Nürnberg, Sylvia.

In: APPL SCI-BASEL, Vol. 13, No. 14, 13.07.2023, p. 8159.

Publications: SCORING: Contribution to journal/newspaper > SCORING: Journal article > Research > Peer-reviewed

Bibtex

@article{1c5d108dbe7146a991b044fc37a1c14d,
title = "Semi-Automated Mapping of German Study Data Concepts to an English Common Data Model",
abstract = "The standardization of data from medical studies and hospital information systems to a common data model such as the Observational Medical Outcomes Partnership (OMOP) model can help make large datasets available for analysis using artificial intelligence approaches. Commonly, automatic mapping without intervention from domain experts delivers poor results. Further challenges arise from the need for translation of non-English medical data. Here, we report the establishment of a mapping approach which automatically translates German data variable names into English and suggests OMOP concepts. The approach was set up using study data from the Hamburg City Health Study. It was evaluated against the current standard, refined, and tested on a separate dataset. Furthermore, different types of graphical user interfaces for the selection of suggested OMOP concepts were created and assessed. Compared to the current standard our approach performs slightly better. Its main advantage lies in the automatic processing of German phrases into English OMOP concept suggestions, operating without the need for human intervention. Challenges still lie in the adequate translation of nonstandard expressions, as well as in the resolution of abbreviations into long names.",
author = "Anna Chechulina and Jasmin Carus and Philipp Breitfeld and Christopher Gundler and Hanna Hees and Raphael Twerenbold and Stefan Blankenberg and Frank {\"U}ckert and Sylvia N{\"u}rnberg",
year = "2023",
month = jul,
day = "13",
doi = "10.3390/app13148159",
language = "English",
volume = "13",
pages = "8159",
journal = "APPL SCI-BASEL",
issn = "2076-3417",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "14",
}

RIS

TY - JOUR
T1 - Semi-Automated Mapping of German Study Data Concepts to an English Common Data Model
AU - Chechulina, Anna
AU - Carus, Jasmin
AU - Breitfeld, Philipp
AU - Gundler, Christopher
AU - Hees, Hanna
AU - Twerenbold, Raphael
AU - Blankenberg, Stefan
AU - Ückert, Frank
AU - Nürnberg, Sylvia
PY - 2023/7/13
Y1 - 2023/7/13
N2 - The standardization of data from medical studies and hospital information systems to a common data model such as the Observational Medical Outcomes Partnership (OMOP) model can help make large datasets available for analysis using artificial intelligence approaches. Commonly, automatic mapping without intervention from domain experts delivers poor results. Further challenges arise from the need for translation of non-English medical data. Here, we report the establishment of a mapping approach which automatically translates German data variable names into English and suggests OMOP concepts. The approach was set up using study data from the Hamburg City Health Study. It was evaluated against the current standard, refined, and tested on a separate dataset. Furthermore, different types of graphical user interfaces for the selection of suggested OMOP concepts were created and assessed. Compared to the current standard our approach performs slightly better. Its main advantage lies in the automatic processing of German phrases into English OMOP concept suggestions, operating without the need for human intervention. Challenges still lie in the adequate translation of nonstandard expressions, as well as in the resolution of abbreviations into long names.

U2 - 10.3390/app13148159
DO - 10.3390/app13148159
M3 - SCORING: Journal article
VL - 13
SP - 8159
JO - APPL SCI-BASEL
JF - APPL SCI-BASEL
SN - 2076-3417
IS - 14
ER -