Modeling Variables With a Spike at Zero. Examples and Practical Recommendations.

Standard

Modeling Variables With a Spike at Zero. Examples and Practical Recommendations. / Lorenz, Eva; Jenkner, Carolin; Sauerbrei, Willi; Becher, Heiko.

In: AM J EPIDEMIOL, Vol. 185, No. 8, 15.04.2017, p. 650-660.

Research output: SCORING: Contribution to journal › SCORING: Journal article › Research › peer-review

Harvard

Lorenz, E, Jenkner, C, Sauerbrei, W & Becher, H 2017, 'Modeling Variables With a Spike at Zero. Examples and Practical Recommendations.', AM J EPIDEMIOL, vol. 185, no. 8, pp. 650-660. https://doi.org/10.1093/aje/kww122

Bibtex

@article{cd7bfddb7ce7417a8346e935d871dffe,
title = "Modeling Variables With a Spike at Zero. Examples and Practical Recommendations.",
abstract = "In most epidemiologic studies and in clinical research generally, there are variables with a spike at zero, namely variables for which a proportion of individuals have zero exposure (e.g., never smokers) and among those exposed the variable has a continuous distribution. Different options exist for modeling such variables, such as categorization where the nonexposed form the reference group, or ignoring the spike by including the variable in the regression model with or without some transformation or modeling procedures. It has been shown that such situations can be analyzed by adding a binary indicator (exposed/nonexposed) to the regression model, and a method based on fractional polynomials with which to estimate a suitable functional form for the positive portion of the spike-at-zero variable distribution has been developed. In this paper, we compare different approaches using data from 3 case-control studies carried out in Germany: the Mammary Carcinoma Risk Factor Investigation (MARIE), a breast cancer study conducted in 2002-2005 (Flesch-Janys et al., Int J Cancer. 2008;123(4):933-941); the Rhein-Neckar Larynx Study, a study of laryngeal cancer conducted in 1998-2000 (Dietz et al., Int J Cancer. 2004;108(6):907-911); and a lung cancer study conducted in 1988-1993 (J{\"o}ckel et al., Int J Epidemiol. 1998;27(4):549-560). Strengths and limitations of different procedures are demonstrated, and some recommendations for practical use are given.",
keywords = "Aged, Asbestos, Breast Neoplasms, Case-Control Studies, Construction Materials, Data Interpretation, Statistical, Dose-Response Relationship, Drug, Dust, Estrogen Replacement Therapy, Female, Humans, Laryngeal Neoplasms, Lung Neoplasms, Male, Middle Aged, Models, Statistical, Occupational Exposure, Regression Analysis, Risk Factors, Journal Article",
author = "Eva Lorenz and Carolin Jenkner and Willi Sauerbrei and Heiko Becher",
note = "{\textcopyright} The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.",
year = "2017",
month = apr,
day = "15",
doi = "10.1093/aje/kww122",
language = "English",
volume = "185",
pages = "650--660",
journal = "AM J EPIDEMIOL",
issn = "0002-9262",
publisher = "Oxford University Press",
number = "8",

}

RIS

TY - JOUR

T1 - Modeling Variables With a Spike at Zero. Examples and Practical Recommendations.

AU - Lorenz, Eva

AU - Jenkner, Carolin

AU - Sauerbrei, Willi

AU - Becher, Heiko

N1 - © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

PY - 2017/4/15

Y1 - 2017/4/15

AB - In most epidemiologic studies and in clinical research generally, there are variables with a spike at zero, namely variables for which a proportion of individuals have zero exposure (e.g., never smokers) and among those exposed the variable has a continuous distribution. Different options exist for modeling such variables, such as categorization where the nonexposed form the reference group, or ignoring the spike by including the variable in the regression model with or without some transformation or modeling procedures. It has been shown that such situations can be analyzed by adding a binary indicator (exposed/nonexposed) to the regression model, and a method based on fractional polynomials with which to estimate a suitable functional form for the positive portion of the spike-at-zero variable distribution has been developed. In this paper, we compare different approaches using data from 3 case-control studies carried out in Germany: the Mammary Carcinoma Risk Factor Investigation (MARIE), a breast cancer study conducted in 2002-2005 (Flesch-Janys et al., Int J Cancer. 2008;123(4):933-941); the Rhein-Neckar Larynx Study, a study of laryngeal cancer conducted in 1998-2000 (Dietz et al., Int J Cancer. 2004;108(6):907-911); and a lung cancer study conducted in 1988-1993 (Jöckel et al., Int J Epidemiol. 1998;27(4):549-560). Strengths and limitations of different procedures are demonstrated, and some recommendations for practical use are given.

KW - Aged

KW - Asbestos

KW - Breast Neoplasms

KW - Case-Control Studies

KW - Construction Materials

KW - Data Interpretation, Statistical

KW - Dose-Response Relationship, Drug

KW - Dust

KW - Estrogen Replacement Therapy

KW - Female

KW - Humans

KW - Laryngeal Neoplasms

KW - Lung Neoplasms

KW - Male

KW - Middle Aged

KW - Models, Statistical

KW - Occupational Exposure

KW - Regression Analysis

KW - Risk Factors

KW - Journal Article

U2 - 10.1093/aje/kww122

DO - 10.1093/aje/kww122

M3 - SCORING: Journal article

C2 - 28369154

VL - 185

SP - 650

EP - 660

JO - AM J EPIDEMIOL

JF - AM J EPIDEMIOL

SN - 0002-9262

IS - 8

ER -