Modeling Variables With a Spike at Zero. Examples and Practical Recommendations.
Standard
Modeling Variables With a Spike at Zero. Examples and Practical Recommendations. / Lorenz, Eva; Jenkner, Carolin; Sauerbrei, Willi; Becher, Heiko.
In: AM J EPIDEMIOL, Vol. 185, No. 8, 15.04.2017, p. 650-660.
Research output: Contribution to journal › Journal article › Research › peer-review
RIS
TY - JOUR
T1 - Modeling Variables With a Spike at Zero. Examples and Practical Recommendations.
AU - Lorenz, Eva
AU - Jenkner, Carolin
AU - Sauerbrei, Willi
AU - Becher, Heiko
N1 - © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
PY - 2017/4/15
Y1 - 2017/4/15
N2 - In most epidemiologic studies and in clinical research generally, there are variables with a spike at zero, namely variables for which a proportion of individuals have zero exposure (e.g., never smokers) and among those exposed the variable has a continuous distribution. Different options exist for modeling such variables, such as categorization where the nonexposed form the reference group, or ignoring the spike by including the variable in the regression model with or without some transformation or modeling procedures. It has been shown that such situations can be analyzed by adding a binary indicator (exposed/nonexposed) to the regression model, and a method based on fractional polynomials with which to estimate a suitable functional form for the positive portion of the spike-at-zero variable distribution has been developed. In this paper, we compare different approaches using data from 3 case-control studies carried out in Germany: the Mammary Carcinoma Risk Factor Investigation (MARIE), a breast cancer study conducted in 2002-2005 (Flesch-Janys et al., Int J Cancer. 2008;123(4):933-941); the Rhein-Neckar Larynx Study, a study of laryngeal cancer conducted in 1998-2000 (Dietz et al., Int J Cancer. 2004;108(6):907-911); and a lung cancer study conducted in 1988-1993 (Jöckel et al., Int J Epidemiol. 1998;27(4):549-560). Strengths and limitations of different procedures are demonstrated, and some recommendations for practical use are given.
KW - Aged
KW - Asbestos
KW - Breast Neoplasms
KW - Case-Control Studies
KW - Construction Materials
KW - Data Interpretation, Statistical
KW - Dose-Response Relationship, Drug
KW - Dust
KW - Estrogen Replacement Therapy
KW - Female
KW - Humans
KW - Laryngeal Neoplasms
KW - Lung Neoplasms
KW - Male
KW - Middle Aged
KW - Models, Statistical
KW - Occupational Exposure
KW - Regression Analysis
KW - Risk Factors
KW - Journal Article
U2 - 10.1093/aje/kww122
DO - 10.1093/aje/kww122
M3 - Journal article
C2 - 28369154
VL - 185
SP - 650
EP - 660
JO - AM J EPIDEMIOL
JF - AM J EPIDEMIOL
SN - 0002-9262
IS - 8
ER -