Modeling continuous covariates with a "spike" at zero: Bivariate approaches

Standard

Modeling continuous covariates with a "spike" at zero: Bivariate approaches. / Jenkner, Carolin; Lorenz, Eva; Becher, Heiko; Sauerbrei, Willi.

In: BIOMETRICAL J, Vol. 58, No. 4, 07.2016, p. 783-796.

Research output: Contribution to journal › Journal article › Research › peer-review

Harvard

Jenkner, C, Lorenz, E, Becher, H & Sauerbrei, W 2016, 'Modeling continuous covariates with a "spike" at zero: Bivariate approaches', BIOMETRICAL J, vol. 58, no. 4, pp. 783-796. https://doi.org/10.1002/bimj.201400112

Bibtex

@article{004de64b03fe4b879ed741db542d0158,
title = "Modeling continuous covariates with a {"}spike{"} at zero: Bivariate approaches",
abstract = "In epidemiology and clinical research, predictors often take value zero for a large amount of observations while the distribution of the remaining observations is continuous. These predictors are called variables with a spike at zero. Examples include smoking or alcohol consumption. Recently, an extension of the fractional polynomial (FP) procedure, a technique for modeling nonlinear relationships, was proposed to deal with such situations. To indicate whether or not a value is zero, a binary variable is added to the model. In a two stage procedure, called FP-spike, the necessity of the binary variable and/or the continuous FP function for the positive part are assessed for a suitable fit. In univariate analyses, the FP-spike procedure usually leads to functional relationships that are easy to interpret. This paper introduces four approaches for dealing with two variables with a spike at zero (SAZ). The methods depend on the bivariate distribution of zero and nonzero values. Bi-Sep is the simplest of the four bivariate approaches. It uses the univariate FP-spike procedure separately for the two SAZ variables. In Bi-D3, Bi-D1, and Bi-Sub, proportions of zeros in both variables are considered simultaneously in the binary indicators. Therefore, these strategies can account for correlated variables. The methods can be used for arbitrary distributions of the covariates. For illustration and comparison of results, data from a case-control study on laryngeal cancer, with smoking and alcohol intake as two SAZ variables, is considered. In addition, a possible extension to three or more SAZ variables is outlined. A combination of log-linear models for the analysis of the correlation in combination with the bivariate approaches is proposed.",
author = "Carolin Jenkner and Eva Lorenz and Heiko Becher and Willi Sauerbrei",
note = "{\textcopyright} 2016 WILEY-VCH Verlag GmbH \& Co. KGaA, Weinheim.",
year = "2016",
month = jul,
doi = "10.1002/bimj.201400112",
language = "English",
volume = "58",
pages = "783--796",
journal = "BIOMETRICAL J",
issn = "0323-3847",
publisher = "Wiley-VCH Verlag GmbH",
number = "4",

}

RIS

TY - JOUR

T1 - Modeling continuous covariates with a "spike" at zero: Bivariate approaches

AU - Jenkner, Carolin

AU - Lorenz, Eva

AU - Becher, Heiko

AU - Sauerbrei, Willi

N1 - © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

PY - 2016/7

Y1 - 2016/7

N2 - In epidemiology and clinical research, predictors often take value zero for a large amount of observations while the distribution of the remaining observations is continuous. These predictors are called variables with a spike at zero. Examples include smoking or alcohol consumption. Recently, an extension of the fractional polynomial (FP) procedure, a technique for modeling nonlinear relationships, was proposed to deal with such situations. To indicate whether or not a value is zero, a binary variable is added to the model. In a two stage procedure, called FP-spike, the necessity of the binary variable and/or the continuous FP function for the positive part are assessed for a suitable fit. In univariate analyses, the FP-spike procedure usually leads to functional relationships that are easy to interpret. This paper introduces four approaches for dealing with two variables with a spike at zero (SAZ). The methods depend on the bivariate distribution of zero and nonzero values. Bi-Sep is the simplest of the four bivariate approaches. It uses the univariate FP-spike procedure separately for the two SAZ variables. In Bi-D3, Bi-D1, and Bi-Sub, proportions of zeros in both variables are considered simultaneously in the binary indicators. Therefore, these strategies can account for correlated variables. The methods can be used for arbitrary distributions of the covariates. For illustration and comparison of results, data from a case-control study on laryngeal cancer, with smoking and alcohol intake as two SAZ variables, is considered. In addition, a possible extension to three or more SAZ variables is outlined. A combination of log-linear models for the analysis of the correlation in combination with the bivariate approaches is proposed.

AB - In epidemiology and clinical research, predictors often take value zero for a large amount of observations while the distribution of the remaining observations is continuous. These predictors are called variables with a spike at zero. Examples include smoking or alcohol consumption. Recently, an extension of the fractional polynomial (FP) procedure, a technique for modeling nonlinear relationships, was proposed to deal with such situations. To indicate whether or not a value is zero, a binary variable is added to the model. In a two stage procedure, called FP-spike, the necessity of the binary variable and/or the continuous FP function for the positive part are assessed for a suitable fit. In univariate analyses, the FP-spike procedure usually leads to functional relationships that are easy to interpret. This paper introduces four approaches for dealing with two variables with a spike at zero (SAZ). The methods depend on the bivariate distribution of zero and nonzero values. Bi-Sep is the simplest of the four bivariate approaches. It uses the univariate FP-spike procedure separately for the two SAZ variables. In Bi-D3, Bi-D1, and Bi-Sub, proportions of zeros in both variables are considered simultaneously in the binary indicators. Therefore, these strategies can account for correlated variables. The methods can be used for arbitrary distributions of the covariates. For illustration and comparison of results, data from a case-control study on laryngeal cancer, with smoking and alcohol intake as two SAZ variables, is considered. In addition, a possible extension to three or more SAZ variables is outlined. A combination of log-linear models for the analysis of the correlation in combination with the bivariate approaches is proposed.

U2 - 10.1002/bimj.201400112

DO - 10.1002/bimj.201400112

M3 - Journal article

C2 - 27072783

VL - 58

SP - 783

EP - 796

JO - BIOMETRICAL J

JF - BIOMETRICAL J

SN - 0323-3847

IS - 4

ER -