Analyzing Illumina gene expression microarray data from different tissues: methodological aspects of data analysis in the MetaXpress consortium

Standard

Analyzing Illumina gene expression microarray data from different tissues: methodological aspects of data analysis in the MetaXpress consortium. / Schurmann, Claudia; Heim, Katharina; Schillert, Arne; Blankenberg, Stefan; Carstensen, Maren; Dörr, Marcus; Endlich, Karlhans; Felix, Stephan B; Gieger, Christian; Grallert, Harald; Herder, Christian; Hoffmann, Wolfgang; Homuth, Georg; Illig, Thomas; Kruppa, Jochen; Meitinger, Thomas; Müller, Christian; Nauck, Matthias; Peters, Annette; Rettig, Rainer; Roden, Michael; Strauch, Konstantin; Völker, Uwe; Völzke, Henry; Wahl, Simone; Wallaschofski, Henri; Wild, Philipp S; Zeller, Tanja; Teumer, Alexander; Prokisch, Holger; Ziegler, Andreas.

In: PLOS ONE, Vol. 7, No. 12, 2012, p. e50938.

Research output: Contribution to journal, Journal article, Research, peer-review

Harvard

Schurmann, C, Heim, K, Schillert, A, Blankenberg, S, Carstensen, M, Dörr, M, Endlich, K, Felix, SB, Gieger, C, Grallert, H, Herder, C, Hoffmann, W, Homuth, G, Illig, T, Kruppa, J, Meitinger, T, Müller, C, Nauck, M, Peters, A, Rettig, R, Roden, M, Strauch, K, Völker, U, Völzke, H, Wahl, S, Wallaschofski, H, Wild, PS, Zeller, T, Teumer, A, Prokisch, H & Ziegler, A 2012, 'Analyzing Illumina gene expression microarray data from different tissues: methodological aspects of data analysis in the MetaXpress consortium', PLOS ONE, vol. 7, no. 12, pp. e50938. https://doi.org/10.1371/journal.pone.0050938

APA

Schurmann, C., Heim, K., Schillert, A., Blankenberg, S., Carstensen, M., Dörr, M., Endlich, K., Felix, S. B., Gieger, C., Grallert, H., Herder, C., Hoffmann, W., Homuth, G., Illig, T., Kruppa, J., Meitinger, T., Müller, C., Nauck, M., Peters, A., ... Ziegler, A. (2012). Analyzing Illumina gene expression microarray data from different tissues: methodological aspects of data analysis in the MetaXpress consortium. PLOS ONE, 7(12), e50938. https://doi.org/10.1371/journal.pone.0050938

Vancouver

Bibtex

@article{82270c9b03b343e383a76e6ce5721c2f,
title = "Analyzing {Illumina} gene expression microarray data from different tissues: methodological aspects of data analysis in the {MetaXpress} consortium",
abstract = "Microarray profiling of gene expression is widely applied in molecular biology and functional genomics. Experimental and technical variations make meta-analysis of different studies challenging. In a total of 3358 samples, all from German population-based cohorts, we investigated the effect of data preprocessing and the variability due to sample processing in whole blood cell and blood monocyte gene expression data, measured on the Illumina HumanHT-12 v3 BeadChip array. Gene expression signal intensities were similar after applying the log2 or the variance-stabilizing transformation. In all cohorts, the first principal component (PC) explained more than 95% of the total variation. Technical factors substantially influenced signal intensity values, especially the Illumina chip assignment (33-48% of the variance), the RNA amplification batch (12-24%), the RNA isolation batch (16%), and the sample storage time, in particular the time between blood donation and RNA isolation for the whole blood cell samples (2-3%), and the time between RNA isolation and amplification for the monocyte samples (2%). White blood cell composition parameters were the strongest biological factors influencing the expression signal intensities in the whole blood cell samples (3%), followed by sex (1-2%) in both sample types. Known single nucleotide polymorphisms (SNPs) were located in 38% of the analyzed probe sequences and 4% of them included common SNPs (minor allele frequency >5%). Out of the tested SNPs, 1.4% significantly modified the probe-specific expression signals (Bonferroni-corrected p-value < 0.05), but in almost half of these events the signal intensities were even increased despite the occurrence of the mismatch. Thus, the vast majority of SNPs within probes had no significant effect on hybridization efficiency. In summary, adjustment for a few selected technical factors greatly improved reliability of gene expression analyses. Such adjustments are particularly required for meta-analyses.",
keywords = "Gene Expression, Gene Expression Profiling/methods, Germany, Humans, Oligonucleotide Array Sequence Analysis/methods, Polymorphism, Single Nucleotide, Reproducibility of Results",
author = "Claudia Schurmann and Katharina Heim and Arne Schillert and Stefan Blankenberg and Maren Carstensen and Marcus D{\"o}rr and Karlhans Endlich and Felix, {Stephan B} and Christian Gieger and Harald Grallert and Christian Herder and Wolfgang Hoffmann and Georg Homuth and Thomas Illig and Jochen Kruppa and Thomas Meitinger and Christian M{\"u}ller and Matthias Nauck and Annette Peters and Rainer Rettig and Michael Roden and Konstantin Strauch and Uwe V{\"o}lker and Henry V{\"o}lzke and Simone Wahl and Henri Wallaschofski and Wild, {Philipp S} and Tanja Zeller and Alexander Teumer and Holger Prokisch and Andreas Ziegler",
year = "2012",
doi = "10.1371/journal.pone.0050938",
language = "English",
volume = "7",
pages = "e50938",
journal = "PLOS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "12",

}

RIS

TY - JOUR

T1 - Analyzing Illumina gene expression microarray data from different tissues: methodological aspects of data analysis in the MetaXpress consortium

AU - Schurmann, Claudia

AU - Heim, Katharina

AU - Schillert, Arne

AU - Blankenberg, Stefan

AU - Carstensen, Maren

AU - Dörr, Marcus

AU - Endlich, Karlhans

AU - Felix, Stephan B

AU - Gieger, Christian

AU - Grallert, Harald

AU - Herder, Christian

AU - Hoffmann, Wolfgang

AU - Homuth, Georg

AU - Illig, Thomas

AU - Kruppa, Jochen

AU - Meitinger, Thomas

AU - Müller, Christian

AU - Nauck, Matthias

AU - Peters, Annette

AU - Rettig, Rainer

AU - Roden, Michael

AU - Strauch, Konstantin

AU - Völker, Uwe

AU - Völzke, Henry

AU - Wahl, Simone

AU - Wallaschofski, Henri

AU - Wild, Philipp S

AU - Zeller, Tanja

AU - Teumer, Alexander

AU - Prokisch, Holger

AU - Ziegler, Andreas

PY - 2012

Y1 - 2012

AB - Microarray profiling of gene expression is widely applied in molecular biology and functional genomics. Experimental and technical variations make meta-analysis of different studies challenging. In a total of 3358 samples, all from German population-based cohorts, we investigated the effect of data preprocessing and the variability due to sample processing in whole blood cell and blood monocyte gene expression data, measured on the Illumina HumanHT-12 v3 BeadChip array. Gene expression signal intensities were similar after applying the log2 or the variance-stabilizing transformation. In all cohorts, the first principal component (PC) explained more than 95% of the total variation. Technical factors substantially influenced signal intensity values, especially the Illumina chip assignment (33-48% of the variance), the RNA amplification batch (12-24%), the RNA isolation batch (16%), and the sample storage time, in particular the time between blood donation and RNA isolation for the whole blood cell samples (2-3%), and the time between RNA isolation and amplification for the monocyte samples (2%). White blood cell composition parameters were the strongest biological factors influencing the expression signal intensities in the whole blood cell samples (3%), followed by sex (1-2%) in both sample types. Known single nucleotide polymorphisms (SNPs) were located in 38% of the analyzed probe sequences and 4% of them included common SNPs (minor allele frequency >5%). Out of the tested SNPs, 1.4% significantly modified the probe-specific expression signals (Bonferroni-corrected p-value < 0.05), but in almost half of these events the signal intensities were even increased despite the occurrence of the mismatch. Thus, the vast majority of SNPs within probes had no significant effect on hybridization efficiency. In summary, adjustment for a few selected technical factors greatly improved reliability of gene expression analyses. Such adjustments are particularly required for meta-analyses.

KW - Gene Expression

KW - Gene Expression Profiling/methods

KW - Germany

KW - Humans

KW - Oligonucleotide Array Sequence Analysis/methods

KW - Polymorphism, Single Nucleotide

KW - Reproducibility of Results

U2 - 10.1371/journal.pone.0050938

DO - 10.1371/journal.pone.0050938

M3 - Journal article

C2 - 23236413

VL - 7

SP - e50938

JO - PLOS ONE

JF - PLOS ONE

SN - 1932-6203

IS - 12

ER -