Semiconductor sequencing: how many flows do you need?

Abstract

MOTIVATION: Semiconductor sequencing translates chemically encoded information (A, C, G or T) directly into voltage signals that are detected by a semiconductor device. During strand synthesis, nucleotides are supplied in cyclically repeated flows, one nucleotide type per flow, and the resulting changes in pH, and thereby in the electric potential of the reaction well, are detected. To minimize run time and cost, it is necessary to know the number of flows required for complete coverage of the templates.
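
As an illustration of the flow principle described above, the following minimal Python sketch counts the flows needed to synthesize a single strand. It is not the code of the published package: the function name flows_required, the simple four-flow cycle "TACG" and the convention that the input is the strand being synthesized are all assumptions made for this example, and real instruments may use other flow orders.

```python
def flows_required(read, flow_order="TACG"):
    """Count the flows needed to synthesize `read` completely.

    Assumptions of this sketch (not taken from the paper):
    - `read` is the strand being synthesized, written 5'->3';
    - `flow_order` is one cycle of nucleotide flows, repeated cyclically;
    - a flow incorporates the whole homopolymer run at the current position.
    """
    read = read.upper()
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by the flow order: {sorted(unknown)}")

    pos = 0    # number of bases incorporated so far
    flows = 0  # number of flows used so far
    while pos < len(read):
        nucleotide = flow_order[flows % len(flow_order)]
        flows += 1
        # Incorporate every consecutive base that matches the flowed nucleotide.
        while pos < len(read) and read[pos] == nucleotide:
            pos += 1
    return flows


if __name__ == "__main__":
    print(flows_required("TTAACCGG"))  # 4: one flow per homopolymer run
    print(flows_required("GCAT"))      # 13: each base waits for its turn in the cycle
```

The two calls show why the number of flows depends on base order and not only on length: a strand that follows the flow cycle finishes within one cycle, whereas a strand that runs against it needs almost a full cycle per base.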

RESULTS: We calculate the number of required flows in a random sequence model and present exact expressions for the cumulative distribution function, expected value and variance. Additionally, we provide an algorithm that calculates the number of required flows for a concrete list of amplicons, taking a BED file of genomic positions as input. We apply the algorithm to six amplicon panels used for targeted sequencing in cancer research. For four of these panels, the upper bounds obtained for the number of flows make it possible to increase instrument throughput from two chips to three chips per day.
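
The two ideas in this paragraph can be sketched in Python under simplifying assumptions. The example below does not reproduce the exact expressions of the paper or the interface of the R package: it assumes a simple cyclic flow order "TACG", independent and uniformly distributed bases, and amplicon sequences that have already been extracted from the genome (reading a BED file is omitted). All function names are hypothetical.

```python
import random
from statistics import mean, variance


def flows_required(read, flow_order="TACG"):
    """Flows needed to synthesize `read` under a cyclic flow order (assumed)."""
    pos = flows = 0
    while pos < len(read):
        nucleotide = flow_order[flows % len(flow_order)]
        flows += 1
        # One flow incorporates the whole homopolymer run at the current position.
        while pos < len(read) and read[pos] == nucleotide:
            pos += 1
    return flows


def panel_flows(reads, flow_order="TACG"):
    """Flows needed to cover every read of a panel: the maximum over its reads."""
    return max(flows_required(read, flow_order) for read in reads)


def simulate_flows(length, n_reads=2000, seed=0, flow_order="TACG"):
    """Monte Carlo sample of flow counts for random reads of a fixed length.

    Assumes independent, uniformly distributed bases (a simplification).
    """
    rng = random.Random(seed)
    return [
        flows_required("".join(rng.choices("ACGT", k=length)), flow_order)
        for _ in range(n_reads)
    ]


def empirical_quantile(values, q):
    """Smallest sampled value that covers at least a fraction q of the sample."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


if __name__ == "__main__":
    counts = simulate_flows(length=100)
    # Under these assumptions each base after the first adds 0, 1, 2 or 3 flows
    # with equal probability, so the mean should be close to 1.5 * length + 1.
    print("mean:", mean(counts), "variance:", variance(counts))
    print("flows covering 99% of reads:", empirical_quantile(counts, 0.99))
    print("panel:", panel_flows(["TACG", "GCAT"]))
```

A simulation like this can only approximate the distribution; its use here is to make the quantities named in the paragraph (distribution, expected value, variance, and an upper bound for a concrete panel) tangible.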

AVAILABILITY AND IMPLEMENTATION: The algorithm for calculating the number of flows was implemented in R and is available as the package ionflows from the CRAN repository.

CONTACT: jan.budczies@charite.de

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Bibliographical data

Original language: English
ISSN: 1367-4803
Publication status: Published - 15.04.2015
Externally published: Yes
PubMed: 25480372