Data Analysis and BioInformatics in real-time qPCR (5)



Latest papers:

MAKERGAUL - an innovative MAK2-based model and software for real-time PCR quantification
Bultmann CA and Weiskirchen R
Clin Biochem. 2014 47(1-2): 117-122

OBJECTIVES: Gene expression analysis by quantitative PCR is a standard laboratory technique for RNA quantification with high accuracy. In particular, real-time PCR techniques using SYBR Green and melting curve analysis, which allow verification of specific product amplification, have become a well-accepted laboratory technique for rapid and high-throughput gene expression quantification. However, the software used for quantification is somewhat cumbersome and requires above-average manual operation.
DESIGN AND METHODS: We developed a novel, simple-to-handle open source software package (i.e., MAKERGAUL) for quantification of gene expression data obtained by real-time PCR technology.
RESULTS: The developed software was evaluated with an already well-characterized real-time PCR data set, and the performance parameters (i.e., absolute bias, linearity, reproducibility, and resolution) of the algorithm that is the basis of our calculation procedure were compared and ranked with those of other implemented and well-established algorithms. It shows good quantification performance with reduced requirements in computing power.
CONCLUSIONS: We conclude that MAKERGAUL is convenient, easy-to-handle software that allows accurate and fast expression data analysis.


  • The open source software MAKERGAUL allows easy gene expression quantification.
  • The program is available in different programming and server-side scripting languages.
  • Quantification does not require standard curves or normalization.
  • MAKERGAUL has good precision, linearity, bias, resolution, and variability.
  • MAKERGAUL shows good quantification performance and needs little computing power.

Comparing real-time quantitative polymerase chain reaction analysis methods for precision, linearity, and accuracy of estimating amplification efficiency
Tellinghuisen J, Spiess AN
Anal Biochem. 2014 Mar 15;449:76-82

New methods are used to compare seven qPCR analysis methods for their performance in estimating the quantification cycle (Cq) and amplification efficiency (E) for a large test data set (94 samples for each of 4 dilutions) from a recent study. Precision and linearity are assessed using chi-square (χ²), which is the minimized quantity in least-squares (LS) fitting, equivalent to the variance in unweighted LS, and commonly used to define statistical efficiency. All methods yield Cqs that vary strongly in precision with the starting concentration N0, requiring weighted LS for proper calibration fitting of Cq vs log(N0). Then χ² for cubic calibration fits compares the inherent precision of the Cqs, while increases in χ² for quadratic and linear fits show the significance of nonlinearity. Nonlinearity is further manifested in unphysical estimates of E from the same Cq data, results which also challenge a tenet of all qPCR analysis methods - that E is constant throughout the baseline region. Constant-threshold (Ct) methods underperform the other methods when the data vary considerably in scale, as these data do.
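
The weighted calibration fit mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: it fits only a straight line (the paper also uses quadratic and cubic fits), and the data values, the per-point standard deviations, and the function name weighted_linear_fit are hypothetical. It assumes the usual relation that the slope of Cq vs log10(N0) equals -1/log10(E).

```python
import math

def weighted_linear_fit(x, y, sigma):
    """Weighted least-squares line y = intercept + slope * x.

    Each point is weighted by 1 / sigma**2. Returns slope, intercept and
    the minimized chi-square (sum of weighted squared residuals).
    """
    weights = [1.0 / s ** 2 for s in sigma]
    total = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    chi2 = sum(w * (yi - (intercept + slope * xi)) ** 2
               for w, xi, yi in zip(weights, x, y))
    return slope, intercept, chi2

# Hypothetical, noise-free calibration data generated with E = 2.
log_n0 = [5.0, 4.0, 3.0, 2.0]                 # log10 of starting quantity
true_slope = -1.0 / math.log10(2.0)
cq = [40.0 + true_slope * v for v in log_n0]
# Assumed Cq standard deviations: precision worsens at low N0.
cq_sigma = [0.05, 0.07, 0.10, 0.20]

slope, intercept, chi2 = weighted_linear_fit(log_n0, cq, cq_sigma)
efficiency = 10.0 ** (-1.0 / slope)           # amplification factor per cycle
print(slope, intercept, chi2, efficiency)
```

With real data, an increase in chi-square when moving from a higher-order fit to this linear one would indicate nonlinearity in the calibration, which is the comparison the abstract describes.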

A new method for quantitative real-time polymerase chain reaction data analysis
Rao X, Lai D, Huang X.
J Comput Biol. 2013 20(9): 703-711

Quantitative real-time polymerase chain reaction (qPCR) is a sensitive gene quantification method that has been extensively used in biological and biomedical fields. The currently used methods for PCR data analysis, including the threshold cycle method and linear and nonlinear model-fitting methods, all require subtracting background fluorescence. However, the removal of background fluorescence can hardly be accurate and therefore can distort results. We propose a new method, the taking-difference linear regression method, to overcome this limitation. Briefly, for each two consecutive PCR cycles, we subtract the fluorescence in the former cycle from that in the latter cycle, transforming the n cycle raw data into n-1 cycle data. Then, linear regression is applied to the natural logarithm of the transformed data. Finally, PCR amplification efficiencies and the initial DNA molecular numbers are calculated for each reaction. This taking-difference method avoids the error in subtracting an unknown background, and thus it is more accurate and reliable. This method is easy to perform, and this strategy can be extended to all current methods for PCR data analysis.
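
The steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a purely exponential model F_n = F0 * E^n + B with a constant unknown background B, and the function name and synthetic data are hypothetical. Under that model the difference of consecutive cycles is F0 * (E - 1) * E^n, so the background cancels and the logarithm of the differences is linear in the cycle number.

```python
import math

def taking_difference_fit(fluorescence, first_cycle=1):
    """Estimate efficiency E and initial signal F0 from raw readings.

    fluorescence[i] is the raw reading (background included) at cycle
    first_cycle + i; only exponential-phase cycles should be passed in.
    """
    # n readings become n-1 differences; a constant background cancels.
    diffs = [later - former
             for former, later in zip(fluorescence, fluorescence[1:])]
    if any(d <= 0 for d in diffs):
        raise ValueError("differences must be positive to take logarithms")
    cycles = [first_cycle + i for i in range(len(diffs))]
    logs = [math.log(d) for d in diffs]

    # Ordinary least squares of ln(difference) against cycle number.
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    efficiency = math.exp(slope)                  # slope = ln(E)
    f0 = math.exp(intercept) / (efficiency - 1)   # intercept = ln(F0 * (E - 1))
    return efficiency, f0

# Synthetic example: F0 = 0.001, E = 1.9, unknown background B = 5.0.
raw = [0.001 * 1.9 ** n + 5.0 for n in range(1, 16)]
est_efficiency, est_f0 = taking_difference_fit(raw, first_cycle=1)
print(est_efficiency, est_f0)
```

On noise-free synthetic data the fit recovers E and F0 without ever estimating the background; with measured data the choice of cycles in the exponential phase matters and is left to the caller here.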

The choice of reference gene affects statistical efficiency in quantitative PCR data analysis
Guo Y, Pennell ML, Pearl DK, Knobloch TJ, Fernandez S, Weghorst CM.
Biotechniques. 2013 55(4): 207-209

Quantitative polymerase chain reaction (qPCR), a highly sensitive method of measuring gene expression, is widely used in biomedical research. To produce reliable results, it is essential to use stably expressed reference genes (RGs) for data normalization so that sample-to-sample variation can be controlled. In this study, we examine the effect of different RGs on statistical efficiency by analyzing a qPCR data set that contains 12 target genes and 3 RGs. Our results show that choosing the most stably expressed RG for data normalization does not guarantee reduced variance or improved statistical efficiency. We also provide a formula for determining when data normalization will improve statistical efficiency and hence increase the power of statistical tests in data analysis.
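
Why a stable reference gene does not guarantee reduced variance can be seen from a general variance identity. The sketch below is not the formula given in the paper: it only uses Var(target - ref) = Var(target) + Var(ref) - 2 Cov(target, ref), so normalization lowers the variance only when the covariance exceeds half the reference variance. The function name and the Cq values are hypothetical.

```python
def normalization_effect(target_cq, ref_cq):
    """Compare the sample variance of target Cq with that of target - reference."""
    n = len(target_cq)
    mean_t = sum(target_cq) / n
    mean_r = sum(ref_cq) / n
    var_t = sum((t - mean_t) ** 2 for t in target_cq) / (n - 1)
    var_r = sum((r - mean_r) ** 2 for r in ref_cq) / (n - 1)
    cov = sum((t - mean_t) * (r - mean_r)
              for t, r in zip(target_cq, ref_cq)) / (n - 1)
    var_delta = var_t + var_r - 2 * cov   # variance of the normalized values
    improves = var_delta < var_t          # same as cov > var_r / 2
    return var_t, var_delta, improves

# Hypothetical Cq values for one target gene and two candidate reference genes.
target = [24.0, 25.0, 26.0, 27.0, 28.0]
ref_tracking = [18.0, 19.0, 20.0, 21.0, 22.0]    # tracks sample-to-sample variation
ref_unrelated = [20.0, 18.0, 22.0, 18.0, 22.0]   # varies mostly independently

print(normalization_effect(target, ref_tracking))
print(normalization_effect(target, ref_unrelated))
```

In this toy data the first reference removes the shared sample-to-sample variation, while the second adds more variance than it removes.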

Eprobe mediated real-time PCR monitoring and melting curve analysis
Hanami T, Delobel D, Kanamori H, Tanaka Y, Kimura Y, Nakasone A, Soma T, Hayashizaki Y, Usui K, Harbers M.
PLoS One. 2013 Aug 7;8(8):e70942

Real-time monitoring of PCR is one of the most important methods for DNA and RNA detection widely used in research and medical diagnostics. Here we describe a new approach for combined real-time PCR monitoring and melting curve analysis using a 3' end-blocked Exciton-Controlled Hybridization-sensitive fluorescent Oligonucleotide (ECHO) called Eprobe. Eprobes contain two dye moieties attached to the same nucleotide, and their fluorescent signal is strongly suppressed as single-stranded oligonucleotides by an excitonic interaction between the dyes. Upon hybridization to a complementary DNA strand, the dyes are separated and intercalate into the double strand, leading to strong fluorescence signals. Intercalation of dyes can further stabilize the DNA/DNA hybrid and increase the melting temperature compared to standard DNA oligonucleotides. Eprobes allow for specific real-time monitoring of amplification reactions by hybridizing to the amplicon in a sequence-dependent manner. Similarly, Eprobes allow for analysis of reaction products by melting curve analysis. The function of different Eprobes was studied using the L858R mutation in the human epidermal growth factor receptor (EGFR) gene, and multiplex detection was demonstrated for the human EGFR and KRAS genes using Eprobes with two different dyes. Combining amplification and melting curve analysis in a single-tube reaction provides a powerful means for new mutation detection assays. Functioning as "sequence-specific dyes", Eprobes hold great promise for future applications not only in PCR but also as hybridization probes in other applications.

BootstRatio: A web-based statistical analysis of fold-change in qPCR and RT-qPCR data using resampling methods
Clèries R, Galvez J, Espino M, Ribes J, Nunes V, de Heredia ML.
Comput Biol Med. 2012 42(4): 438-445

Real-time quantitative polymerase chain reaction (qPCR) is widely used in the biomedical sciences, with results quantified as the relative expression (RE) of a target gene versus a reference one. Obtaining significance levels for RE by assuming an underlying probability distribution of the data may be difficult. We have developed the web-based application BootstRatio, which tackles the statistical significance of the RE and the probability that RE>1 through resampling methods without any assumption on the underlying probability distribution for the data analyzed. BootstRatio performs these statistical analyses of gene expression ratios in two settings: (1) when data have been already normalized against a control sample and (2) when the data control samples are provided. Since the estimation of the probability that RE>1 is an important feature for this type of analysis, as it is used to assign statistical significance and it can be also computed under the Bayesian framework, a simulation study has been carried out comparing the performance of BootstRatio versus a Bayesian approach in the estimation of that probability. In addition, two analyses, one for each setting, carried out with data from real experiments are presented showing the performance of BootstRatio. Our simulation study suggests that the BootstRatio approach performs better than the Bayesian one except in certain situations of very small sample size (N≤12). The web application BootstRatio is accessible through and developed for the purpose of these intensive computation statistical analyses.
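
The resampling idea in this abstract can be sketched with a plain nonparametric bootstrap. This is not BootstRatio's actual algorithm: it assumes setting (2), with per-sample relative quantities for a treated and a control group, and the function name, data values, and number of resamples are hypothetical.

```python
import random

def bootstrap_ratio(treated, control, n_boot=5000, seed=0):
    """Bootstrap the ratio of group means and estimate P(RE > 1).

    Returns the fraction of resampled ratios above 1 and a 95 percent
    percentile interval for the ratio.
    """
    rng = random.Random(seed)   # fixed seed keeps the sketch reproducible
    ratios = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes.
        t = [rng.choice(treated) for _ in treated]
        c = [rng.choice(control) for _ in control]
        ratios.append((sum(t) / len(t)) / (sum(c) / len(c)))
    ratios.sort()
    prob_gt_1 = sum(r > 1 for r in ratios) / n_boot
    lower = ratios[int(0.025 * n_boot)]
    upper = ratios[int(0.975 * n_boot) - 1]
    return prob_gt_1, (lower, upper)

# Hypothetical relative quantities (for example 2**-dCq) per sample.
treated_values = [2.1, 2.6, 1.9, 2.4, 2.2, 2.8]
control_values = [1.0, 1.1, 0.9, 1.2, 0.8, 1.0]

prob, interval = bootstrap_ratio(treated_values, control_values)
print(prob, interval)
```

No distribution is assumed for the data; the only inputs are the observed values themselves, which is the property the abstract emphasizes.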

RT-qPCR work-flow for single-cell data analysis
Anders Ståhlberg, Vendula Rusnakova, Amin Forootan, Miroslava Anderova, Mikael Kubista
Methods. 2013 59(1): 80-88

Individual cells represent the basic unit in tissues and organisms and are in many respects unique in their properties. The introduction of new and sensitive techniques to study single cells opens up new avenues to understand fundamental biological processes. Well-established statistical tools and recommendations exist for gene expression data based on traditional cell population measurements. However, these workflows are not suitable, and some steps are even inappropriate, to apply to single-cell data. Here, we present a simple and practical workflow for preprocessing of single-cell data generated by reverse transcription quantitative real-time PCR. The approach is demonstrated on a data set based on profiling of 41 genes in 303 single cells. For some pre-processing steps we present options and also recommendations. In particular, we demonstrate and discuss different strategies for handling missing data and scaling data for downstream multivariate analysis. The aim of this workflow is to provide a guide to the rapidly growing community studying single cells by means of reverse transcription quantitative real-time PCR profiling.
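
Two of the preprocessing steps named in this abstract, handling missing data and scaling, can be sketched as below. The specific choices are assumptions for illustration and are not taken from the paper: missing Cq values are replaced by the highest observed Cq for that gene plus a hypothetical offset of 1 cycle, and values are then autoscaled (mean 0, standard deviation 1) per gene.

```python
import math

def impute_missing(cq_values, offset=1.0):
    """Replace missing readings (None) with the highest observed Cq plus an offset."""
    observed = [v for v in cq_values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = max(observed) + offset
    return [fill if v is None else v for v in cq_values]

def autoscale(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        raise ValueError("cannot autoscale a constant series")
    return [(v - mean) / sd for v in values]

# Hypothetical Cq values for one gene across four single cells.
gene_cq = [30.0, None, 32.0, 31.0]
filled = impute_missing(gene_cq)
scaled = autoscale(filled)
print(filled, scaled)
```

Other strategies (for example excluding cells or genes with many missing values) are equally plausible; the sketch only shows where such a decision enters the workflow.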

Evaluation of qPCR curve analysis methods for reliable biomarker discovery -- bias, resolution, precision, and implications
Ruijter JM, Pfaffl MW, Zhao S, Spiess AN, Boggy G, Blom J, Rutledge RG, Sisti D, Lievens A, De Preter K, Derveaux S, Hellemans J, Vandesompele J.
Methods. 2013 59(1): 32-46

RNA transcripts such as mRNA or microRNA are frequently used as biomarkers to determine disease state or response to therapy. Reverse transcription (RT) in combination with quantitative PCR (qPCR) has become the method of choice to quantify small amounts of such RNA molecules. In parallel with the democratization of RT-qPCR and its increasing use in biomedical research or biomarker discovery, we witnessed a growth in the number of gene expression data analysis methods. Most of these methods are based on the principle that the position of the amplification curve with respect to the cycle-axis is a measure for the initial target quantity: the later the curve, the lower the target quantity. However, most methods differ in the mathematical algorithms used to determine this position, as well as in the way the efficiency of the PCR reaction (the fold increase of product per cycle) is determined and applied in the calculations. Moreover, there is dispute about whether the PCR efficiency is constant or continuously decreasing. Together this has led to the development of different methods to analyze amplification curves. In published comparisons of these methods, available algorithms were typically applied in a restricted or outdated way, which does not do them justice. Therefore, we aimed at development of a framework for robust and unbiased assessment of curve analysis performance whereby various publicly available curve analysis methods were thoroughly compared using a previously published large clinical data set (Vermeulen et al., 2009). The original developers of these methods applied their algorithms and are co-authors of this study. We assessed the curve analysis methods' impact on transcriptional biomarker identification in terms of expression level, statistical significance, and patient-classification accuracy.
The concentration series per gene, together with data sets from unpublished technical performance experiments, were analyzed in order to assess the algorithms' precision, bias, and resolution. While large differences exist between methods when considering the technical performance experiments, most methods perform relatively well on the biomarker data. The data and the analysis results per method are made available to serve as benchmark for further development and evaluation of qPCR curve analysis methods.
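
The principle stated in this abstract (the later the curve, the lower the target quantity) can be made concrete with a small sketch. It assumes a constant efficiency E per cycle, which the abstract itself notes is disputed, so that N0 = Nq / E^Cq, where Nq is the quantity at the quantification threshold. The function names and numbers are hypothetical.

```python
def starting_quantity(threshold_quantity, efficiency, cq):
    """Back-calculate N0 from Cq assuming constant efficiency: N0 = Nq / E**Cq."""
    return threshold_quantity / efficiency ** cq

def fold_difference(efficiency, cq_early, cq_late):
    """How many times more target the earlier curve started with."""
    return efficiency ** (cq_late - cq_early)

# A curve that crosses the threshold 3 cycles later started with 8 times
# less target when the product doubles every cycle (E = 2).
n0 = starting_quantity(threshold_quantity=1.0, efficiency=2.0, cq=20)
fold = fold_difference(efficiency=2.0, cq_early=20, cq_late=23)
print(n0, fold)
```

The methods compared in the study differ precisely in how they obtain the curve position and the efficiency that enter such a calculation.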

Transcriptional Biomarkers
Methods Vol 59, Issue 1, pages 1-163, January 2013
Edited by Michael W. Pfaffl