Design Considerations

For RNA Expression Microarray Assays:

Biological Replicates

Biological Replication: genetic material, such as mRNA or DNA, from multiple sample sources is measured independently by applying each sample to a different microarray (in the case of a one-channel array).  In addition to estimating processing and measurement variability, biological replication allows biological differences between samples to be assessed.  Biological replication is required for any microarray study that aims to make inferences about a population based on sample data; this is the case for most microarray studies.  (Allison, D., et al., Nature Reviews Genetics 7, 55-65, Jan 2006).

Technical Replicates

Technical Replication: genetic material, such as mRNA or DNA, from a single sample source is measured independently by applying it to multiple microarrays.  Another level of technical replication is when multiple sample extractions are derived from the same sample source.  Technical replication allows processing and measurement variability to be estimated, but it does not allow inferences about the biological differences between individuals.  In general, technical replication is not necessary unless the study aim is quality control or additional independent sample sources are prohibitively costly or unavailable.  (Allison, D., et al., Nature Reviews Genetics 7, 55-65, Jan 2006).

The number of replicates for a microarray study is related to the study's statistical "power," which is classically defined as the probability of rejecting a null hypothesis that is false.  In other words, for a study to be adequately "powered," a large enough sample size is needed to reject the starting assumption that measurements between sample groups are identical.  In general, the smaller the effect size (i.e., the magnitude of differential expression) and the greater the variability between replicate samples, the larger the sample size requirement.  Effect size and variability are just two of the variables used in a statistical power analysis; the significance threshold and the desired power are others.
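The relationship among effect size, variability, and sample size can be illustrated with a minimal sketch.  It assumes a two-group comparison of a single gene on the log2 scale, a two-sided test, and a normal approximation to the t-test; the function name and the example values are hypothetical, and the sketch is not a substitute for a formal power analysis.  A real microarray study tests many genes at once, so a stricter significance threshold than the 0.05 used here would normally apply.

```python
import math
from statistics import NormalDist


def replicates_per_group(effect_size, sd, alpha=0.05, power=0.80):
    """Approximate biological replicates needed per group.

    effect_size: expected difference in mean log2 expression between groups
    sd:          standard deviation between replicate samples (same units)
    alpha:       two-sided significance threshold (assumed, per gene)
    power:       desired probability of detecting a true difference
    """
    if effect_size <= 0 or sd <= 0:
        raise ValueError("effect_size and sd must be positive")
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the test
    z_power = z.inv_cdf(power)            # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect_size) ** 2
    return math.ceil(n)                   # round up to whole replicates


# Hypothetical values: a two-fold change (log2 difference of 1.0)
# versus a smaller change (0.5), with the same variability.
print(replicates_per_group(effect_size=1.0, sd=0.5))
print(replicates_per_group(effect_size=0.5, sd=0.5))
```

Halving the effect size roughly quadruples the required number of replicates, and a larger standard deviation has the same kind of effect.  Because the normal approximation tends to understate the requirement at very small sample sizes, the result is best read as a lower bound.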

As a starting point, many investigators design a differential expression microarray project to include three biological replicates per condition or experimental group.  Some literature indicates that a minimum of five biological replicates per group should be analyzed.  Depending on the microarray assay, useful preliminary or pilot data can often be generated using a single pair of microarrays (this is a minimum, not an optimum).  Regardless of current practices, there is a consensus that a power analysis for a particular study should be conducted as part of the microarray project design phase and that more replicates generally provide more power.  Support with power analysis is provided by the Oregon Clinical and Translational Resource Institute (OCTRI).

Pooling Samples

Microarray samples are often pooled to reduce experimental costs, to compensate for insufficient sample material, or to reduce sample variation.  However, sample pooling results in the loss of gene expression information for the individual samples, which may be especially important to studies that include human clinical samples or any study that aims to make inferences about a population.  Whenever possible, it is recommended that investigators use non-pooled (individual) samples.  See the Affymetrix and Illumina technotes that address sample pooling.

Paired Samples

Paired samples are non-independent and should be analyzed accordingly.  Some examples of paired samples include tumor and benign tissue from the same individual, tissue from the right and left eyes from the same individual, and samples taken before and after an infection from the same sample source.  Ideally, paired samples are collected from multiple individuals or sample sources, yielding biological replicates of the sample pairs.  If multiple paired samples are taken from the same individual, the replicates are considered technical replicates.
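Analyzing paired samples "accordingly" usually means working with the within-pair differences rather than treating the two groups as independent.  The following minimal sketch computes a paired t statistic for one gene; the function name and the expression values are hypothetical, and it assumes log2-scale measurements with one pair per individual.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for paired measurements.

    first, second: expression values for the same individuals, in the
    same order (e.g., benign and tumor tissue from each individual).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    if len(first) < 2:
        raise ValueError("at least two pairs are required")
    # The analysis is done on within-individual differences, which
    # removes the variation between individuals from the comparison.
    diffs = [b - a for a, b in zip(first, second)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical log2 expression of one gene in four individuals.
benign = [7.1, 8.4, 6.9, 7.8]
tumor = [8.0, 9.5, 7.7, 8.9]
t, df = paired_t_statistic(benign, tumor)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

Each pair here comes from a different individual, so the four pairs are biological replicates.  Repeated pairs from one individual would be technical replicates and would not support inferences about a population.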


See the GPSR-Affymetrix Microarray Core publication in PLoS ONE (2008), "Randomization in Laboratory Procedure Is Key to Obtaining Reproducible Microarray Results."