arrayQualityMetrics--a bioconductor package for quality assessment of microarray data.

Audrey Kauffmann, Robert Gentleman, Wolfgang Huber.

Abstract

SUMMARY: The assessment of data quality is a major concern in microarray analysis. arrayQualityMetrics is a Bioconductor package that provides a report with diagnostic plots for one- or two-colour microarray data. The quality metrics assess reproducibility, identify apparent outlier arrays and compute measures of signal-to-noise ratio. The tool handles most current microarray technologies and is suitable for automated analysis pipelines and automatic report generation, as well as for use by individuals. The diagnosis of quality remains, in principle, a context-dependent judgement, but our tool provides powerful, automated, objective and comprehensive instruments on which to base a decision.

AVAILABILITY: arrayQualityMetrics is a free and open source package, under the LGPL license, available from the Bioconductor project at www.bioconductor.org. A user's guide and examples are provided with the package. Some examples of HTML reports generated by arrayQualityMetrics can be found at http://www.microarray-quality.org.

Year:  2008        PMID: 19106121      PMCID: PMC2639074          DOI: 10.1093/bioinformatics/btn647

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

As microarray data quality can be affected at each step of the microarray experiment (Schuchhardt et al., 2000), quality assessment is an integral part of the analysis. There are freely available tools for quality assessment of a specific microarray type, such as Affymetrix (Parman and Halling, 2005), Illumina (Dunning et al., 2007) and two-colour cDNA arrays (Buness et al., 2005). Other free tools are designed to identify a particular problem, such as spot quality (Li et al., 2005) or hybridization quality (Petri et al., 2004). Some tools perform outlier detection from previously computed quality metrics (Freue et al., 2007), or propose interactive quality plots (Lee et al., 2006). We developed a Bioconductor (Gentleman et al., 2004) package, arrayQualityMetrics, with the aim of providing a comprehensive tool that works on all expression arrays and platforms and produces a self-contained report which can be web-delivered. The Supplementary table compares its functionality and scope with those of other Bioconductor packages concerned with quality assessment or outlier detection.

2 DESCRIPTION

Input: to perform an analysis using the arrayQualityMetrics package, one needs to provide the matrix of microarray intensities and, optionally, information about the samples and the probes in a Bioconductor object of class AffyBatch, ExpressionSet, NChannelSet or BeadLevelList. These classes are widely used and well documented. The arrayQualityMetrics function is called in the same way for all of these classes, and it can be applied to raw array intensities as well as to normalized data. Applied to raw intensities, the quality metrics can help with monitoring experimental procedures and with the choice of normalization procedure; application to the normalized data is more relevant for assessing the utility of the data in downstream analyses.

Individual array quality: the MA-plot allows the evaluation of the dependence between the intensity levels and the distribution of the ratios (Fig. 1a) (Dudoit et al., 2002). For two-colour arrays, a probe's M-value is the log-ratio of the two intensities and the A-value is the mean of their logarithms. In the case of one-colour arrays, the M-value is computed by dividing the intensity by the median intensity of the same probe across all arrays. A false colour representation of each array's spatial distribution of feature intensities (Fig. 1b) helps in identifying spatial effects that may be caused by, for example, gradients in the hybridization chamber, air bubbles or printing problems.

Homogeneity between arrays: to assess the homogeneity between the arrays, boxplots of the log2 intensities and density estimate plots (Fig. 1c) are presented.

Between array comparison: Figure 1d shows a heatmap of between array distances, computed as the mean absolute difference of the M-values for each pair of arrays,

$$d_{xy} = \operatorname{mean}_i \, |M_{xi} - M_{yi}|,$$

where $M_{xi}$ is the M-value of the i-th probe on the x-th array. Consider the decomposition of $M_{xi}$,

$$M_{xi} = z_i + \beta_{xi} + \varepsilon_{xi},$$

where $z_i$ is the probe effect for probe i (the same across all arrays), $\varepsilon_{xi}$ are i.i.d. random variables with mean zero and $\beta_{xi}$ is a sparse matrix representing differential expression effects. Under these assumptions, all values $d_{xy}$ are approximately the same and deviations from this can be used to identify outlier arrays. The dendrogram can serve to check if the experiments cluster in accordance with the sample classes.

Affymetrix specific plots: four Affymetrix-specific metrics are evaluated if the input object is an AffyBatch. The report includes the RNA degradation plot from the affy package (Gautier et al., 2004), the relative log expression (RLE) boxplots and the normalized unscaled standard error (NUSE) boxplots from the affyPLM package (Brettschneider et al., 2007) and the QC stat plot from the simpleaffy package (Wilson and Miller, 2005).

Fig. 1. (a) MA-plot for an Agilent microarray. The M-values are not centered on zero, meaning that there is a dependency between the intensities and the log-ratio. (b) Spatial distribution of the background of the green channel for an Illumina chip. There is an abnormal distribution of high intensities at the top border of the array. (c) Density plot of the log-intensities of an Affymetrix set of arrays (E-GEOD-349 ArrayExpress set). The density of one of the arrays is shifted on the x-axis. (d) Heatmap of the ArrayExpress Affymetrix data set E-GEOD-1571. Array 18 is an outlier.

Scores: to guide the interpretation of the report, we have included the computation of numeric scores associated with the plots. Outliers are detected on the MA-plot, spatial distributions of the features’ intensities, boxplot, heatmap, RLE and NUSE. The mean of the absolute value of M is computed for each array, and arrays whose value lies beyond the extremes of the boxplot's whiskers are considered possible outlier arrays. The same approach, i.e. using the whiskers of the boxplot, is applied to the following: the mean and interquartile range (IQR) from the boxplots and NUSE, the sums of the rows of the distance matrix, and the relative amplitude of low versus high frequency components of the Fourier transformation. In the case of the RLE plot, any array with a median RLE higher than 0.1 is considered an outlier.

Report: the metrics are rendered as figures with legends in a detailed report and the scores are used to provide a summary table. Examples of reports are provided at http://www.microarray-quality.org/quality_metrics.html.
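
As an illustration only, the two-colour M- and A-values, the between-array distance (mean absolute difference of M-values over probes) and the whisker-based outlier rule applied to the row sums of the distance matrix can be sketched in plain Python. This is not the package's R code. The function names, the toy M-value matrix, the 1.5 x IQR whisker length and the quartile method are all assumptions made for this example.

```python
import math
from statistics import mean, quantiles


def ma_two_colour(red, green):
    """Per-probe M (log2 ratio) and A (mean of the log2 intensities)."""
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [(math.log2(r) + math.log2(g)) / 2 for r, g in zip(red, green)]
    return m, a


def between_array_distances(M):
    """M[x][i] is the M-value of probe i on array x.

    Returns the matrix d with d[x][y] = mean over probes of |M[x][i] - M[y][i]|.
    """
    n = len(M)
    return [
        [mean(abs(p - q) for p, q in zip(M[x], M[y])) for y in range(n)]
        for x in range(n)
    ]


def whisker_outliers(scores, k=1.5):
    """Indices of scores lying beyond the boxplot whiskers.

    Assumption: whiskers reach at most k * IQR beyond the quartiles, with
    k = 1.5 and linearly interpolated quartiles. The package may compute
    its whiskers differently.
    """
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [idx for idx, s in enumerate(scores) if s < low or s > high]


# Toy M-values (hypothetical): six arrays, five probes. Arrays 0-4 share the
# same probe effects plus small noise; array 5 deviates on every probe.
M = [
    [0.1, 1.0, -0.4, 2.0, 0.2],
    [0.0, 1.1, -0.5, 1.9, 0.3],
    [-0.1, 0.9, -0.6, 2.1, 0.4],
    [0.0, 1.0, -0.5, 2.0, 0.3],
    [0.1, 0.9, -0.5, 2.1, 0.3],
    [1.5, -0.5, 1.0, 0.5, 1.8],
]

d = between_array_distances(M)
row_sums = [sum(row) for row in d]   # one score per array
flagged = whisker_outliers(row_sums)
print(flagged)                       # -> [5]
```

In this toy data the distances among arrays 0 to 4 are all close to 0.1, while every distance involving array 5 is about 1.5, so its row sum falls far beyond the upper whisker and it is the only array flagged.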

3 CONCLUSION

arrayQualityMetrics supports the quality assessment of many types of microarrays in R. After preparation of the data, a single command is used to create the report. The main benefits of arrayQualityMetrics are its simplicity of use, the ability to produce the same report for different types of platforms, and the opportunity for users or developers to extend it for their needs. This tool can be used for individual data analyses or in routine data production pipelines, to provide fast, uniform reporting.

REFERENCES

1.  Normalization strategies for cDNA microarrays.

Authors:  J Schuchhardt; D Beule; A Malik; E Wolski; H Eickhoff; H Lehrach; H Herzel
Journal:  Nucleic Acids Res       Date:  2000-05-15       Impact factor: 16.971

2.  affy--analysis of Affymetrix GeneChip data at the probe level.

Authors:  Laurent Gautier; Leslie Cope; Benjamin M Bolstad; Rafael A Irizarry
Journal:  Bioinformatics       Date:  2004-02-12       Impact factor: 6.937

3.  arrayMagic: two-colour cDNA microarray quality control and preprocessing.

Authors:  Andreas Buness; Wolfgang Huber; Klaus Steiner; Holger Sültmann; Annemarie Poustka
Journal:  Bioinformatics       Date:  2004-09-28       Impact factor: 6.937

4.  Donuts, scratches and blanks: robust model-based segmentation of microarray images.

Authors:  Qunhua Li; Chris Fraley; Roger E Bumgarner; Ka Yee Yeung; Adrian E Raftery
Journal:  Bioinformatics       Date:  2005-04-21       Impact factor: 6.937

5.  Simpleaffy: a BioConductor package for Affymetrix Quality Control and data analysis.

Authors:  Claire L Wilson; Crispin J Miller
Journal:  Bioinformatics       Date:  2005-08-02       Impact factor: 6.937

6.  arrayQCplot: software for checking the quality of microarray data.

Authors:  Eun-Kyung Lee; Sung-Gon Yi; Taesung Park
Journal:  Bioinformatics       Date:  2006-07-24       Impact factor: 6.937

7.  beadarray: R classes and methods for Illumina bead-based data.

Authors:  Mark J Dunning; Mike L Smith; Matthew E Ritchie; Simon Tavaré
Journal:  Bioinformatics       Date:  2007-06-22       Impact factor: 6.937

8.  MDQC: a new quality assessment method for microarrays based on quality control reports.

Authors:  Gabriela V Cohen Freue; Zsuzsanna Hollander; Enqing Shen; Ruben H Zamar; Robert Balshaw; Andreas Scherer; Bruce McManus; Paul Keown; W Robert McMaster; Raymond T Ng
Journal:  Bioinformatics       Date:  2007-10-12       Impact factor: 6.937

9.  Bioconductor: open software development for computational biology and bioinformatics.

Authors:  Robert C Gentleman; Vincent J Carey; Douglas M Bates; Ben Bolstad; Marcel Dettling; Sandrine Dudoit; Byron Ellis; Laurent Gautier; Yongchao Ge; Jeff Gentry; Kurt Hornik; Torsten Hothorn; Wolfgang Huber; Stefano Iacus; Rafael Irizarry; Friedrich Leisch; Cheng Li; Martin Maechler; Anthony J Rossini; Gunther Sawitzki; Colin Smith; Gordon Smyth; Luke Tierney; Jean Y H Yang; Jianhua Zhang
Journal:  Genome Biol       Date:  2004-09-15       Impact factor: 13.583

10.  Array-A-Lizer: a serial DNA microarray quality analyzer.

Authors:  Andreas Petri; Jan Fleckner; Mads Wichmann Matthiessen
Journal:  BMC Bioinformatics       Date:  2004-02-09       Impact factor: 3.169
