Ricardo A Verdugo, Juan F Medrano. Department of Animal Science, University of California Davis, CA 95616-8521, USA. raverdugo@ucdavis.edu
Abstract
BACKGROUND: The increasing use of DNA microarrays for genetical genomics studies generates a need for platforms with complete coverage of the genome. We compared the effective gene coverage of the mouse genome across commercial and noncommercial oligonucleotide microarray platforms by performing an in-house gene annotation of probes. We used only the probe information available from vendors and followed a process that any researcher could take to find the gene targeted by a given probe. To make consistent comparisons between platforms, probes in each microarray were annotated with an Entrez Gene ID, and the chromosomal position of each gene was obtained from the UCSC Genome Browser Database. Gene coverage was estimated as the percentage of Entrez Genes with a unique position in the UCSC Genome database that are tested by a given microarray platform.
RESULTS: A MySQL relational database was created to store the mapping information for 25,416 mouse genes and for the probes in five microarray platforms (gene coverage level in parentheses): Affymetrix 430 2.0 (75.6%), ABI Genome Survey (81.24%), Agilent (79.33%), Codelink (78.09%), Sentrix (90.47%); and four array-ready oligosets: Sigma (47.95%), Operon v.3 (69.89%), Operon v.4 (84.03%), and MEEBO (84.03%). The differences in coverage between platforms were highly conserved across chromosomes. Differences in the number of redundant and nonspecific probes were also found among arrays. The database can be queried through a web interface to compare specific genomic regions. The software used to create, update and query the database is freely available as a toolbox named ArrayGene.
CONCLUSION: The software developed here allows researchers to create updated custom databases using public or proprietary information on genes for any organism. ArrayGene allows easy comparisons of gene coverage between microarray platforms for any region of the genome. The comparison presented here reveals that the commercial microarray Sentrix, which is based on the MEEBO public oligoset, showed the best mouse genome coverage currently available. We also suggest the creation of guidelines to standardize the minimum set of information that vendors should provide so that researchers can accurately evaluate the advantages and disadvantages of using a given platform.
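The coverage measure defined in the abstract (the percentage of reference genes that are tested by at least one probe on a platform) can be illustrated with a minimal Python sketch. This is not the ArrayGene implementation: the function name `gene_coverage`, the probe names, and the gene identifiers are hypothetical, and the reference set simply stands in for the Entrez Genes with a unique UCSC position.

```python
def gene_coverage(reference_genes, probe_to_gene):
    """Return the percentage of reference genes hit by at least one probe.

    reference_genes: iterable of gene IDs (stand-in for Entrez Genes
        with a unique genomic position).
    probe_to_gene: dict mapping a probe ID to its annotated gene ID,
        or None when the probe could not be annotated.
    """
    reference = set(reference_genes)
    if not reference:
        raise ValueError("reference gene set is empty")
    # Redundant probes (several probes for one gene) collapse to one gene.
    targeted = {gene for gene in probe_to_gene.values() if gene is not None}
    # Genes outside the reference set do not count toward coverage.
    covered = targeted & reference
    return 100.0 * len(covered) / len(reference)


# Hypothetical toy data, not taken from the study.
reference_genes = {"g1", "g2", "g3", "g4"}
probe_to_gene = {
    "probe_a": "g1",
    "probe_b": "g1",   # redundant probe for g1
    "probe_c": "g3",
    "probe_d": None,   # probe without a gene annotation
    "probe_e": "g9",   # gene not in the reference set
}

coverage = gene_coverage(reference_genes, probe_to_gene)
print(f"{coverage:.2f}%")
```

Working with sets means that redundant probes do not inflate the estimate, and unannotated probes or probes for genes outside the reference set are ignored. Restricting `reference_genes` to a single chromosome or interval would give a per-region comparison under the same assumptions.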