BACKGROUND: Complete genome annotation is a necessary tool as Anopheles gambiae researchers probe the biology of this potent malaria vector. RESULTS: We reannotate the A. gambiae genome by synthesizing comparative and ab initio sets of predicted coding sequences (CDSs) into a single set using an exon-gene-union algorithm followed by an open-reading-frame-selection algorithm. The reannotation predicts 20,970 CDSs supported by at least two lines of evidence, and it lowers the proportion of CDSs lacking start and/or stop codons to approximately 4%. The reannotated CDS set includes 4,681 novel CDSs that are not represented in the Ensembl annotation but have EST support, and another 4,031 Ensembl-supported genes that undergo major structural and, therefore, probably functional changes in the reannotated set. The quality and accuracy of the reannotation were assessed by comparison with end sequences from 20,249 full-length cDNA clones, and evaluation of mass spectrometry peptide hit rates from an A. gambiae shotgun proteomic dataset confirms that the reannotated CDSs offer a high-quality protein database for proteomics. We provide a functional proteomics annotation, ReAnoXcel, obtained by analysis of the new CDSs through the AnoXcel pipeline, which allows functional comparisons of the CDS sets within the same bioinformatic platform. CDS data are available for download. CONCLUSION: Comprehensive A. gambiae genome reannotation is achieved through a combination of comparative and ab initio gene prediction algorithms.
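As a rough illustration of the two ideas named in the abstract, the sketch below merges overlapping exon predictions from several gene finders into union exons and checks whether a CDS has both a start and a stop codon. It is not the authors' implementation: the function names `union_exons` and `is_complete_cds`, the tuple layout `(chrom, strand, start, end)`, and the use of the standard stop codons are all assumptions made for this example.

```python
from collections import defaultdict

# Standard nuclear stop codons (an assumption for this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def union_exons(predictions):
    """Merge overlapping exons predicted on the same chromosome and strand.

    predictions: iterable of (chrom, strand, start, end) tuples with
    1-based inclusive coordinates (hypothetical layout).
    Returns a dict mapping (chrom, strand) to a sorted list of merged
    (start, end) intervals.
    """
    by_locus = defaultdict(list)
    for chrom, strand, start, end in predictions:
        by_locus[(chrom, strand)].append((start, end))

    merged = {}
    for key, exons in by_locus.items():
        exons.sort()
        out = [list(exons[0])]
        for start, end in exons[1:]:
            if start <= out[-1][1]:
                # Overlaps the current union exon: extend it if needed.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[key] = [tuple(e) for e in out]
    return merged


def is_complete_cds(seq):
    """Return True if seq is a whole number of codons, starts with ATG,
    and ends with a stop codon."""
    seq = seq.upper()
    return (
        len(seq) >= 6
        and len(seq) % 3 == 0
        and seq.startswith("ATG")
        and seq[-3:] in STOP_CODONS
    )


def fraction_incomplete(cds_sequences):
    """Fraction of CDSs lacking a start and/or stop codon."""
    seqs = list(cds_sequences)
    if not seqs:
        return 0.0
    return sum(not is_complete_cds(s) for s in seqs) / len(seqs)
```

Exons from two predictors that overlap on the same strand collapse into one union exon, while predictions on the opposite strand stay separate; `fraction_incomplete` gives the kind of summary statistic the abstract reports for CDSs lacking start and/or stop codons.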