Proteogenomic database construction driven from large scale RNA-seq data.

Sunghee Woo, Seong Won Cha, Gennifer Merrihew, Yupeng He, Natalie Castellana, Clark Guest, Michael MacCoss, Vineet Bafna.

Abstract

The advent of inexpensive RNA-seq technologies and other deep sequencing technologies for RNA has the promise to radically improve genomic annotation, providing information on transcribed regions and splicing events in a variety of cellular conditions. Using MS-based proteogenomics, many of these events can be confirmed directly at the protein level. However, the integration of large amounts of redundant RNA-seq data and mass spectrometry data poses a challenging problem. Our paper addresses this by construction of a compact database that contains all useful information expressed in RNA-seq reads. Applying our method to cumulative C. elegans data reduced 496.2 GB of aligned RNA-seq SAM files to 410 MB of splice graph database written in FASTA format. This corresponds to 1000× compression of data size, without loss of sensitivity. We performed a proteogenomics study using the custom data set, using a completely automated pipeline, and identified a total of 4044 novel events, including 215 novel genes, 808 novel exons, 12 alternative splicings, 618 gene-boundary corrections, 245 exon-boundary changes, 938 frame shifts, 1166 reverse strands, and 42 translated UTRs. Our results highlight the usefulness of transcript + proteomic integration for improved genome annotations.
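The figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below recomputes the compression ratio and the total number of novel events from the numbers given above. The variable names are illustrative, and the decimal unit convention (1 GB = 1000 MB) is an assumption, since the abstract does not state which convention it uses.

```python
# Figures quoted in the abstract.
sam_size_gb = 496.2    # aligned RNA-seq SAM files
fasta_size_mb = 410.0  # splice graph database in FASTA format

# Assumption: decimal units (1 GB = 1000 MB); the abstract does not specify.
MB_PER_GB = 1000
compression_ratio = sam_size_gb * MB_PER_GB / fasta_size_mb

# Novel event counts by category, as listed in the abstract.
novel_events = {
    "novel genes": 215,
    "novel exons": 808,
    "alternative splicings": 12,
    "gene-boundary corrections": 618,
    "exon-boundary changes": 245,
    "frame shifts": 938,
    "reverse strands": 1166,
    "translated UTRs": 42,
}
total_events = sum(novel_events.values())

print(f"compression ratio: about {compression_ratio:.0f}x")
print(f"total novel events: {total_events}")
```

Under this assumption the ratio comes out slightly above 1200, consistent with the order-of-magnitude figure of 1000× in the abstract, and the eight categories sum to the reported total of 4044.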

Year: 2013 PMID: 23802565 PMCID: PMC4034692 DOI: 10.1021/pr400294c

Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466
