ProteomeGenerator: A Framework for Comprehensive Proteomics Based on de Novo Transcriptome Assembly and High-Accuracy Peptide Mass Spectral Matching.

Paolo Cifani, Avantika Dhabaria, Zining Chen, Akihide Yoshimi, Emily Kawaler, Omar Abdel-Wahab, John T Poirier, Alex Kentsis.

Abstract

Modern mass spectrometry now permits genome-scale and quantitative measurements of biological proteomes. However, analysis of specific specimens is currently hindered by the incomplete representation of biological variability of protein sequences in canonical reference proteomes and the technical demands for their construction. Here, we report ProteomeGenerator, a framework for de novo and reference-assisted proteogenomic database construction and analysis based on sample-specific transcriptome sequencing and high-accuracy mass spectrometry proteomics. This enables the assembly of proteomes encoded by actively transcribed genes, including sample-specific protein isoforms resulting from non-canonical mRNA transcription, splicing, or editing. To improve the accuracy of protein isoform identification in non-canonical proteomes, ProteomeGenerator relies on statistical target-decoy database matching calibrated using sample-specific controls. Its current implementation includes automatic integration with MaxQuant mass spectrometry proteomics algorithms. We applied this method for the proteogenomic analysis of splicing factor SRSF2 mutant leukemia cells, demonstrating high-confidence identification of non-canonical protein isoforms arising from alternative transcriptional start sites, intron retention, and cryptic exon splicing as well as improved accuracy of genome-scale proteome discovery. Additionally, we report proteogenomic performance metrics for current state-of-the-art implementations of SEQUEST HT, MaxQuant, Byonic, and PEAKS mass spectral analysis algorithms. Finally, ProteomeGenerator is implemented as a Snakemake workflow within a Singularity container for one-step installation in diverse computing environments, thereby enabling open, scalable, and facile discovery of sample-specific, non-canonical, and neomorphic biological proteomes.
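As an illustration of the statistical target-decoy matching mentioned in the abstract, the following is a minimal sketch of how a score cutoff can be chosen from a mixed list of target and decoy peptide-spectrum matches. It is not ProteomeGenerator's or MaxQuant's implementation: the function name `target_decoy_threshold`, the `(score, is_decoy)` input format, and the simple decoys/targets FDR estimate are all assumptions made for this example, and the sample-specific calibration described in the abstract is not modeled.

```python
from typing import List, Optional, Tuple


def target_decoy_threshold(
    psms: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms    -- hypothetical input: (score, is_decoy) pairs, higher score = better match
    max_fdr -- maximum tolerated false discovery rate

    Assumption: FDR at a cutoff is estimated as (#decoys / #targets)
    among all matches scoring at or above that cutoff.
    """
    # Walk the matches from best to worst score.
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    threshold = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest point in the ranking that still meets the bound.
        if targets > 0 and decoys / targets <= max_fdr:
            threshold = score
    return threshold


if __name__ == "__main__":
    example = [(10.0, False), (9.0, False), (8.0, False),
               (7.0, True), (6.0, False), (5.0, True)]
    print(target_decoy_threshold(example, max_fdr=0.25))
```

Matches scoring at or above the returned cutoff would be accepted; a return value of `None` means no cutoff satisfies the requested FDR. Tied scores are not treated specially in this sketch.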

Keywords: de novo database construction; peptide fractionation; peptide-spectral matching; protein isoform analysis; proteogenomics; scoring function; transcriptomics

Year:  2018        PMID: 30295032      PMCID: PMC6727203          DOI: 10.1021/acs.jproteome.8b00295

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466



1.  Discovery of Protein Modifications Using Differential Tandem Mass Spectrometry Proteomics.

Authors:  Paolo Cifani; Zhi Li; Danmeng Luo; Mark Grivainis; Andrew M Intlekofer; David Fenyö; Alex Kentsis
Journal:  J Proteome Res       Date:  2021-03-22       Impact factor: 4.466

Review 2.  Proteotranscriptomics - A facilitator in omics research.

Authors:  Michal Levin; Falk Butter
Journal:  Comput Struct Biotechnol J       Date:  2022-07-09       Impact factor: 6.155

Review 3.  Identification and Quantification of Proteoforms by Mass Spectrometry.

Authors:  Leah V Schaffer; Robert J Millikin; Rachel M Miller; Lissa C Anderson; Ryan T Fellers; Ying Ge; Neil L Kelleher; Richard D LeDuc; Xiaowen Liu; Samuel H Payne; Liangliang Sun; Paul M Thomas; Trisha Tucholski; Zhe Wang; Si Wu; Zhijie Wu; Dahang Yu; Michael R Shortreed; Lloyd M Smith
Journal:  Proteomics       Date:  2019-05       Impact factor: 3.984

4.  Spritz: A Proteogenomic Database Engine.

Authors:  Anthony J Cesnik; Rachel M Miller; Khairina Ibrahim; Lei Lu; Robert J Millikin; Michael R Shortreed; Brian L Frey; Lloyd M Smith
Journal:  J Proteome Res       Date:  2020-10-07       Impact factor: 4.466

Review 5.  Bioinformatics Methods for Mass Spectrometry-Based Proteomics Data Analysis.

Authors:  Chen Chen; Jie Hou; John J Tanner; Jianlin Cheng
Journal:  Int J Mol Sci       Date:  2020-04-20       Impact factor: 5.923

Review 6.  Exploiting Interdata Relationships in Next-generation Proteomics Analysis.

Authors:  Burcu Vitrinel; Hiromi W L Koh; Funda Mujgan Kar; Shuvadeep Maity; Justin Rendleman; Hyungwon Choi; Christine Vogel
Journal:  Mol Cell Proteomics       Date:  2019-05-24       Impact factor: 5.911

7.  Generation of ENSEMBL-based proteogenomics databases boosts the identification of non-canonical peptides.

Authors:  Husen M Umer; Enrique Audain; Yafeng Zhu; Julianus Pfeuffer; Timo Sachsenberg; Janne Lehtiö; Rui Branca; Yasset Perez-Riverol
Journal:  Bioinformatics       Date:  2021-12-14       Impact factor: 6.937

8.  Proteogenomic Analysis of Breast Cancer Transcriptomic and Proteomic Data, Using De Novo Transcript Assembly: Genome-Wide Identification of Novel Peptides and Clinical Implications.

Authors:  P S Hari; Lavanya Balakrishnan; Chaithanya Kotyada; Arivusudar Everad John; Shivani Tiwary; Nameeta Shah; Ravi Sirdeshmukh
Journal:  Mol Cell Proteomics       Date:  2022-02-26       Impact factor: 7.381

9.  Splice-Junction-Based Mapping of Alternative Isoforms in the Human Proteome.

Authors:  Edward Lau; Yu Han; Damon R Williams; Cody T Thomas; Rajani Shrestha; Joseph C Wu; Maggie P Y Lam
Journal:  Cell Rep       Date:  2019-12-10       Impact factor: 9.423

10.  Immunopeptidogenomics: Harnessing RNA-Seq to Illuminate the Dark Immunopeptidome.

Authors:  Katherine E Scull; Kirti Pandey; Sri H Ramarathinam; Anthony W Purcell
Journal:  Mol Cell Proteomics       Date:  2021-09-10       Impact factor: 5.911
