Abstract
Mass spectrometry (MS)-based shotgun proteomics is an effective technology for global proteome profiling. The ultimate goal is to assign tandem MS spectra to peptides and subsequently infer proteins and their abundance. In addition to database searching and protein assembly algorithms, computational approaches have been developed to integrate genomic, transcriptomic, and interactome information to improve peptide and protein identification. Earlier efforts focused primarily on making databases more comprehensive using publicly available genomic and transcriptomic data. More recently, with the increasing affordability of next-generation sequencing (NGS) technologies, personalized protein databases derived from sample-specific genomic and transcriptomic data have emerged as an attractive strategy. In addition, incorporating interactome data not only improves protein identification but also puts identified proteins into their functional context and thus facilitates data interpretation. In this paper, we survey the major integrative bioinformatics approaches that have been developed during the past decade and discuss their merits and demerits.
Year: 2014 PMID: 24792918 PMCID: PMC4059263 DOI: 10.1021/pr500194t
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466
Figure 1. A typical workflow of shotgun proteomics.
List of Published Orthogonal Data Assisted Proteomics Studies
| Study | Approach | Reference |
|---|---|---|
| Genomic Information | | |
| Choudhary et al. | six-frame translation using the draft of human genome | ( |
| Fermin et al. | six-frame translation of whole human genome | ( |
| Sevinsky et al. | six-frame translation of whole human genome | ( |
| | peptide isoelectric point (pI) | |
| Bitton et al. | prescreening searches on databases translated from individual chromosomes; matched entries were then combined with the Celera database entries and used for a second search | ( |
| Mo et al. | exon–exon junction database | ( |
| Power et al. | noncontiguous junction peptides in a “full length transcript” | ( |
| Gatlin et al. | generating dynamically all possible SNPs | ( |
| Roth et al. | creating a highly annotated database, including splicing, PTMs, and SNPs | ( |
| Bunger et al. | reference protein database | ( |
| | tryptic peptide database created from dbSNP | |
| | peptide pI | |
| Schandorff et al. | elongating IPI sequences with theoretical N-terminal peptides, variant peptides from cSNP, variant peptides from conflict annotation in Swiss-Prot, and proteolytic enzyme and keratin sequences | ( |
| Xi et al. | human disease-related variants from OMIM, PMD, and Swiss-Prot | ( |
| Nijveen et al. | 20-mer variant peptides generated by three-frame translation from mRNA sequences including SNPs in dbSNP | ( |
| Li et al. | combined database of normal proteins and variant peptides | ( |
| | modified FDR estimation | |
| Su et al. | a pipeline of nontargeted proteomics for identifying SAP peptides in human plasma and quantifying them using targeted proteomics | ( |
| Khatun et al. | whole genome proteogenomic mapping to identify novel protein coding regions for ENCODE cell line proteomics data | ( |
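
Several of the studies listed above build their search databases by six-frame translation of genomic sequence. The following is a minimal sketch of that idea, assuming the standard genetic code and plain uppercase or lowercase DNA input; the function names (`reverse_complement`, `translate`, `six_frame_translation`) are illustrative and are not taken from any of the cited tools.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# no alternative translation tables are needed for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate both strands in all three reading frames.

    Keys are hypothetical frame labels: '+1'..'+3' for the forward
    strand and '-1'..'-3' for the reverse-complement strand.
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGGCCTAA").items():
        print(label, protein)
```

In a real pipeline, each translated frame would typically be split at stop codons (`*`) and the resulting open reading frames written to a FASTA file for the search engine; that step is omitted here.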
Figure 2. Orthogonal data assisted proteomics studies.
Figure 3. Methods for increasing database completeness using publicly available genomic and transcriptomic data.