Paul A Rudnick, Sanford P Markey, Jeri Roth, Yuri Mirokhin, Xinjian Yan, Dmitrii V Tchekhovskoi, Nathan J Edwards, Ratna R Thangudu, Karen A Ketchum, Christopher R Kinsinger, Mehdi Mesri, Henry Rodriguez, Stephen E Stein.
Abstract
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of both genomic and proteomic data is enabling proteogenomic study of reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these data sets were produced by 2D liquid chromatography-tandem mass spectrometry analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, and sequence databases), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data in four steps: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false-discovery rate-based filtering. For the phosphopeptide enrichment studies, the pipeline also produces localization scores using the PhosphoRS program. Quantitative information for each data set is specific to the sample processing: PSM and protein reports contain spectrum-level or gene-level ("rolled-up") precursor peak areas and spectral counts for label-free data, or reporter ion log-ratios for 4plex iTRAQ data. The reports are available in simple tab-delimited formats and, for the PSM reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data to enable comparisons between different samples and cancer types as well as across the major omics fields.
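As a rough illustration of steps (3) and (4) and of the gene-level "roll-up" for label-free data, the following is a minimal Python sketch and not the CDAP's actual implementation. The `PSM` record, its field names, the score scale, the simple decoy/target FDR estimate, and the 1% default threshold are all assumptions made for this example. Each PSM is assumed to have already been assigned to a single gene, so the parsimony step itself is not shown.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (field names are illustrative)."""
    spectrum_id: str
    peptide: str
    gene: str              # gene assigned upstream, e.g. by a parsimony step
    score: float           # assumed scale: higher means a better match
    is_decoy: bool         # True if the match is to a decoy sequence
    precursor_area: float  # label-free precursor peak area


def filter_by_fdr(psms, max_fdr=0.01):
    """Return target PSMs from the longest top-ranked list whose
    estimated FDR (decoys / targets) does not exceed max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]


def roll_up_to_genes(psms):
    """Aggregate spectrum-level values into per-gene spectral counts
    and summed precursor areas."""
    report = defaultdict(lambda: {"spectral_count": 0, "precursor_area": 0.0})
    for psm in psms:
        report[psm.gene]["spectral_count"] += 1
        report[psm.gene]["precursor_area"] += psm.precursor_area
    return dict(report)
```

Calling `roll_up_to_genes(filter_by_fdr(psms))` on a list of `PSM` records yields one dictionary entry per gene. Summing areas is only one possible roll-up rule; the abstract does not specify how the CDAP combines spectrum-level values.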
Keywords: CPTAC; bioinformatics; cancer; data analysis pipeline; proteomics data resource
Year: 2016 PMID: 26860878 PMCID: PMC5117628 DOI: 10.1021/acs.jproteome.5b01091
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466