Rui-Xiang Sun, Lan Luo, Long Wu, Rui-Min Wang, Wen-Feng Zeng, Hao Chi, Chao Liu, Si-Min He.
Abstract
There has been tremendous progress in top-down proteomics (TDP) in the past 5 years, particularly in intact protein separation and high-resolution mass spectrometry. However, bioinformatics for large-scale mass spectra has lagged behind, in both algorithmic research and software development. In this study, we developed pTop 1.0, a novel software tool that significantly improves the accuracy and efficiency of mass spectral data analysis in TDP. The precursor mass offers crucial clues for inferring the post-translational modifications that may co-occur on a protein, and the reliability of that inference depends heavily on the accuracy of the precursor mass. To detect precursors more accurately, pTop trains a machine-learning model online, using a support vector machine (SVM) that incorporates a variety of spectral features. pTop uses sequence tags extracted from the MS/MS spectra together with a dynamic programming algorithm to speed up the search, especially for spectra with multiple post-translational modifications. We tested pTop on three publicly available data sets and compared it with ProSight and MS-Align+ in terms of recall, precision, running time, and other measures. The results showed that pTop can, in general, outperform ProSight and MS-Align+. On a human histone data set, pTop recalled 22% more correct precursors than Xtract (in ProSight), although it exported 30% fewer precursors. pTop ran about 1 to 2 orders of magnitude faster than MS-Align+. This algorithmic advance in pTop, in both accuracy and speed, will inspire the development of other similar software for analyzing the mass spectra of intact proteins.
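The abstract does not describe how pTop extracts sequence tags, so the following is only a generic sketch of the idea, not pTop's implementation. It assumes the MS/MS spectrum has already been deconvoluted into a list of neutral monoisotopic fragment masses, and it chains peaks whose mass gaps match a single amino acid residue. The function name `extract_tags`, the tolerance `tol`, and the example masses are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Monoisotopic residue masses in Da. Leu and Ile are isobaric, so only "L" is listed.
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
MAX_RESIDUE = max(RESIDUE_MASS.values())


def extract_tags(masses: List[float], tag_len: int = 3, tol: float = 0.01) -> List[str]:
    """Return all residue strings of length tag_len spelled by consecutive mass gaps.

    Hypothetical helper: `masses` are neutral fragment masses and `tol` is an
    absolute tolerance in Da. Both are assumptions of this sketch.
    """
    peaks = sorted(masses)
    n = len(peaks)

    # edges[i] holds (j, residue) whenever peaks[j] - peaks[i] matches one residue.
    edges: List[List[Tuple[int, str]]] = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            gap = peaks[j] - peaks[i]
            if gap > MAX_RESIDUE + tol:
                break  # peaks are sorted, so later gaps are larger still
            for aa, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol:
                    edges[i].append((j, aa))

    tags: Set[str] = set()

    def walk(i: int, prefix: str) -> None:
        # Depth-first walk over the peak graph, collecting paths of tag_len edges.
        if len(prefix) == tag_len:
            tags.add(prefix)
            return
        for j, aa in edges[i]:
            walk(j, prefix + aa)

    for start in range(n):
        walk(start, "")
    return sorted(tags)


# Illustrative ladder: a base mass followed by gaps for G, A, and S.
ladder = [1000.0, 1057.02146, 1128.05857, 1215.09060]
tags3 = extract_tags(ladder, tag_len=3)
tags2 = extract_tags(ladder, tag_len=2)
```

With this ladder, the length-2 tags include "QS" alongside "GA" and "AS", because the combined mass of G and A is nearly identical to that of Q. Ambiguities of this kind are one reason a tag is typically used to narrow down candidate proteins rather than to identify one directly.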
Year: 2016 PMID: 26844380 DOI: 10.1021/acs.analchem.5b03963
Source DB: PubMed Journal: Anal Chem ISSN: 0003-2700 Impact factor: 6.986