Text Mining and Machine Learning Protocol for Extracting Human-Related Protein Phosphorylation Information from PubMed.

Krishnamurthy Arumugam, Raja Ravi Shanker.

Abstract

Protein phosphorylation has attracted enormous attention in modern health care research worldwide, and studying it requires automated approaches that can process the huge volume of data on proteins and their modifications at the cellular level. Data generated at the cellular level are unique as well as arbitrary, and the accumulation of a massive volume of information is inevitable. Biological research has revealed that the wide array of cellular communication aided by protein phosphorylation and similar mechanisms carries different and diverse meanings. This has led to the collection of a huge volume of data for understanding the biological functions involved in human evolution, especially for combating diseases more effectively. Text mining, an automated approach to mining information from unstructured data, is applied to extract protein phosphorylation information from biomedical literature databases such as PubMed. This chapter outlines a recent text mining protocol that applies natural language parsing (NLP) for named entity recognition and text processing, and support vector machines (SVM), a machine learning algorithm, for classifying the processed text related to human protein phosphorylation. We discuss the evaluation of the text mining system that results from the protocol on three corpora, namely the human Protein Phosphorylation (hPP) corpus, the Integrated Protein Literature Information and Knowledge corpus (iProLink), and the Phosphorylation Literature corpus (PLC). We also present a basic understanding of the chemistry and biology that drive the protein phosphorylation process in the human body. We believe that this basic understanding will be useful for advancing existing text mining systems for extracting protein phosphorylation information from PubMed.
© 2022. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.
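
The classification stage described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: a simple stop-word filter and bag-of-words counts stand in for the NLP and named entity recognition stage, the linear SVM is trained with plain subgradient descent on the regularized hinge loss, and the six labelled sentences, the `STOPWORDS` set, and all function names and hyperparameters are assumptions made up for illustration (they are not drawn from the hPP, iProLink, or PLC corpora).

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a proper NLP toolkit.
STOPWORDS = {"the", "is", "was", "at", "by", "in", "on", "of", "for"}

# Toy sentences written for this sketch: +1 = phosphorylation-related, -1 = not.
TRAIN = [
    ("AKT1 is phosphorylated at Ser473 by mTORC2 in human cells", 1),
    ("ERK2 phosphorylates ELK1 on serine 383", 1),
    ("Phosphorylation of p53 at Ser15 by ATM kinase", 1),
    ("The protein was purified by affinity chromatography", -1),
    ("Gene expression was measured in mouse liver tissue", -1),
    ("Patients received the drug for twelve weeks", -1),
]


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def build_vocab(texts):
    """Sorted list of every distinct token seen in the training texts."""
    return sorted({tok for text in texts for tok in tokenize(text)})


def featurize(text, vocab):
    """Bag-of-words count vector aligned with the vocabulary order."""
    counts = Counter(tokenize(text))
    return [float(counts[tok]) for tok in vocab]


def score(w, b, x):
    """Signed output of the linear decision function w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Subgradient descent on lam/2 * ||w||^2 + hinge loss, one sample at a time."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score(w, b, xi) < 1.0
            # The L2 regularizer shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if violated:
                # The hinge term pushes a margin-violating sample to its side.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


texts = [t for t, _ in TRAIN]
labels = [label for _, label in TRAIN]
VOCAB = build_vocab(texts)
W, B = train_linear_svm([featurize(t, VOCAB) for t in texts], labels)


def predict(text):
    """Return +1 (phosphorylation-related) or -1 for a sentence."""
    return 1 if score(W, B, featurize(text, VOCAB)) > 0 else -1
```

Calling `predict("STAT3 is phosphorylated by JAK2 kinase")` scores a new sentence using only the tokens it shares with the training vocabulary; unseen tokens are ignored. A realistic system would replace the count features with output from an entity recognizer and would be evaluated on held-out annotated abstracts rather than on its own training sentences.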

Keywords:  Machine learning; Natural language parsing; Protein phosphorylation; Support vector machine; Text mining

Year:  2022        PMID: 35713864     DOI: 10.1007/978-1-0716-2305-3_9

Source DB:  PubMed          Journal:  Methods Mol Biol        ISSN: 1064-3745

