Toshihide Ono, Haretsugu Hishigaki.
Abstract
Understanding the coupling specificity between G protein-coupled receptors (GPCRs) and specific classes of G proteins is important for further elucidation of receptor functions within a cell. Increasing information on GPCR sequences and the G protein family would facilitate prediction of the coupling properties of GPCRs. In this study, we describe a novel approach for predicting the coupling specificity between GPCRs and G proteins. This method uses not only GPCR sequences but also the functional knowledge generated by natural language processing, and can achieve 92.2% prediction accuracy by using the C4.5 algorithm. Furthermore, rules related to GPCR-G protein coupling are generated. The combination of sequence analysis and text mining improves the prediction accuracy for GPCR-G protein coupling specificity, and also provides clues for understanding GPCR signaling.
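The abstract names C4.5 as the learning algorithm. C4.5 chooses decision-tree splits by gain ratio (information gain divided by the split information), and the following is a minimal sketch of that criterion for one threshold on one numeric feature. The toy scores and labels are hypothetical and are not data from the study; the function names are illustrative only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one numeric feature."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    weights = (len(left) / n, len(right) / n)
    info_gain = entropy(labels) - weights[0] * entropy(left) - weights[1] * entropy(right)
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info

# Hypothetical toy data: a text-derived "cAMP" score per receptor and its coupling class.
camp_scores = [30.1, 29.4, 3.2, 1.0, 0.5, 2.2]
coupling = ["Gs", "Gs", "Gq", "Gi", "Gi", "Gq"]
print(round(gain_ratio(camp_scores, coupling, threshold=28.61), 3))
```

In this toy set the threshold isolates the two Gs receptors cleanly, so the split scores highly; a full C4.5 implementation would compare such scores across all candidate features and thresholds and then prune the resulting tree.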
Year: 2006 PMID: 17531799 PMCID: PMC5054072 DOI: 10.1016/S1672-0229(07)60004-7
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Prediction Accuracy of GPCR-G Protein Coupling for Three G Protein Subfamilies
| Subfamily | No. of GPCR sequences | Sensitivity (%) | Specificity (%) |
|---|---|---|---|
| Gi | 84 | 96.4 | 96.4 |
| Gq | 33 | 90.9 | 98.3 |
| Gs | 36 | 83.3 | 93.2 |
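The record does not state how sensitivity and specificity were computed. Assuming the standard one-vs-rest definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)), a per-subfamily calculation could be sketched as follows. The labels below are hypothetical toy values, not results from the study.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive):
    """One-vs-rest sensitivity and specificity (as percentages) for one class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return 100 * tp / (tp + fn), 100 * tn / (tn + fp)

# Hypothetical toy labels, not data from the study.
truth = ["Gi", "Gi", "Gi", "Gq", "Gq", "Gs"]
predicted = ["Gi", "Gi", "Gq", "Gq", "Gq", "Gi"]
for family in ("Gi", "Gq", "Gs"):
    sens, spec = sensitivity_specificity(truth, predicted, family)
    print(f"{family}: sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```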
Total Prediction Accuracy for Different Prediction Orders
| 1st prediction | 2nd prediction | 3rd prediction | No. of correct predictions | Accuracy (%) |
|---|---|---|---|---|
| Gq | Gi | Gs | 141 | 92.2 |
| Gi | Gq | Gs | 139 | 90.8 |
| Gi | Gs | Gq | 139 | 90.8 |
| Gs | Gq | Gi | 139 | 90.8 |
| Gq | Gs | Gi | 138 | 90.2 |
| Gs | Gi | Gq | 137 | 89.5 |
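The accuracy column is consistent with dividing each count of correct predictions by 153, the sum of the sequence counts in the subfamily table (84 + 33 + 36). That denominator is an inference from the two tables rather than something the record states; the short check below reproduces the percentages under that assumption.

```python
# Sequence counts per subfamily, taken from the subfamily table.
sequences = {"Gi": 84, "Gq": 33, "Gs": 36}
total = sum(sequences.values())  # 153, assumed to be the size of the evaluation set

# Correct predictions for each prediction order, taken from the table above.
correct = {
    ("Gq", "Gi", "Gs"): 141,
    ("Gi", "Gq", "Gs"): 139,
    ("Gi", "Gs", "Gq"): 139,
    ("Gs", "Gq", "Gi"): 139,
    ("Gq", "Gs", "Gi"): 138,
    ("Gs", "Gi", "Gq"): 137,
}

for order, n in correct.items():
    print(" > ".join(order), f"{100 * n / total:.1f}%")
```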
Fig. 1. The total prediction accuracy using different feature sets.
Examples of the Rules Produced by the C4.5 Algorithm*
| Rule | Antecedent: feature of biological function | Antecedent: feature of sequence | Consequence: coupled G protein | Frequency of use | Error rate |
|---|---|---|---|---|---|
| 1 | calcium signaling pathway = 0 | pHMM_Gi_1 ≤ 0.0051; pHMM_Gi_7 ≤ 8.4e-5 | Gi | 66 | 0 |
| 2 | potassium ≤ 2.46 | pHMM_Gi_10 > 0.00086; pHMM_Gq_5 ≤ 2.3e-5 | Gq | 17 | 0 |
| 3 | cAMP ≤ 4.48; inositol phosphate > 11.17 | | Gq | 13 | 8% |
| 4 | 5-hydroxytryptamine ≥ 8.19; cAMP > 28.61 | | Gs | 14 | 7% |
*The biological function features are key words and other information extracted from the literature that relate to GPCR function; their values are scores calculated as described in Materials and Methods. The sequence features are profile HMMs (pHMMs) named in the format “pHMM_(Gα type)_(pHMM number)”; their values are the E-values obtained from hmmpfam. Frequency of use: the number of times the rule was used to discriminate coupling specificity. Error rate: the error rate of that discrimination.
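To make the rule format concrete, the four example rules can be encoded as data and checked against a feature dictionary. This is only a sketch: the thresholds are copied from the table, but the data layout, the `matching_rules` helper, the handling of missing features (treated as not satisfying a condition), and the example receptor are all assumptions made for illustration, not the authors' implementation.

```python
import operator

OPS = {"==": operator.eq, "<=": operator.le, ">": operator.gt, ">=": operator.ge}

# (rule number, conditions, coupled G protein); thresholds copied from the rules table.
RULES = [
    (1, [("calcium signaling pathway", "==", 0),
         ("pHMM_Gi_1", "<=", 0.0051),
         ("pHMM_Gi_7", "<=", 8.4e-5)], "Gi"),
    (2, [("potassium", "<=", 2.46),
         ("pHMM_Gi_10", ">", 0.00086),
         ("pHMM_Gq_5", "<=", 2.3e-5)], "Gq"),
    (3, [("cAMP", "<=", 4.48),
         ("inositol phosphate", ">", 11.17)], "Gq"),
    (4, [("5-hydroxytryptamine", ">=", 8.19),
         ("cAMP", ">", 28.61)], "Gs"),
]

def matching_rules(features):
    """Return (rule number, coupled G protein) for every rule whose conditions all hold.

    Assumption: a feature missing from `features` does not satisfy any condition.
    """
    return [
        (number, g_protein)
        for number, conditions, g_protein in RULES
        if all(name in features and OPS[op](features[name], value)
               for name, op, value in conditions)
    ]

# Hypothetical receptor described only by text-derived scores.
receptor = {"cAMP": 2.0, "inositol phosphate": 15.3}
print(matching_rules(receptor))
```

Only rule 3 fires for this hypothetical receptor, which would suggest Gq coupling. A real rule set would also need a policy for receptors that match several rules or none.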