Yukimitsu Yabuki, Takahiko Muramatsu, Takatsugu Hirokawa, Hidehito Mukai, Makiko Suwa.
Abstract
We describe a novel system, GRIFFIN (G-protein and Receptor Interaction Feature Finding INstrument), that predicts G-protein coupled receptor (GPCR) and G-protein coupling selectivity based on a support vector machine (SVM) and a hidden Markov model (HMM) with high sensitivity and specificity. Based on our assumption that whole structural segments of ligands, GPCRs and G-proteins are essential to determine GPCR and G-protein coupling, various quantitative features were selected for ligands, GPCRs and G-protein complex structures, and those parameters that are the most effective in selecting G-protein type were used as feature vectors in the SVM. The main part of GRIFFIN includes a hierarchical SVM classifier using the feature vectors, which is useful for Class A GPCRs, the major family. For the opsins and olfactory subfamilies of Class A and other minor families (Classes B, C, frizzled and smoothened), the binding G-protein is predicted with high accuracy using the HMM. Applying this system to known GPCR sequences, each binding G-protein is predicted with high sensitivity and specificity (>85% on average). GRIFFIN (http://griffin.cbrc.jp/) is freely available and allows users to easily execute this reliable prediction of G-proteins.
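The division of labour described in the abstract can be sketched as a small dispatch function. This is only an illustration of the stated routing (Class A to the SVM; opsins, olfactory receptors and Classes B, C, frizzled and smoothened to the HMM). The family labels and the function name `choose_predictor` are hypothetical and do not come from GRIFFIN itself.

```python
# Hypothetical family labels; the abstract names the families but no identifiers.
HMM_FAMILIES = {
    "opsin", "olfactory",            # Class A subfamilies handled by the HMM
    "class_b", "class_c",            # minor families
    "frizzled", "smoothened",
}


def choose_predictor(family: str) -> str:
    """Return the component ("SVM" or "HMM") the abstract assigns to a family."""
    key = family.strip().lower().replace(" ", "_")
    if key in HMM_FAMILIES:
        return "HMM"
    if key == "class_a":
        # Remaining Class A receptors go to the hierarchical SVM classifier.
        return "SVM"
    raise ValueError(f"family not covered by the abstract: {family!r}")
```

For example, `choose_predictor("Class A")` returns `"SVM"`, while `choose_predictor("Olfactory")` returns `"HMM"`.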
Year: 2005 PMID: 15980445 PMCID: PMC1160255 DOI: 10.1093/nar/gki495
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Feature quantities used in SVM training as feature vector elements
| Feature quantities from structural information of ligands and GPCRs |
|---|
| 1. Length of N-terminal loop |
| 2. Length of the first intracellular loop between TMH1 and TMH2 |
| 3. Length of the first extracellular loop between TMH2 and TMH3 |
| 4. Length of the second intracellular loop between TMH3 and TMH4 |
| 5. Length of the second extracellular loop between TMH4 and TMH5 |
| 6. Length of the third intracellular loop between TMH5 and TMH6 |
| 7. Length of the third extracellular loop between TMH6 and TMH7 |
| 8. Length of the C-terminal loop |
| 9. Averaged hydrophobicity of TMH1 |
| 10. Averaged hydrophobicity of TMH2 |
| 11. Averaged hydrophobicity of TMH3 |
| 12. Averaged hydrophobicity of TMH4 |
| 13. Averaged hydrophobicity of TMH5 |
| 14. Averaged hydrophobicity of TMH6 |
| 15. Averaged hydrophobicity of TMH7 |
| 16. Bit score calculated when a query is searched against a profile which is made from sequences of amine binding GPCRs |
| 17. Bit score calculated when a query is searched against a profile which is made from sequences of peptide binding GPCRs |
| 18. Existence of Pro on the position corresponding to the 170th residue on BOVIN rhodopsin |
| 19. Existence of Lys or Arg on the position corresponding to the 148th residue on BOVIN rhodopsin |
| 20. Molecular weight of the ligand |
| 21. Number of Lys or Arg corresponding to the 244th, 247th, 248th and 251st residues on the third intracellular loop of BOVIN rhodopsin |
| 22. Number of Lys or Arg corresponding to the 243rd, 244th, 247th, 248th and 251st residues on the third intracellular loop of BOVIN rhodopsin |
| 23. Number of Phe, His, Tyr or Trp residues in the third intracellular loop, counted from its C-terminal residue to the 9th residue toward its N-terminus |
| 24. Number of Asp or Glu that exist in the third intracellular loop |
TMH: transmembrane helix.
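As an illustration of features 1 to 15, the sketch below derives the eight loop lengths and the seven averaged helix hydrophobicities from a sequence and a set of helix boundaries. It makes several assumptions: the helix boundaries are supplied by the caller (the table does not say how GRIFFIN locates them), the Kyte-Doolittle hydropathy scale is used as a stand-in (the table does not name a scale), and all function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values. Assumed scale: the source does not name one.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def loop_lengths(seq_len, helices):
    """Features 1-8: N-terminal loop, six inter-helix loops, C-terminal loop.

    helices: seven (start, end) pairs, 1-based inclusive, ordered TMH1..TMH7.
    """
    if len(helices) != 7:
        raise ValueError("expected exactly seven transmembrane helices")
    lengths = [helices[0][0] - 1]                      # residues before TMH1
    for (_, prev_end), (next_start, _) in zip(helices, helices[1:]):
        lengths.append(next_start - prev_end - 1)      # residues between helices
    lengths.append(seq_len - helices[-1][1])           # residues after TMH7
    return lengths


def mean_hydrophobicity(seq, start, end):
    """Features 9-15: average hydropathy over one helix (1-based inclusive)."""
    segment = seq[start - 1:end]
    return sum(KD[aa] for aa in segment) / len(segment)


def structural_features(seq, helices):
    """Concatenate features 1-15 in the order used by the table."""
    hydro = [mean_hydrophobicity(seq, s, e) for s, e in helices]
    return loop_lengths(len(seq), helices) + hydro
```

The remaining features (profile bit scores, residue checks at rhodopsin-aligned positions, ligand molecular weight) need an alignment or external input and are not covered by this sketch.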
Prediction accuracy of the SVM part, evaluated on 132 GPCRs
| G-protein type | n | Sensitivity (%) | Specificity (%) | Cross-validation folds | Best kernel function |
|---|---|---|---|---|---|
| Gi/o | 61 | 77.0 | 78.3 | 4 | RBF |
| Gq/11 | 47 | 68.1 | 72.7 | 4 | RBF |
| Gs | 24 | 83.3 | 95.2 | 4 | RBF |
Prediction accuracy of the SVM part, evaluated on 108 GPCRs
| G-protein type | n | Sensitivity (%) | Specificity (%) | Cross-validation folds | Best kernel function |
|---|---|---|---|---|---|
| Gi/o | 61 | 91.8 | 94.9 | 4 | Polynomial |
| Gq/11 | 47 | 93.6 | 89.8 | 4 | Polynomial |
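The percentages in the 108-GPCR table can be reproduced from a two-class confusion matrix if sensitivity is taken as TP/(TP+FN) and specificity as TP/(TP+FP). Both the definitions and the counts below are inferred from the table, since the record does not state them: 56 of 61 Gi/o and 44 of 47 Gq/11 receptors correctly assigned is the combination that matches all four reported values. The sketch is a consistency check under those assumptions, and the function name is hypothetical.

```python
# Keys are (true type, predicted type). Counts are inferred from the table,
# not reported in the source.
confusion = {
    ("Gi/o", "Gi/o"): 56, ("Gi/o", "Gq/11"): 5,
    ("Gq/11", "Gq/11"): 44, ("Gq/11", "Gi/o"): 3,
}


def per_class_metrics(confusion, label):
    """Return (sensitivity %, specificity %) for one G-protein type.

    Assumed definitions: sensitivity = TP / (TP + FN),
    specificity = TP / (TP + FP).
    """
    tp = confusion[(label, label)]
    fn = sum(n for (true, pred), n in confusion.items()
             if true == label and pred != label)
    fp = sum(n for (true, pred), n in confusion.items()
             if true != label and pred == label)
    sens = 100.0 * tp / (tp + fn)
    spec = 100.0 * tp / (tp + fp)
    return round(sens, 1), round(spec, 1)
```

With these counts, `per_class_metrics(confusion, "Gi/o")` gives `(91.8, 94.9)` and `per_class_metrics(confusion, "Gq/11")` gives `(93.6, 89.8)`, matching the table rows.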
Prediction accuracy of the HMM part, evaluated with 4-fold cross-validation
| Family | G-protein type | Sensitivity (%) | Specificity (%) | Threshold of bit score |
|---|---|---|---|---|
| Opsin | Gt | 99.7 | 100.0 | 153.9 |
| Olfactory | Golf | 100.0 | 100.0 | 151.2 |
| Class B | Gs | 100.0 | 100.0 | 68.0 |
| Class C | Gi/o | 93.5 | 100.0 | 1054.6 |
| Class C | Gq/11 | 100.0 | 100.0 | 1325.3 |
| Frizzled | Unclear | 100.0 | 100.0 | 168.7 |
| Smoothened | Unclear | 100.0 | 100.0 | 627.6 |
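A minimal sketch of how the bit-score thresholds in this table might be applied to a query. It assumes that a profile counts as a hit when the query's bit score is at or above the listed threshold. The record does not say whether the comparison is strict, nor how competing hits are resolved, so the function simply returns every profile that passes. The names `THRESHOLDS` and `hmm_calls` are hypothetical.

```python
# (family, G-protein type, bit-score threshold), copied from the table above.
THRESHOLDS = [
    ("Opsin", "Gt", 153.9),
    ("Olfactory", "Golf", 151.2),
    ("Class B", "Gs", 68.0),
    ("Class C", "Gi/o", 1054.6),
    ("Class C", "Gq/11", 1325.3),
    ("Frizzled", "Unclear", 168.7),
    ("Smoothened", "Unclear", 627.6),
]


def hmm_calls(bit_scores):
    """Return the (family, G-protein) profiles whose threshold the query meets.

    bit_scores maps (family, G-protein) to the query's bit score against that
    profile; profiles missing from the mapping are treated as not searched.
    Assumption: score >= threshold counts as a hit.
    """
    hits = []
    for family, g_protein, threshold in THRESHOLDS:
        score = bit_scores.get((family, g_protein))
        if score is not None and score >= threshold:
            hits.append((family, g_protein))
    return hits
```

For example, a query scoring 70.0 against the Class B profile and 100.0 against the opsin profile yields `[("Class B", "Gs")]`, because only the Class B threshold (68.0) is met.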
Figure 1. A flowchart of the integrated system for predicting GPCR–G-protein coupling selectivity.
Figure 2. (a) The top of the GRIFFIN website, where the GPCR sequence and ligand molecular weight can be entered. (b) The result page of a GRIFFIN calculation, where the predicted G-proteins of the user-defined sequence are indicated together with physicochemical parameters used in the SVM or HMM calculation.