Kemal Sonmez, Naunihal T Zaveri, Ilan A Kerman, Sharon Burke, Charles R Neal, Xinmin Xie, Stanley J Watson, Lawrence Toll.
Abstract
There are currently a large number of "orphan" G-protein-coupled receptors (GPCRs) whose endogenous ligands (peptide hormones) are unknown. Identification of these peptide hormones is a difficult and important problem. We describe a computational framework that models spatial structure along the genomic sequence simultaneously with the temporal evolutionary path structure across species and show how such models can be used to discover new functional molecules, in particular peptide hormones, via cross-genomic sequence comparisons. The computational framework incorporates a priori high-level knowledge of structural and evolutionary constraints into a hierarchical grammar of evolutionary probabilistic models. This computational method was used for identifying novel prohormones and their processed peptide sites by producing sequence alignments across many species at the functional-element level. Experimental results with an initial implementation of the algorithm were used to identify potential prohormones by comparing the human and non-human proteins in the Swiss-Prot database of known annotated proteins. In this proof of concept, we identified 45 out of 54 prohormones with only 44 false positives. The comparison of known and hypothetical human and mouse proteins resulted in the identification of a novel putative prohormone with at least four potential neuropeptides. Finally, in order to validate the computational methodology, we present the basic molecular biological characterization of the novel putative peptide hormone, including its identification and regional localization in the brain. This species-comparison, HMM-based computational approach succeeded in identifying a previously undiscovered neuropeptide from whole-genome protein sequences. This novel putative peptide hormone is found in discrete brain regions as well as in other organs.
The success of this approach will have a great impact on our understanding of GPCRs and associated pathways and help to identify new targets for drug development.
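The proof-of-concept counts in the abstract (45 of 54 known prohormones recovered, 44 false positives) can be restated as sensitivity and precision. The short Python sketch below only does that arithmetic; the function name `detection_metrics` is hypothetical and is not taken from the paper.

```python
def detection_metrics(true_positives, total_known, false_positives):
    """Return (sensitivity, precision) for a database scan.

    sensitivity = hits among the known prohormones / all known prohormones
    precision   = hits among the known prohormones / all reported hits
    """
    sensitivity = true_positives / total_known
    precision = true_positives / (true_positives + false_positives)
    return sensitivity, precision


# Counts as reported in the abstract for the Swiss-Prot comparison.
sensitivity, precision = detection_metrics(45, 54, 44)
print(f"sensitivity = {sensitivity:.3f}")
print(f"precision   = {precision:.3f}")
```

Under these counts, roughly five in six known prohormones are recovered, and about half of all reported hits are known prohormones.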
Year: 2009 PMID: 19132080 PMCID: PMC2603333 DOI: 10.1371/journal.pcbi.1000258
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Figure 1. Prohormone hierarchical grammar of evolutionary MPHMM modules.
Figure 2. Hierarchical functional-element multiple alignment of Pronociceptin across human, chimpanzee, mouse, rat, and cow.
Figure 3. HIGHER MPHMM modules.
(a) Signal sequence, (b) cleavage site, and (c) peptide/divergent region modules.
Modules and their abbreviations.
| Functional Element | Symbol |
| Signal sequence | SS |
| Cleavage site (double basic) | CSd |
| Cleavage site (single basic) | CSs |
| Peptide hormone region | PR |
| Divergent region | DR |
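To make the cleavage-site element concrete, the following Python sketch locates double-basic sites in a protein sequence with a regular expression and splits the sequence at them. It is a deliberately simplified stand-in, not the MPHMM modules described in the paper: it assumes that "basic" means lysine (K) or arginine (R), it ignores signal sequences and evolutionary conservation, and the sequence `toy` and both function names are invented for illustration.

```python
import re

# Assumption: a double-basic cleavage site (CSd) is any pair of adjacent
# K/R residues. The lookahead lets overlapping pairs be reported too.
DIBASIC = re.compile(r"(?=[KR]{2})")


def find_dibasic_sites(seq):
    """Return the 0-based start position of every adjacent K/R pair."""
    return [m.start() for m in DIBASIC.finditer(seq)]


def split_at_dibasic(seq):
    """Split at runs of two or more basic residues; drop empty fragments."""
    return [frag for frag in re.split(r"[KR]{2,}", seq) if frag]


# Invented toy sequence, not a real prohormone.
toy = "AAGKRFGGFTGARKSAKKLANQ"

sites = find_dibasic_sites(toy)
fragments = split_at_dibasic(toy)
print(sites)
print(fragments)
```

A scan like this flags many sites that are never cleaved in vivo, which is one reason a probabilistic, cross-species model is attractive for this problem.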
Figure 4. Multiple alignment of functional element sequences across genomes.
Matches Found in Swiss-Prot Database.
| Hormones | Sequence Matching “Hits” | Hormones | Sequence Matching “Hits” |
| ACTH | x | MCH (melanin concentrating hormone) | |
| ADM (adrenomedullin) | x | Motilin | x |
| Agouti-related peptides | x | MSH (melanocyte stimulating hormone) | x |
| Amylin | x | Neuromedin U | x |
| ANP (atrial natriuretic peptide) | | Neurotensin | x |
| Apelin | | Neurturin | x |
| Calcitonin | x | Nociceptin | x |
| CART (cocaine and amphetamine regulated transcript) | x | NPY (neuropeptide Y) | x |
| CCK (cholecystokinin) | x | Orexins | x |
| CGRP (calcitonin gene-related peptide) | x | Oxytocin | x |
| CNP (C-type natriuretic factor) | x | PACAP (pituitary adenylate cyclase activating polypeptide) | x |
| Cortistatin | | PPY (pancreatic hormone) | x |
| CRF (corticotropin releasing factor) | x | PHI (same precursor with VIP) | x |
| Dynorphin | x | PrRP (prolactin-releasing peptide) | |
| β-Endorphin | x | PTH (parathyroid hormone) | x |
| Endothelin 1 | x | PTH-RP (parathyroid hormone-related protein) | |
| Endothelin 2 | x | PYY (peptide YY) | x |
| Endothelin 3 | x | Secretin | x |
| Enkephalin | x | Somatostatin | |
| Galanin | x | Substance K ( = neurokinin A) | |
| Gastrin | x | Substance P | x |
| Glucagon | x | TEGT (testis enhanced gene transcript) | x |
| GRF (growth hormone releasing factor) | x | TRH (thyrotropin-releasing hormone) | x |
| GRP (gastrin releasing peptide) | | Vasopressin | x |
| Guanylin | | VIP (vasoactive intestinal peptide) | x |
| LHRH1 (luteinizing hormone releasing hormone) | x | PSP94 (prostate secretory protein) | x |
False Positives
Other signaling molecules: FGF-3,5,7,10,17,18; GDNF; CD8,28; PDGF-2; TGF; VEGF (vascular endothelial growth factor); HBNF-1; MIP; NGF (nerve growth factor); Cytokine A21; IFN-α (interferon alpha); IGF binding protein 1B,2,3; IL7 (interleukin 7).
Other: MAGF (microfibril associated protein), MINK (K-channel), K-channel related peptide, L-type Ca2+ channel, gamma subunit, myelin Po protein, Dif-2, Eosinophil, Syntaxin 1B (vesicle docking), Syntaxin 2, TMP21 (vesicle trafficking protein), Coagulation factor III, PGD2 synthase, syndecans, FKBP12 (FK506 binding protein), Folate receptor, ERp29, COMT, Connexin 32, Cytostatin.
Figure 5. Amino acid sequence of preproNPQ.
Sequences shown were obtained from GenBank. The human and rat sequences were verified by nucleotide sequencing as described in Materials and Methods. Putative neuropeptides are highlighted; they begin at the end of the signal sequence and end at the fourth set of basic residues. Residues that are not conserved between human and other species are in bold.
Figure 6. Northern Blot Analysis of preproNPQ mRNA.
Ambion's First Choice Human Blot was prehybridized and probed with human NPQ cDNA prepared from the human DNA clone in pOTB7 vector from ATCC (Cat # 6710068, Manassas, VA). This clone contained the putative sequence for human NPQ. Random-prime labeling using 32P-dCTP and Klenow DNA polymerase was conducted as described in Materials and Methods. Lanes: 1. Brain, 2. Placenta, 3. Skeletal muscle, 4. Heart, 5. Kidney, 6. Pancreas, 7. Liver, 8. Lung, 9. Spleen, 10. Colon.
Figure 7. In situ hybridization of preproNPQ mRNA.
Expression of preproNPQ mRNA in the rat brain at the level of Barrington's nucleus and locus coeruleus. In situ hybridizations (ISHs) for preproNeuropeptide Q (NPQ; A, D, G, and J), corticotropin-releasing factor (CRF; B), tyrosine hydroxylase (TH; E), choline acetyltransferase (ChAT; H), and tryptophan hydroxylase 2 (TPH2; K) were carried out on adjacent 10 µm-thick sections of the rat brain. ISH autoradiograms were digitized; images were then inverted and pseudocolored according to the following scheme: NPQ – green, CRF – red, TH – cyan, ChAT – white, and TPH2 – blue. To determine whether the NPQ signal overlapped with any of the other signals, the sections were aligned and overlaid with each other (C, F, I, L). The arrow in panel A indicates the location of NPQ mRNA, while the arrow in panel B indicates the location of CRF mRNA; note the mixing of red and green (to yield yellow) in panel C (arrow), which suggests co-localization of NPQ and CRF. The arrow in panel E indicates the locus coeruleus and its TH-positive neurons. Panel F shows that TH and NPQ signals are spatially very close without overlap. The arrow in panel H indicates the cholinergic laterodorsal tegmental nucleus, while panel I illustrates the close spatial relationship between ChAT and NPQ mRNAs. At this level of the neuraxis there is little overlap between TPH2 mRNA (blue signal in panel K, which represents serotonergic neurons) and NPQ (L).
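The invert, pseudocolor, and overlay steps described in this caption can be illustrated with a few lines of pure Python. This is a minimal sketch under stated assumptions, not the authors' image-processing pipeline: images are tiny hand-made 8-bit grayscale grids, dark film pixels are assumed to represent signal before inversion, the overlay is a simple additive blend with clipping, and all function and variable names are hypothetical.

```python
def invert(gray):
    """Invert an 8-bit grayscale image given as a list of rows."""
    return [[255 - v for v in row] for row in gray]


def pseudocolor(gray, rgb_weights):
    """Map each intensity to an RGB tuple; (0, 1, 0) gives green, (1, 0, 0) red."""
    return [[tuple(int(v * w) for w in rgb_weights) for v in row] for row in gray]


def overlay(img_a, img_b):
    """Additively blend two equally sized RGB images, clipping each channel at 255."""
    return [
        [tuple(min(255, a + b) for a, b in zip(pa, pb)) for pa, pb in zip(ra, rb)]
        for ra, rb in zip(img_a, img_b)
    ]


# Toy 2x2 "autoradiograms" (assumption: 0 = fully exposed film, i.e. signal).
npq_film = [[0, 255], [255, 255]]
crf_film = [[0, 255], [255, 0]]

npq = pseudocolor(invert(npq_film), (0, 1, 0))  # NPQ shown in green
crf = pseudocolor(invert(crf_film), (1, 0, 0))  # CRF shown in red
merged = overlay(npq, crf)

print(merged[0][0])  # pixel where both signals are present
print(merged[1][1])  # pixel where only the CRF signal is present
```

In this toy case the pixel carrying both signals comes out with full red and green channels, which is the yellow mixing the caption points to in panel C.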
Figure 8. In situ hybridization of preproNPQ mRNA.
Expression of preproNPQ mRNA at the level of the caudal ventrolateral periaqueductal gray (PAG). ISH autoradiograms were digitized and pseudocolored according to the same scheme as in Figure 7. NPQ signal was visible in the ventrolateral quadrant of the PAG as well as within the underlying reticular formation (A, D, G). ISHs for TH (B), ChAT (E) and TPH2 (H) were carried out on adjacent sections. The arrow in panel B indicates the location of dopaminergic TH-positive neurons of the ventrolateral PAG that appear to overlap with a subset of NPQ mRNA (C). There is also a close spatial relationship between NPQ and ChAT (F) and between NPQ and TPH2 (I). Abbreviations are the same as in Figure 7.