Daryanaz Dargahi, David Baillie, Frederic Pio.
Abstract
BACKGROUND: The C. elegans genome has been extensively annotated by the WormBase consortium, which uses state-of-the-art bioinformatics pipelines, functional genomics and manual curation approaches. As a result, the in silico identification of novel genes in this model organism is becoming more challenging and requires new approaches. The oligonucleotide-oligosaccharide binding (OB) fold is a highly divergent protein family in which protein sequences, in spite of having the same fold, share very little sequence identity (5-25%). Therefore, evidence from sequence-based annotation may not be sufficient to identify all the members of this family. In C. elegans, the number of OB-fold proteins reported is remarkably low (n=46) compared to other evolutionarily related eukaryotes, such as the yeast S. cerevisiae (n=344) or the fruit fly D. melanogaster (n=84). Gene loss during evolution or differences in the level of annotation for this protein family may explain these discrepancies. METHODOLOGY/PRINCIPAL
Year: 2013 PMID: 23638006 PMCID: PMC3636199 DOI: 10.1371/journal.pone.0062204
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Tools used in this study.
| Tools | Description | Reference |
|---|---|---|
| PSI-BLAST | Position-Specific Iterative Basic Local Alignment Search Tool | |
| MEME | Motif-based hidden Markov model of protein families | |
| HMMER | Bio-sequence analysis tool using profile hidden Markov models | |
| HHpred | Homology detection and structure prediction tool based on HMM-HMM comparison | |
| COMPASS | Alignment tool for multiple protein sequence profiles | |
| HHsenser | Exhaustive intermediate profile search tool using HMM-HMM comparison | |
| Saturated-BLAST | Automated toolbox that implements the multiple intermediate sequence search method | |
| MetaServer | Server that submits and collects fold recognition results from different methods and makes a 3D prediction using a consensus approach called 3D-Jury | |
| I-Tasser | Protein 3D-structure prediction server that uses threading methods | |
| Modeller | Protein 3D-structure modeling tool that works from a target-template sequence alignment by satisfaction of spatial restraints | |
| TM-Align | Protein 3D-structure alignment algorithm that computes the TM-score | |
| BioGrid | Database of protein and genetic interactions | |
| STRING | Database of functional protein association networks | |
| Worm Interactome | High-quality yeast two-hybrid protein-protein interaction database of C. elegans | |
| WoLF PSORT | Protein subcellular localization predictor | |
| Kihara PFP | Protein function predictor | |
Figure 1. Superimposition of the novel OB-fold 3D models with their templates.
Light blue: predicted 3D models; wheat: PDB templates. Labels of the form (.XXXX.)-nxxx give the protein name followed by the PDB code of the template.
Model quality of novel OB-fold protein coding genes.
| OB-fold candidate (target) | Template (PDB code) | RMSD (Å) | TM-score | Equivalent Cα superimposed |
|---|---|---|---|---|
| | 3E0J | 0.9 | 0.91618 | 104/110 |
| | 2QGQ | 1.08 | 0.79684 | 57/64 |
| | 2Z0S | 0.39 | 0.91855 | 80/86 |
| | 2Z0S | 1.33 | 0.93357 | 66/66 |
| | 2Z0S | 0.97 | 0.83856 | 76/85 |
| | 2IX1 | 2.15 | 0.77503 | 81/92 |
| | 1UEB | 3.66 | 0.51393 | 76/98 |
| | 3BBN | 1.22 | 0.90075 | 43/45 |
| | 1HZA | 1.88 | 0.8186 | 77/82 |
| | 2C35 | 1.27 | 0.91183 | 58/59 |
| | 1XJV | 1.98 | 0.81487 | 61/61 |
| | 1L1O | 1.11 | 0.89915 | 128/135 |
| | 1XJV | 3.43 | 0.43998 | 74/115 |
| | 3MXN | 1 | 0.83903 | 115/133 |
| | 1XJV | 1.49 | 0.90455 | 132/139 |
| | 3KJO | 0.4 | 0.86529 | 110/126 |
| | 2QGQ | 1.03 | 0.83313 | 115/135 |
| | 1YZ6 | 0.62 | 0.89597 | 56/60 |
| | 3E0J | 2.27 | 0.64349 | 66/81 |
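
As a reading aid for the model-quality values above, the following minimal Python sketch computes the fraction of Cα atoms superimposed and flags models whose TM-score falls below 0.5, a cutoff commonly read as indicating that two structures share the same fold. The cutoff, the `summarize` helper, and the tuple layout are assumptions made for illustration and are not part of the study's pipeline; only three rows of the table are reproduced.

```python
# Assumed cutoff: a TM-score above ~0.5 is commonly interpreted as "same fold".
SAME_FOLD_TM_SCORE = 0.5

# (template PDB code, RMSD, TM-score, "superimposed/total" C-alpha atoms),
# copied from three rows of the model-quality table.
ROWS = [
    ("3E0J", 0.90, 0.91618, "104/110"),
    ("1UEB", 3.66, 0.51393, "76/98"),
    ("1XJV", 3.43, 0.43998, "74/115"),
]


def summarize(row, cutoff=SAME_FOLD_TM_SCORE):
    """Hypothetical helper: derive coverage and a same-fold flag for one row."""
    template, rmsd, tm_score, aligned = row
    n_aligned, n_total = (int(part) for part in aligned.split("/"))
    return {
        "template": template,
        "rmsd": rmsd,
        "coverage": n_aligned / n_total,  # fraction of C-alpha atoms superimposed
        "same_fold": tm_score > cutoff,
    }


summaries = [summarize(row) for row in ROWS]
for s in summaries:
    print(f"{s['template']}: coverage={s['coverage']:.2f}, same_fold={s['same_fold']}")
```

Under this assumed cutoff, only the 1XJV-based model with a TM-score of 0.43998 would be flagged for closer inspection; a different cutoff would change which rows are flagged.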
Functional analysis of novel OB-fold protein-coding genes.
| OB folds | WB ID | Biblio | RNAi Phenotype | Knockout | Function | Homologues | Paralogues |
|---|---|---|---|---|---|---|---|
| F12F6.7 | WBGene00008722 | NA | Embryonic lethal | ok2252 | DNA replication, DNA binding, DNA-directed DNA polymerase activity | POLD2 polymerase (DNA directed), delta 2, regulatory subunit 50kDa | NA |
| F25B5.5 | WBGene00017776 | NA | NA | NA | RNA modification, iron-sulfur cluster binding, 4 iron, 4 sulfur cluster binding, catalytic activity | CDK5RAP1 CDK5 | Y92H12BL.2 |
| exos-2 | WBGene00022232 | | Late larval arrest | NA | | EXOSC2 | NA |
| exos-3 | WBGene00010325 | | Embryonic lethal | NA | growth, nematode larval development, receptor-mediated endocytosis | EXOSC3 | NA |
| exos-1 | WBGene00012966 | | Embryonic lethal, lethal | ok807 | positive regulation of growth rate, reproduction | EXOSC1 | NA |
| dis-3 | WBGene00001001 | | Slow growth, sick, sterile progeny | ok357 | RNA binding, ribonuclease activity, sequence-specific DNA binding, reproduction | DIS3 mitotic control homolog (S. cerevisiae) | F48E8.6 |
| ZK470.2 | WBGene00022745 | | NA | ok5876 | | NA | NA |
| C05D11.10 | WBGene00015487 | NA | Embryonic lethal, lethal, slow growth | ok5298 | growth, nematode larval development, positive regulation of growth rate, reproduction | NA | NA |
| W08A12.2 | WBGene00021079 | NA | NA | NA | | NA | NA |
| F10E9.4 | WBGene00017356 | NA | Slow growth, larval lethal | NA | growth, nematode larval development, positive regulation of growth rate, reproduction | NA | NA |
| pot-1 | WBGene00015105 | | Organism development variant, telomere homeostasis variant | NA | | Pot1 | NA |
| brc-2 | WBGene00020316 | | Embryonic lethal, lethal, embryonic arrest | ok1629 | strand invasion, double-strand break repair, reproduction, single-stranded DNA and protein binding | Brca1 Breast Cancer type 1 susceptibility protein | NA |
| pot-3 | WBGene00007065 | | Lethal | ok1530 | | Pot1 | pot-2, mrt-1 |
| T07C12.12 | WBGene00011576 | | Embryonic lethal | NA | | RMI1, RecQ | NA |
| pot-2 | WBGene00010195 | | NA | NA | | NA | pot-3, mrt-1 |
| mrt-1 | WBGene00045237 | [20,42,56–59] | Sterile, lethal | ok758 | Nuclear excision repair, telomere maintenance via telomerase, reproduction, single-stranded DNA binding | NA | pot-2, pot-3 |
| Y92H12BL.2 | WBGene00022363 | NA | NA | NA | Iron-sulfur cluster binding | CDKAL1, CDK5 regulatory subunit | F25B5.5 |
| F48E8.6 | WBGene00018612 | NA | NA | NA | RNA binding, ribonuclease activity | DIS3L2, DIS3 mitotic control homolog (S. cerevisiae) | dis-3 |
refers to predicted functions. Homologues and paralogues refer to human.
Figure 2. Discovery pipeline of novel OB-fold protein-coding genes.
The pipeline contains three discovery modules: SeqDIM, the Sequence alignment DIscovery Module; StrucDIM, the 3D Structure prediction Discovery Module; and FuncDIM, the Functional prediction Discovery Module.