K H Dhanyalakshmi, Mahantesha B N Naika, R S Sajeevan, Oommen K Mathew, K Mohamed Shafi, Ramanathan Sowdhamini, Karaba N Nataraja.
Abstract
Modern sequencing technologies generate large volumes of information at the transcriptome and genome levels. Translating this information into biological meaning lags far behind, and as a result a significant portion of the proteins discovered remain proteins of unknown function (PUFs). Attempts to uncover the functional significance of PUFs are limited by the lack of easy, high-throughput functional annotation tools. Here, we report an approach to assign putative functions to PUFs identified in the transcriptome of mulberry, a perennial tree commonly cultivated as the host of the silkworm. We used mulberry PUFs generated from leaf tissues exposed to drought stress at the whole-plant level. A sequence- and structure-based computational analysis predicted the probable functions of the PUFs. For rapid and easy annotation of PUFs, we developed an automated pipeline that integrates diverse bioinformatics tools, designated the PUFs Annotation Server (PUFAS), which also provides a web service API (Application Programming Interface) for large-scale analysis, up to a whole genome. Expression analysis of three selected PUFs annotated by the pipeline revealed that the genes respond to abiotic stress, suggesting a potential role in stress acclimation pathways. The automated pipeline developed here could be extended to assign functions to PUFs from any organism. The PUFAS web server is available at http://caps.ncbs.res.in/pufas/ and the web service is accessible at http://capservices.ncbs.res.in/help/pufas.
Year: 2016 PMID: 26982336 PMCID: PMC4794119 DOI: 10.1371/journal.pone.0151323
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Schematic representation of the different steps, starting from library generation by 454 sequencing, followed by assembly, annotation, and the other analyses performed.
Fig 2. Distribution of selected GO terms based on NCBI-BLAST analysis of the mulberry transcriptome generated from drought-stressed leaf tissue.
Annotated PUFs using the computational tools.
| Name of the contig | Annotation from NCBI | Predicted function |
|---|---|---|
| contig00286 | Unknown | Thioredoxin like protein |
| contig00529 | Unknown | ACT domain containing protein |
| contig01194 | Unknown | Universal stress protein like |
| contig01391 | Unknown | Ankyrin repeat protein |
| contig01796 | Unknown | Transferase like |
| contig01963 | Unknown | CBS domain containing protein |
| contig02017 | Unknown | GroEL-like chaperone |
| contig02101 | Unknown | Oxidoreductase like |
| contig02328 | Unknown | RNA binding protein |
| contig02670 | Unknown | Aldo-keto reductase like |
| contig04385 | Unknown | Major intrinsic protein |
| contig04437 | Unknown | Transketolase like |
| contig04699 | Unknown | GDP-fucose protein O-fucosyltransferase |
| contig04820 | Unknown | Aldehyde reductase |
| contig04823 | Unknown | GDP-fucose protein O-fucosyltransferase |
| contig04921 | Unknown | Ankyrin repeat protein |
| contig05180 | Unknown | Dehydrogenase like |
| contig05224 | Unknown | Protein phosphatase like |
| contig05320 | Unknown | Dehydrogenase like |
| contig05347 | Unknown | Ferritin like protein |
| contig05421 | Unknown | Dehydrogenase/reductase |
| contig05454 | Unknown | Late embryogenesis abundant protein like |
| contig05496 | Unknown | Acyl esterase like |
| contig05505 | Unknown | Metallophosphatase |
| contig05537 | Unknown | Ubiquitin like protein |
| contig05592 | Unknown | Dehydrogenase like |
| contig05650 | Unknown | ATP synthase |
| contig05740 | Unknown | Transferase like |
| contig05864 | Unknown | Hydrolase like |
| contig05991 | Unknown | TPR protein |
| contig06030 | Unknown | Sulphite exporter like |
| contig06333 | Unknown | Chlorophyll a-b binding protein |
| contig06433 | Unknown | Chlorophyll a-b binding protein |
| contig06595 | Unknown | Triose phosphate isomerase |
| contig06596 | Unknown | Rab5 like protein |
| contig06640 | Unknown | Phosphoglucomutase |
| contig06735 | Unknown | PLATZ like transcription factor |
| contig06750 | Unknown | Elongation factor protein |
| contig06773 | Unknown | Protein kinase like |
| contig06932 | Unknown | RNA binding protein like |
| contig07145 | Unknown | Rab like protein |
| contig07540 | Unknown | Dormancy/auxin associated protein |
| contig07570 | Unknown | Ras related protein |
| contig07599 | Unknown | Cytochrome c oxidase |
| contig07639 | Unknown | Proteasome regulatory complex subunit like |
| contig08042 | Unknown | Aldolase like |
| contig08184 | Unknown | Ubiquitin conjugating enzyme like |
| contig08330 | Unknown | Rab like protein |
| contig08474 | Unknown | Ribosomal protein like |
| contig08640 | Unknown | Hydrolase like |
| contig08772 | Unknown | Metal transport protein |
| contig08856 | Unknown | Phloem protein like |
| contig08939 | Unknown | Ubiquinol-cytochrome c reductase complex subunit like |
| contig09315 | Unknown | Dehalogenase like |
| contig09355 | Unknown | Elongation factor like protein |
| contig09421 | Unknown | Ribosomal protein like |
| contig00062 | Hypothetical | Metallothionein like |
| contig00355 | Hypothetical | Pepsin like |
| contig00754 | Hypothetical | Ca binding epidermal growth factor like protein |
| contig01002 | Hypothetical | Late embryogenesis abundant protein like |
| contig01413 | Hypothetical | Heavy metal binding protein like |
| contig01734 | Hypothetical | Leucine rich repeat protein |
| contig01852 | Hypothetical | Esterase like |
| contig01866 | Hypothetical | KH RNA binding protein |
| contig01896 | Hypothetical | Protein kinase like |
| contig01986 | Hypothetical | Armadillo repeat protein |
| contig02310 | Hypothetical | Perforin like |
| contig02336 | Hypothetical | Ribosomal protein like |
| contig02443 | Hypothetical | SKP1 like protein |
| contig02598 | Hypothetical | Dehydrogenase like |
| contig02665 | Hypothetical | Hydrolase like |
| contig02872 | Hypothetical | Plant retinoblastoma associated protein |
| contig03421 | Hypothetical | Transport protein like |
| contig03453 | Hypothetical | Phosphoglucomutase |
| contig04442 | Hypothetical | WD repeat protein |
| contig04444 | Hypothetical | Acid protease like |
| contig04469 | Hypothetical | SART 1 family protein |
| contig04481 | Hypothetical | Tubulin associated protein |
| contig04536 | Hypothetical | Myb like protein |
| contig04717 | Hypothetical | Transporter like |
| contig04834 | Hypothetical | Vacuolar sorting associated protein |
| contig04885 | Hypothetical | PH domain containing protein |
| contig04956 | Hypothetical | Transferase like |
| contig04977 | Hypothetical | Phosphoprotein like |
| contig00200 | Uncharacterized | DNA binding protein |
| contig00467 | Uncharacterized | Late embryogenesis abundant protein like |
| contig00638 | Uncharacterized | Transferase like |
| contig00917 | Uncharacterized | Ribosomal protein like |
| contig01190 | Uncharacterized | Zinc finger protein |
| contig01205 | Uncharacterized | Transferase like |
| contig01222 | Uncharacterized | Pseudouridine synthase like |
| contig01234 | Uncharacterized | Transferase like |
| contig01406 | Uncharacterized | Peroxidase like |
| contig01408 | Uncharacterized | Peroxidase like |
| contig01644 | Uncharacterized | Hydrolase like |
| contig01646 | Uncharacterized | Transferase like |
| contig01671 | Uncharacterized | RNA binding protein |
| contig01710 | Uncharacterized | Hydrolase like |
| contig01711 | Uncharacterized | Hydrolase like |
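The table above lends itself to simple tallies, for example how many contigs fall under each NCBI label or which contigs share a predicted function. The following is a minimal sketch, not part of PUFAS: it parses a short excerpt of the rows above, and the helper name `parse_rows` and the choice of excerpt are illustrative assumptions.

```python
from collections import Counter

# A few rows copied from the table above; the full table could be pasted in the same form.
TABLE_EXCERPT = """\
| contig00286 | Unknown | Thioredoxin like protein |
| contig01194 | Unknown | Universal stress protein like |
| contig06735 | Unknown | PLATZ like transcription factor |
| contig06932 | Unknown | RNA binding protein like |
| contig00062 | Hypothetical | Metallothionein like |
| contig01866 | Hypothetical | KH RNA binding protein |
| contig00200 | Uncharacterized | DNA binding protein |
| contig01671 | Uncharacterized | RNA binding protein |
"""


def parse_rows(text):
    """Hypothetical helper: split pipe-delimited rows into (contig, label, function)."""
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 3:  # skip anything that is not a three-column data row
            rows.append(tuple(cells))
    return rows


rows = parse_rows(TABLE_EXCERPT)

# Number of contigs per original NCBI label.
by_label = Counter(label for _, label, _ in rows)

# Contigs whose predicted function mentions RNA binding.
rna_binding = [contig for contig, _, function in rows if "RNA binding" in function]

print(by_label)
print(rna_binding)
```

Substring matching on the predicted function is deliberately loose here, since the table mixes forms such as "RNA binding protein" and "RNA binding protein like".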
Fig 3. Web interface of PUFAS.
a) Option to input contigs identified from a genome by next-generation sequencing, b) Output of Pfam and CDD, c) Gene identification and its elements, d) Amino acid sequence of a given contig.
Sequence and structure based annotation of selected PUFs.
| Contig ID | PUFs | CDD (sequence based) | Pfam (sequence based) | 3DPSSM (structure based) | PHYRE (structure based) | GenTHREADER (structure based) | Annotation |
|---|---|---|---|---|---|---|---|
| Contig 01194 | PUF3 | Universal stress protein family | Universal stress protein family | ETFP adenine nucleotide-binding domain-like | Adenine nucleotide alpha hydrolase-like | Adenine nucleotide alpha hydrolase-like | |
| Contig 06735 | PUF39 | PLATZ1 transcription factor | PLATZ1 transcription factor | Immunoglobulin-like beta-sandwich | Gene regulation, hydrolase | DNA clamp | |
| Contig 06932 | PUF42 | RNA recognition motif | RNA recognition motif | No Prediction | RNA binding protein | Ferredoxin-like | |
Fig 4. Phylogenetic analysis of a selected PUF.
a) Multiple sequence alignment of mulberry PUF39 with homologous genes from other plant genomes. b) Unrooted phylogenetic tree of PUF39. The GenBank accession numbers of the sequences are Ricinus communis (XP_002510750.1), Populus trichocarpa (XP_002301890.1), Malus domestica (XP_008381650.1), Theobroma cacao (XP_007018633.1), Brassica rapa (XP_009149581.1), Arabidopsis lyrata subsp. lyrata (XP_002890421.1), Arabidopsis thaliana (NP_001117322.1), Solanum tuberosum (XP_006352862.1), Phaseolus vulgaris (XP_007131503.1) and Glycine max (XP_003540084.1).
Fig 5. Expression analysis of selected PUFs by qRT-PCR in leaf tissue of the mulberry genotype Dudia white.
Relative transcript levels of MaUSP-like (a), MaPLATZ1-like (b), and MaRRM1-like (c) genes under salinity and oxidative stresses. Total RNA was isolated from the leaf tissues of mulberry twigs exposed to 250 mM NaCl (salinity stress) and 15 μM methyl viologen (oxidative stress) at 6, 12, 24, and 48 hours after stress imposition. Transcript levels were normalized to the expression of the elongation factor gene. The data shown are mean ± SE from three independent experiments. Asterisks indicate a significant difference between control and treatments at p = 0.05.
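Normalizing a target transcript to a reference gene and reporting mean ± SE over three experiments can be sketched as below. The caption does not state which quantification method was used, so the 2^-ΔΔCt calculation here is an assumption, and all Ct values are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method (assumed, not from the source)."""
    delta_treated = ct_target_treated - ct_ref_treated  # normalize to reference gene
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


def mean_and_se(values):
    """Mean and standard error (sample SD divided by sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical Ct values for three independent experiments:
# (target treated, reference treated, target control, reference control)
replicates = [
    (24.0, 18.0, 26.0, 18.0),
    (24.5, 18.5, 26.5, 18.5),
    (25.0, 18.0, 27.0, 19.0),
]

folds = [relative_expression(*ct_values) for ct_values in replicates]
avg, se = mean_and_se(folds)
print(f"fold change = {avg:.2f} ± {se:.2f}")
```

In this sketch the reference gene plays the role that the elongation factor gene plays in the figure; a fold change above 1 indicates higher transcript abundance under stress than in the control.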
Tools used in the study.
| Sl.No | Tool | Purpose | URL |
|---|---|---|---|
| I. Transcriptome assembly | |||
| 1 | Newbler | - | |
| II. Annotation tool | |||
| 1 | NCBI BLAST | Preliminary annotation of transcripts | |
| III. Gene prediction | |||
| 1 | FGENESH | Prediction of open reading frame | |
| 2 | AUGUSTUS | Prediction of open reading frame | |
| IV. Sequence analysis | |||
| 1 | CDD | Identification of conserved domain | |
| 2 | Pfam | Protein family classification | |
| V. Fold analysis | |||
| 1 | PSIPRED | Fold recognition | |
| 2 | PHYRE2 | Fold recognition | |
| 3 | GenTHREADER | Fold recognition | |
| VI. Gene Ontology | |||
| 1 | Gene Ontology | Annotation | |
| VII. Phylogenetic analysis | |||
| 1 | MEGA5 | ||
| VIII. Protein-protein interaction | |||
| 1 | STRING | To find out interacting partners | |
| IX. Expression analysis | |||
| 1 | qRT-PCR | To analyze the functional significance | - |