Muhammad Farooq, Shahid Mansoor, Hui Guo, Imran Amin, Peng W Chee, M Kamran Azim, Andrew H Paterson.
Abstract
MicroRNAs (miRNAs) are small RNA molecules of 20-24 nt that have been well studied over the past decade because of their important regulatory roles in diverse cellular processes. Mature miRNA sequences are more conserved across vast phylogenetic scales than their precursors, and some are conserved within entire kingdoms; hence, their loci and functions can be predicted by homology searches. Several studies have used de novo prediction methods to identify miRNAs, but because of complex regulatory mechanisms or false-positive in silico predictions, not all predicted miRNAs are actually expressed, and computationally predicted mature transcripts sometimes differ from the expressed ones. With a complete genome sequence of Gossypium arboreum now available, it is important to annotate both the coding and non-coding regions of the genome of this cotton species, which is highly resistant to various biotic and abiotic stresses, using high-confidence transcript evidence. Here we analyzed the small RNA transcriptome of G. arboreum leaves and provide a genome annotation of miRNAs supported by evidence from miRNA/miRNA∗ transcripts. A total of 446 miRNAs clustered into 224 miRNA families were found, of which 48 families are conserved in other plants and 176 are novel. Four short RNA libraries were used to shortlist the best predictions based on high reads per million. The size, origin, copy number, and transcript depth of all miRNAs, along with their isoforms and targets, are reported. The highest gene copy number was observed for gar-miR7504, followed by gar-miR166, gar-miR8771, gar-miR156, and gar-miR7484. Altogether, 1274 target genes were found in G. arboreum; these are enriched for 216 KEGG pathways. The resultant genomic annotations are provided in UCSC BED format.
Keywords: Gossypium arboreum; bioinformatics; microRNA; next generation sequencing; transcriptome
Year: 2017 PMID: 28663752 PMCID: PMC5471329 DOI: 10.3389/fpls.2017.00969
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Summary of short RNA raw NGS data.
| | Sample1 | Sample2 | Sample3 | Sample4 |
|---|---|---|---|---|
| Total raw reads | 7,640,730 | 5,095,200 | 4,635,233 | 4,899,320 |
| Total unique clusters | 1,608,017 | 870,343 | 409,940 | 724,817 |
| Read length after trimming (nt) | 15–30 | 15–30 | 15–30 | 15–30 |
| Average quality per read | 39 | 39 | 39 | 39 |
| Max N's allowed per read | 1 | 1 | 1 | 1 |
| rRNA, tRNA, snRNA, and snoRNA (total reads count) | 460,425 | 363,352 | 90,441 | 279,084 |
| rRNA, tRNA, snRNA, and snoRNA (unique reads count) | 61,919 | 59,294 | 24,255 | 58,744 |
| Mapped to Uniprot (total reads count) | 1,071 | 849 | 290 | 1,299 |
| Mapped to Uniprot (unique reads count) | 649 | 395 | 219 | 387 |
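
The abstract states that predictions were shortlisted by reads per million (RPM) across the four libraries. As a minimal sketch of that normalization, the Python below scales a raw read count by library size using the totals from the table above. Using total raw reads as the denominator is an assumption (the authors may have normalized against filtered or mapped reads), and the raw count of 500 is a hypothetical value chosen only for illustration.

```python
# Total raw reads per library, copied from the table above.
TOTAL_RAW_READS = {
    "Sample1": 7_640_730,
    "Sample2": 5_095_200,
    "Sample3": 4_635_233,
    "Sample4": 4_899_320,
}

# Reads assigned to rRNA, tRNA, snRNA, and snoRNA (total counts), same table.
STRUCTURAL_RNA_READS = {
    "Sample1": 460_425,
    "Sample2": 363_352,
    "Sample3": 90_441,
    "Sample4": 279_084,
}


def reads_per_million(count, library_size):
    """Scale a raw read count to reads per million for one library.

    Assumption: library_size is the total raw read count; the source
    does not say which denominator the authors used.
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return count * 1_000_000 / library_size


def structural_rna_fraction(sample):
    """Fraction of a library's raw reads assigned to structural RNAs."""
    return STRUCTURAL_RNA_READS[sample] / TOTAL_RAW_READS[sample]


if __name__ == "__main__":
    hypothetical_count = 500  # illustrative raw count, not from the paper
    for sample, size in TOTAL_RAW_READS.items():
        rpm = reads_per_million(hypothetical_count, size)
        frac = structural_rna_fraction(sample)
        print(f"{sample}: RPM={rpm:.1f}, structural RNA fraction={frac:.3f}")
```

Because the libraries differ in depth, the same raw count yields a higher RPM in a smaller library, which is why a depth-normalized measure is the more comparable quantity across samples.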
MicroRNA prediction and origin statistics.
| | Conserved | Novel | Total |
|---|---|---|---|
| Total miRNA families predicted | 48 | 176 | 224 |
| Total intronic miRNAs | 8 | 11 | 19 |
| Total intergenic miRNAs | 40 | 165 | 205 |
| Total miRNA genomic origins | 188 | 258 | 446 |
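
The abstract notes that the genomic annotations are provided in UCSC BED format. As a rough illustration of what one such record looks like, the sketch below formats a single locus as a six-column BED line. BED uses a 0-based start and an exclusive end, so a 1-based inclusive start is reduced by one. The chromosome name, coordinates, and feature name in the usage example are hypothetical and are not taken from the paper's annotation files.

```python
def to_bed6(chrom, start_1based, end_1based, name, score=0, strand="+"):
    """Format a 1-based, inclusive interval as a tab-separated BED6 line.

    BED coordinates are 0-based with an exclusive end, so only the
    start shifts by one; the end value stays the same.
    """
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    if strand not in {"+", "-", "."}:
        raise ValueError("strand must be '+', '-', or '.'")
    if not 0 <= score <= 1000:
        raise ValueError("BED score must be between 0 and 1000")
    bed_start = start_1based - 1
    fields = (chrom, bed_start, end_1based, name, score, strand)
    return "\t".join(str(field) for field in fields)


if __name__ == "__main__":
    # Hypothetical 21-nt locus; all values are placeholders for illustration.
    print(to_bed6("Chr01", 1001, 1021, "example-miRNA", 0, "+"))
```

The interval length in BED is simply end minus start, so the hypothetical 21-nt locus above spans 1000 to 1021.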