Nisha Joy, Srinivasan Asha, Vijayan Mallika, Eppurathu Vasudevan Soniya.
Abstract
Next generation sequencing is advantageous for the transformational development of species with limited available sequence data, as it helps to decode the genome and transcriptome. We carried out de novo sequencing using Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete owing to the lack of sufficient sequence information and the cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 223,386 contigs and 128,157 unigenes. Reports suggest that repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hairpin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated for the in silico prediction and detection of '43 pre-miRNA candidates bearing different types of SSR motifs'. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted 'pre-miRNA candidates bearing SSRs'. The abundance, type and distribution of SSR motifs studied across the hairpin miRNA precursors showed a significant bias in the position of SSRs towards the downstream end of the predicted 'pre-miRNA candidates'. The catalogue of transcripts identified, together with the demonstration of the reliable existence of SSRs in the miRNA precursors, opens future opportunities for understanding the genetic mechanism of black pepper and the likely functions of 'tandem repeats' in miRNAs.
Year: 2013 PMID: 23469176 PMCID: PMC3587635 DOI: 10.1371/journal.pone.0056694
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Bioinformatic pipeline followed for annotation of unigenes.
Figure 2. Bioinformatic pipeline followed for identification of SSR-bearing pre-miRNAs and their possible targets.
Figure 3. Summary of de novo transcriptome sequencing and assembly of black pepper.
(A) Length distribution of contigs. (B) Length distribution of unigenes. (C) Histogram showing unigene classification based on clusters of orthologous groups (COG). (D) Gene Ontology classification of unigenes. (E) KEGG functional classification of unigenes.
Figure 4. Summary of microsatellite repeats identified in the generated transcriptome.
(A) Classification of microsatellites based on different types of motifs. (B) Classification of microsatellites based on nucleotide string. (C) Characterisation of dinucleotide repeats detected in transcripts. (D) Characterisation of trinucleotide repeats detected in transcripts. (E) Characterisation of tetranucleotide repeats detected in transcripts. (F) Characterisation of pentanucleotide repeats detected in transcripts.
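The di- to pentanucleotide repeat classes summarised in Figure 4 can be illustrated with a small scan for perfect tandem repeats. The following is only a minimal sketch: the function name `find_ssrs` and the minimum repeat counts are illustrative assumptions, and it is not the tool or the thresholds used in the study, which are not stated in this record.

```python
import re

# Minimum number of tandem copies per motif length.
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, copies) tuples for perfect SSRs.

    Coordinates are 0-based with an exclusive end.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AAA" or "AGAG".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            copies = (m.end() - m.start()) // k
            hits.append((m.start(), m.end(), motif, copies))
    return sorted(hits)


example = "TTT" + "AG" * 7 + "GGG" + "TCC" * 5 + "A"
print(find_ssrs(example))
```

Because the scan reports the leftmost match, the motif it names depends on the reading frame (for example `AG` versus `GA`), so motifs would need to be normalised before counting classes such as those in panels C to F.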
List of putative ‘miRNA candidates’ identified.
| Sl no | Unigene ID | SSR motif | Predicted pre-miRNA position | MFOLD Delta G value | Direction | Predicted mature miRNA position | Mature miRNA sequence | A+U content |
| 1 | 2414 | TCC | 184–303 | 41.8 | Plus | 230–251 | AGCGCGAGUUCGCCUUCGCCGU | 31.82 |
| 2 | 2535 | CGG | 8–127 | 44.29 | Plus | 67–88 | UGCAGCGGUUUGGGAGGAGGGG | 31.82 |
| 3 | 3169 | AG | 218–337 | 33.2 | Minus | 282–302 | AUGCCGUCUUUCUGGGGUGAG | 42.86 |
| 4 | 11980 | AG,GGT | 37–156 | 48.18 | Minus | 114–135 | CCUCCUUCUUCGCCUUCCCCCU | 36.36 |
| 5 | 12508 | AG,TG | 78–197 | 35.3 | Minus | 88–108 | UCUUGGGGGUUGGGGUGAGAG | 38.10 |
| 6 | 25668 | TCG | 126–245 | 40.4 | Plus | 172–193 | AAGGCUGUGAUGGACGUUUUUG | 54.55 |
| 7 | 26112 | CCT,AGG | 195–314 | 39.6 | Minus | 246–267 | GGCACCGACGCCAGGGUGUUGU | 31.82 |
| 8 | 30088 | CTC | 34–153 | 52.4 | Plus | 111–132 | UUGUCGGAGGUGUCGCCAGCGU | 36.36 |
| 9 | 31243 | CCG | 240–359 | 37.0 | Plus | 272–292 | GGGAGCUUGUUGGUGAUGGUC | 42.86 |
| 10 | 32324 | GGC | 19–138 | 50.7 | Minus | 47–68 | CUCUAGCACCUUCUCCACUCCG | 40.91 |
| 11 | 45182 | ACA | 229–348 | 32.6 | Minus | 283–304 | CAAUACUAUGAGCUUGAAUUGG | 63.64 |
| 12 | 49665 | AAG | 147–266 | 37.9 | Plus | 243–263 | AUGGGGACAACGCAGGUGUUG | 42.86 |
| 13 | 49856a | AT | 552–671 | 47.67 | Minus | 580–600 | CAAAUGAACAAUAUAAUUACG | 76.19 |
| 14 | 49856b | AT | 51–170 | 37.18 | Minus | 56–76 | UCCUCCACCGUGUAGCUCAUG | 42.86 |
| 15 | 49856c | AT | 40–159 | 34.08 | Minus | 139–159 | GCUGCCUGUGGUAGAAUGCGU | 42.86 |
| 16 | 50113a | CCT | 270–389 | 41.3 | Plus | 307–328 | AGGAUGAGUUGAAGAAGUUGGU | 59.10 |
| 17 | 50113b | CCT | 248–367 | 40.5 | Minus | 324–345 | CGUCGGUGUCGGCCACCACCAA | 31.82 |
| 18 | 50626 | GAG | 77–196 | 51.1 | Minus | 113–134 | CCAUCGCUGAGCUCGUAGUAGU | 45.46 |
| 19 | 51352 | CCG | 28–147 | 42.34 | Plus | 48–69 | UUGCCUCUCCUCUCCCAAAGAC | 45.46 |
| 20 | 54811 | ACC,CCG | 29–148 | 53.86 | Minus | 111–132 | UCACCCCAGAGGUCCGACUCGU | 36.36 |
| 21 | 64214a | CCA | 59–178 | 34.04 | Minus | 106–127 | GUAGCCUUUUCGUGUCUGUGUC | 50.00 |
| 22 | 64214b | CCA | 28–147 | 38.3 | Minus | 85–106 | CAGUCGAACAUUCGGGCAAGCG | 40.91 |
| 23 | 65213 | AG | 42–161 | 37.65 | Plus | 84–104 | UCUCUCUUUGUAAGUUUUCUG | 66.67 |
| 24 | 72663a | AAG | 70–189 | 36.1 | Minus | 96–117 | AGAUUGAGCUCCUCUUCUUGGU | 54.55 |
| 25 | 72663b | AAG | 55–174 | 38.3 | Minus | 136–157 | CCAUGAAUCCAGACUCGGCUGC | 40.91 |
| 26 | 74589 | AGG | 122–241 | 36.2 | Plus | 184–204 | GUUGGGCGGGGAGGAGAAGAU | 38.10 |
| 27 | 88484 | AG | 50–169 | 34.5 | Plus | 131–151 | AUCAUAUCUACUCGCUUCAAA | 66.67 |
| 28 | 90461 | AAG | 130–249 | 46.07 | Minus | 138–158 | ACUCCUUGGUGGUAGACGCCU | 42.86 |
| 29 | 93292 | AGG | 169–288 | 32.74 | Plus | 178–198 | UGAUGUUGGCGAGGUGAUGAA | 52.38 |
| 30 | 94407 | TC | 216–335 | 38.1 | Minus | 245–265 | CGACCAUUGCAUGACCAGGGC | 38.10 |
| 31 | 94456 | GCA | 0–119 | 50.5 | Plus | 28–49 | GCAUUUUGAUGGCCGAUAUGGU | 54.55 |
| 32 | 94870 | GGA | 0–119 | 41.8 | Plus | 45–66 | UGGGAAGCAGAGGUAUGGCCGC | 36.36 |
| 33 | 95728 | CTC | 18–137 | 32.77 | Plus | 57–77 | CCUUUCUUGCCGGUUGGAGGA | 42.86 |
| 34 | 98341 | TC | 29–148 | 35.8 | Minus | 126–147 | GAGAGAGAUGGGGGAUGUCACC | 40.91 |
| 35 | 103545a | CGA | 61–180 | 40.6 | Plus | 84–105 | CUGCUCGUCGUCGGAUGGCGAA | 36.36 |
| 36 | 103545b | CGA | 89–208 | 42.9 | Plus | 112–133 | GAAGUUAACUUUUCAACGCCGU | 59.10 |
| 37 | 106552 | ACG | 26–145 | 45.66 | Plus | 69–90 | GCUUCCACUUUGUCAUCCCCCG | 40.91 |
| 38 | 111157 | TCG | 0–119 | 47.0 | Plus | 37–58 | GAAGAACUCGUCGUCACCGUCG | 40.91 |
| 39 | 118746a | GAG | 112–231 | 44.2 | Plus | 158–179 | CUCGCCGAUUUGAGCGGCAGCG | 31.82 |
| 40 | 118746b | GAG | 86–205 | 45.6 | Minus | 102–123 | UCCUCAUCCUGCGGCUGCUCUU | 40.91 |
| 41 | 120734a | AG | 30–149 | 47.8 | Plus | 51–71 | UCGAGAUGGAGGGAGGUCUGG | 38.10 |
| 42 | 120734b | AG | 133–252 | 50.6 | Minus | 189–209 | UCUCCCUCUCUCUCUACUUCU | 52.39 |
| 43 | 122117 | TTG | 233–352 | 32.0 | Plus | 291–312 | GGGGAUCCGCCAUGAAAGCUCC | 36.36 |
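The "A+U content" column and the position columns of this table can be checked with a few lines of code. This is a minimal sketch; the helper names `au_content` and `span_length` are hypothetical, and the positions are assumed to be inclusive on both ends.

```python
def au_content(mature_seq):
    """Percentage of A and U bases in an RNA sequence, rounded to 2 decimals."""
    seq = mature_seq.upper().replace("T", "U")  # tolerate DNA-style input
    return round(100.0 * (seq.count("A") + seq.count("U")) / len(seq), 2)


def span_length(position):
    """Length of a 'start–end' span, assuming inclusive coordinates."""
    start, end = (int(x) for x in position.replace("–", "-").split("-"))
    return end - start + 1


# Row 1 of the table: unigene 2414.
print(au_content("AGCGCGAGUUCGCCUUCGCCGU"))  # 31.82
print(span_length("230–251"))                # 22
print(span_length("184–303"))                # 120
```

Under the inclusive-coordinate assumption, the mature miRNA span should equal the length of the listed sequence (21 or 22 nt) and should fall inside the 120-nt predicted pre-miRNA window.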
List of all the potential targets for the ‘miRNA candidates’.
| Sl no | Unigene ID | Target Gene ID | MFE values (kcal/mol) | Target gene description |
| 1 | 74589 | | –41.2 | |
| 2 | 94407 | | –35.6 | |
| 3 | 120734 | | –36.5 | |
| | 120734 | unigene124275 | –9.6 | |
| 4 | 11980 | unigene34919 | –24.9 | |
| | 11980 | unigene19725 | –24.9 | |
| 5 | 26112 | | –54.3 | NA |
| 6 | 30088 | | –47.3 | NA |
| 7 | 50113 | unigene50112 | –24.3 | NA |
| | 50113 | | –33 | |
| 8 | 50113 | | –53.5 | NA |
| | | | –51.4 | |
| 9 | 50626 | | –41.6 | |
| 10 | 72663 | | –44.9 | |
| | 72663 | | –33.9 | |
| 11 | 94870 | | –45.1 | |
| | | | –45.1 | |
| | | unigene14227 | –25.8 | NA |
| 12 | 103545 | | –33 | NA |
| 13 | 103545 | unigene103544 | –24.7 | NA |
| 14 | 106552 | | –41.5 | |
| | | unigene16846 | –22.8 | |
| 15 | 122117 | unigene122119 | –26.8 | |
Nr-annotation;
Swissprot-annotation;
most probable targets.
A comparison of high throughput sequencing data from recently reported root transcriptome with our generated leaf transcriptome of black pepper.
| | Leaf transcriptome | Root transcriptome |
| Type of sequencing | Illumina HiSeq™ | SOLiD platform |
| Assembly of transcripts | Trinity | multiple-k method |
| Total number of reads | 55072366 (4.9 Gbp) | 13300000 (665 Mbp) |
| Number of contigs | 223386 | 22363 |
| Number of unigenes | 128157 | 10338 |
| Unigene total size (bp) | 28740830 | 1787600 |
| Predicted proteins | 73507 | 4472 |
| RPKM of most expressed transcript | 29348.92 | 68250 |
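The RPKM values in the last row follow the usual definition of reads per kilobase of transcript per million mapped reads. The sketch below uses that standard formula; the function name `rpkm` and the example numbers are illustrative assumptions, and this record does not state exactly how the authors computed their values.

```python
def rpkm(reads_on_transcript, total_mapped_reads, transcript_length_bp):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = 1e9 * C / (N * L), where C is the number of reads mapped to the
    transcript, N the total number of mapped reads, and L the transcript
    length in base pairs.
    """
    return 1e9 * reads_on_transcript / (total_mapped_reads * transcript_length_bp)


# Hypothetical numbers: 500 reads on a 1000-bp transcript, 1 million mapped reads.
print(rpkm(500, 1_000_000, 1000))  # 500.0
```

Since RPKM normalises by both library size and transcript length, values from libraries of very different depth and platform, such as the two compared in this table, are best read as rough indicators rather than directly comparable measurements.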
Figure 5. Heat map showing summary of changes in gene expression based on RPKM values.
Figure 6. Relative position of microsatellite motifs with respect to potential ‘pre-miRNA candidates’.
Figure 7. Pie chart showing the relative number of SSR-bearing ‘pre-miRNAs’ among different taxa (Viridiplantae, Viruses, Arthropoda, Nematoda, Platyhelminthes, Urochordata, Vertebrata, Mycetozoa and Protistae).
Figure 8. Pie chart showing the relative number of SSR-bearing ‘pre-miRNAs’ among different species of Viridiplantae.
Figure 9. A comparative study between A. thaliana and P. nigrum on SSRs occurring within the ‘pre-miRNAs’.