Susana L Torales1, Máximo Rivarola2,3, Sergio Gonzalez2, María Virginia Inza1, María F Pomponio1, Paula Fernández2,3, Cintia V Acuña2, Noga Zelener1, Luis Fornés4, H Esteban Hopp2,5, Norma B Paniego2,3, Susana N Marcucci Poltri2.
Abstract
The endangered Cedrela balansae C.DC. (Meliaceae) is a high-value timber species with great potential for forest plantations that inhabits the tropical forests of northwestern Argentina. Research on this species is scarce because of the limited genetic and genomic information available. Here, we explored the transcriptome of C. balansae using 454 GS FLX Titanium next-generation sequencing (NGS) technology. Following de novo assembly, we identified 27,111 non-redundant unigenes longer than 200 bp and used these transcripts for downstream analysis. Functional annotation was performed by searching the 27,111 unigenes against the NR-Protein and InterProScan databases. This analysis revealed 26,977 genes with homology in at least one of the databases analyzed. Furthermore, 7,774 unigenes in 142 different active biological pathways in C. balansae were identified with the KEGG database. In silico analyses also detected 2,663 simple sequence repeat (SSR) markers. A subset of 70 SSRs related to important "stress tolerance" traits, based on functional annotation evidence, was selected for wet-lab PCR validation in C. balansae and in other Cedrela species inhabiting northwestern and northeastern Argentina (C. fissilis, C. saltensis and C. angustifolia). Successful transferability ranged between 77% and 93%, and as a result of this study 32 polymorphic functional SSRs are now available for all analyzed Cedrela species. The gene catalog and molecular markers obtained here represent a starting point for further research, which will assist genetic breeding programs in the Cedrela genus and will contribute to identifying key populations for its preservation.
Year: 2018 PMID: 30532149 PMCID: PMC6285271 DOI: 10.1371/journal.pone.0203768
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Overview of the sequencing and assembly of the C. balansae leaf transcriptome.
| Description | Statistics |
|---|---|
| Total number of raw read sequences | 212,589 |
| Mean length (bp) | 434 |
| Total number of assembled read sequences | 149,572 |
| Total number of isotigs (>200 bp) | 1,531 |
| Average length of isotigs (>200 bp) | 976.7 |
| Range of isotig length (>200 bp) | 214–9,135 |
| Total number of singletons (>200 bp) | 25,580 |
| Average length of singletons (bp) | 424.7 |
| Range of singleton length (bp) | 200–728 |
| Total number of unigenes | 27,111 |
Fig 1. Frequency distribution of isotig length.
The histograms represent the number of isotig sequences in relation to their length.
Summary of functional annotation of assembled C. balansae unigenes.
| Database | Tools | # of annotated transcripts | % of annotated transcripts |
|---|---|---|---|
| Nr Protein | BLASTx | 20,953 | 77.33 |
| InterProScan | Full interpro suite | 26,977 | 99.65 |
| GO | BLAST2GO | 19,029 | 64.00 |
| KEGG | BLAST2GO | 7,774 | 29.00 |
Fig 2. Summary of functional annotation of assembled C. balansae unigenes.
Gene Ontology terms were assigned successfully to 19,029 of the BLASTX annotated unigenes using BLAST2GO. These unigenes were classified in three main groups: biological process, cellular components and molecular functions. For biological processes, the most represented GO term was cellular process followed by metabolic process and response to stimulus. For cellular components, genes associated with cell parts and organelles were the most highly represented, while genes related to binding and catalytic activity represented the largest proportion of genes with molecular functions. Fig 3 shows more information on the functional categorization.
Fig 3. Gene ontology (GO) classification of annotated C. balansae unigenes.
Fig 4. The terpenoid biosynthesis pathway.
Colored boxes indicate the genes identified in the C. balansae transcriptome.
Polymorphic SSR primer pairs derived from C. balansae unigenes.
| Locus name | Marker ID | Motif | Primer sequence 5'-3' | Expected amplicon length (bp) | Sequence description |
|---|---|---|---|---|---|
| isotig00700c | | (gga)5 | | 196 | dehydrin 2 Vitis yeshanensis |
| isotig00766a | | (ggc)7 | | 153 | unnamed protein product Thellungiella halophila |
| isotig01103a | | (ctgc)3 | | 136 | predicted protein Populus trichocarpa |
| GR7D2IN01A9N43 | | (ggc)4 | | 262 | Predicted glycine-rich protein 2-like Vitis vinifera |
| GR7D2IN02JVDNY | | (aattt)3 | | 201 | predicted protein P. trichocarpa |
| isotig00209b | | (aga)5 | | 132 | carbonic anhydrase, putative Ricinus communis |
| isotig00797a | | (ta)5 | | 208 | Dehydration-responsive protein RD22 precursor |
| isotig00125a | | (ta)6 | | 242 | Inositol-3-phosphate synthase |
| GR7D2IN01C6ECWa | | (gac)5 | | 209 | hypothetical protein ARALYDRAFT |
| GR7D2IN01B4HGH | | (ctc)5 | | 178 | cold shock protein, putative Ricinus communis |
| GR7D2IN02HQKGO | | (acat)3 | | 251 | temperature-induced lipocalin P. tremuloides |
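The motif notation used in this table, for example (gga)5 for five tandem copies of the unit "gga", can be illustrated with a minimal Python sketch of in silico SSR detection. This is not the pipeline used in the study, whose tool and parameters are not given here. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration only.

```python
import re

# Assumed minimum number of tandem copies per unit length (illustrative only;
# the thresholds actually used in the study are not stated in this record).
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, label) for perfect tandem repeats.

    `start` is the 0-based position of the repeat and `label` uses the
    "(motif)n" notation, e.g. "(gga)5".
    """
    seq = seq.lower()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A unit of `unit_len` bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([acgt]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (e.g. "aaa" or "tata"), so each repeat is reported once.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), "(%s)%d" % (motif, copies)))
    return sorted(hits)


# Toy sequence (not from the study): five copies of "gga" with short flanks.
print(find_ssrs("ttcat" + "gga" * 5 + "tcatc"))  # [(5, '(gga)5')]
```

The sketch reports only perfect repeats. Dedicated SSR tools typically also handle compound and interrupted repeats, which it ignores.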
Results of genotyping 51 C. balansae samples with 11 SSRs.
| Locus name | Na | He | Ho | PIC | LD | Fis |
|---|---|---|---|---|---|---|
| TrCbal8 | 2 | 0.075 | 0.078 | 0.073 | 0.643 | -0.045 |
| TrCbal9 | 5 | 0.165 | 0.176 | 0.159 | 0.664 | -0.052 |
| TrCbal15 | 2 | 0.251 | 0.294 | 0.219 | 0.604 | -0.171 |
| TrCbal16 | 4 | 0.403 | 0.314 | 0.343 | 0.215 | 0.248 |
| TrCbal27 | 4 | 0.076 | 0.078 | 0.075 | 0.660 | -0.051 |
| TrCbal38 | 2 | 0.128 | 0.137 | 0.120 | 0.613 | -0.092 |
| TrCbal42 | 2 | 0.251 | 0.255 | 0.219 | 0.570 | -0.025 |
| TrCbal43 | 2 | 0.095 | 0.100 | 0.090 | 0.718 | -0.306 |
| TrCbal47 | 2 | 0.491 | 0.471 | 0.370 | 0.357 | 0.040 |
| TrCbal61 | 3 | 0.147 | 0.157 | 0.140 | 0.641 | -0.127 |
| TrCbal64 | 2 | 0.483 | 0.082 | 0.366 | 0.020 | 0.769 |
Na: number of alleles. Ho and He: observed and expected heterozygosity. PIC: polymorphism information content. LD: linkage disequilibrium. Fis: estimated inbreeding coefficient.
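For readers unfamiliar with these statistics, the following Python sketch shows the textbook definitions of He, PIC and Fis for a single locus. It assumes the simple, uncorrected estimators. The software and estimators used in the study are not stated here, so values computed this way need not reproduce the table exactly. The function names and the allele frequencies in the example are hypothetical.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    n = len(freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return expected_heterozygosity(freqs) - cross


def inbreeding_coefficient(ho, he):
    """Fis = 1 - Ho / He (simple estimator, no sample-size correction)."""
    return 1.0 - ho / he


# Hypothetical two-allele locus with equal allele frequencies.
freqs = [0.5, 0.5]
he = expected_heterozygosity(freqs)
print(he, polymorphism_information_content(freqs), inbreeding_coefficient(0.25, he))
```

Under these definitions a negative Fis indicates more heterozygotes than expected and a positive Fis indicates fewer, which is how the sign of the last column can be read.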