Ramasamy Yasodha1, Ramesh Vasudeva2, Swathi Balakrishnan3, Ambothi Rathnasamy Sakthi1, Nicodemus Abel1, Nagarajan Binai1, Balaji Rajashekar4,5, Vijay Kumar Waman Bachpai1, Chandrasekhara Pillai3, Suma Arun Dev3.
Abstract
Teak (Tectona grandis L. f.) is a benchmark tropical hardwood valued for its durability, strength and visual appeal. Natural teak populations harbour a variety of characteristics that determine their economic, ecological and environmental importance. Sequencing the whole nuclear genome of teak provides a platform for functional analyses and for developing genomic tools for applied tree improvement. A draft genome of 317 Mb was assembled at 151× coverage, and 36,172 protein-coding genes were annotated. Approximately 11.18% of the genome was repetitive. Microsatellites, or simple sequence repeats (SSRs), are among the most informative markers for genotyping, genetics and applied breeding. We identified 182,712 SSRs at the whole-genome level, of which 170,574 were perfect SSRs; 16,252 perfect SSRs showed in silico polymorphisms across six genotypes, suggesting their promise for genetic conservation and tree improvement programmes. The genomic SSR markers developed in this study have high potential for advancing the conservation and management of teak genetic resources. Phylogenetic studies confirmed the taxonomic position of the genus Tectona within the family Lamiaceae. Divergence time estimation placed the origin of the genus Tectona in the Miocene, around 21.4508 million years ago.
Year: 2018 PMID: 29800113 PMCID: PMC6105116 DOI: 10.1093/dnares/dsy013
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Figure 1. Cross section of teak wood showing major features: (a) pith; (b) heartwood; (c) sapwood; (d) growth ring; and (e) medullary rays.
Details of plant materials used in this study
| Accession ID | Sample code | Name of the provenance | State | Latitude | Longitude | Altitude (m) | Rainfall (mm) |
|---|---|---|---|---|---|---|---|
| 1 | NR | Nilambur | Kerala | 11° 17' N | 76° 19' E | 49 | 2,600 |
| 2 | AU | Arienkavu | Kerala | 8° 96’ N | 77° 14’ E | 240 | 2,600 |
| 3 | WR | Walayar | Kerala | 10° 52' N | 76° 46' E | 216 | 1,500 |
| 4 | DI | Dandeli | Karnataka | 15° 07' N | 74° 35' E | 510 | 2,200 |
| 5 | HI | Hojai | Assam | 26° 39’ N | 92° 36’ E | 69 | 1,750 |
| 6 | TP | Topslip | Tamil Nadu | 10° 26' N | 76° 50' E | 640 | 1,350 |
Details on 10 SSR markers used for amplification of teak germplasm
| SSR code | SSR motif | Primers (5'-3') | Annealing temperature (°C) | Number of alleles | Product size (bp) |
|---|---|---|---|---|---|
| IFT83 | (AG)21 | | 54 | 5 | 312-393 |
| IFT821 | (AC)24 | | 53.9 | 3 | 331-350 |
| IFT63 | (ATG)12 | | 54 | 3 | 250-275 |
| IFT168 | (TCT)12 | | 55 | 4 | 287-306 |
| IFT479b | (GGA)11 | | 55 | 3 | 331-343 |
| IFT382 | (TATG)7 | | 56 | 3 | 337-350 |
| IFT 28 | (AAAG)6 | | 53.6 | 3 | 381-393 |
| IFT14 | (TTCT)9 | | 54 | 5 | 265-278 |
| IFT3 | (GAAAG)5 | | 54 | 1 | 330 |
| IFT777 | (TCAGG)6 | | 56 | 4 | 312-343 |
Raw data statistics of Illumina PE, MP and Nanopore reads of teak genome
| Platform | Chemistry | Number of raw reads | Total bases of raw reads (bp) | Number of processed reads | Total bases after processing (bp) | Coverage (×) |
|---|---|---|---|---|---|---|
| Illumina HiSeq | PE (150× 2) | 137,231,716 | 41,443,978,232 | 128,115,515 | 37,507,019,850 | 109 |
| Illumina HiSeq ( | MP (150× 2) | 25,436,869 | 7,681,934,438 | 10,776,772 | 2,408,197,618 | 20 |
| Illumina HiSeq ( | MP (150× 2) | 19,268,378 | 5,819,050,156 | 8,470,072 | 1,898,380,817 | 15 |
| Nanopore | Long read (5-1,345,484) | 782,591 | 2,685,280,348 | 782,591 | 2,685,280,348 | 7.06 |
| Total coverage | | | | | | 151 |
Draft genome assembly statistics of teak
| Parameters | Contig | Scaffold | Gap closer | Draft genome |
|---|---|---|---|---|
| Contigs generated | 3,500 | 3,004 | 3,004 | 2,993 |
| Maximum contig length (bp) | 1,718,119 | 1,718,322 | 1,718,606 | 1,718,606 |
| Minimum contig length (bp) | 332 | 445 | 445 | 1,100 |
| Average contig length (bp) | 90,394 | 105,594 | 105,712 | 106,098 |
| Total contigs length (bp) | 316,377,938 | 317,203,315 | 317,558,121 | 317,551,182 |
| Total number of non-ATGC characters | 2,084,563 | 2,374,171 | 808,446 | 808,446 |
| Percentage of non-ATGC characters | 0.659 | 0.748 | 0.255 | 0.255 |
| Contigs ≥ 500 bp | 3,484 | 3,003 | 3,003 | 2,993 |
| Contigs ≥ 1 Kbp | 3,478 | 2,993 | 2,993 | 2,993 |
| Contigs ≥ 10 Kbp | 2,431 | 2,069 | 2,069 | 2,069 |
| Contigs ≥ 1 Mbp | 8 | 18 | 18 | 18 |
| N50 value (bp) | 277,872 | 357,576 | 357,576 | 357,576 |
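
The N50 row above follows the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal Python sketch of that definition is given below; the function name `n50` and the toy lengths are illustrative and are not taken from the study, whose assembly tools are not named in this record.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Walk from the longest sequence downwards, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of
        # the assembly is the N50.
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not teak data): total = 54, half = 27,
# and 10 + 9 + 8 = 27 reaches the halfway point.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

Applied to the per-scaffold lengths of the draft genome, the same function would be expected to return the tabulated 357,576 bp, provided the authors used this standard definition.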
Overview of repeat elements in teak genome
| Type | Number of elements | Length occupied (bp) | Percentage in genome |
|---|---|---|---|
| Retroelements | 6,976 | 5,153,927 | 1.62 |
| SINEs | 1 | 46 | 0.00 |
| LINEs | 457 | 122,681 | 0.04 |
| L1/CIN4 | 457 | 122,681 | 0.04 |
| LTR elements | 6,518 | 5,031,200 | 1.58 |
| Ty1/Copia | 2,687 | 2,212,758 | 0.70 |
| Gypsy/DIRS1 | 2,499 | 2,564,424 | 0.81 |
| DNA transposons | 4,431 | 749,766 | 0.24 |
| hobo-Activator | 647 | 139,917 | 0.04 |
| Tc1-IS630-Pogo | 1,298 | 216,636 | 0.07 |
| Tourist/Harbinger | 285 | 70,800 | 0.02 |
| Unclassified | 242 | 63,153 | 0.02 |
| Total interspersed repeats | − | 5,966,846 | 1.88 |
| Small RNA | 158 | 104,298 | 0.03 |
| Satellites | 1 | 54 | 0.00 |
| Simple repeats | 253,260 | 10,655,131 | 3.36 |
| Low complexity | 46,994 | 2,328,770 | 0.73 |
| Total bases masked | − | 19,046,577 | 6.00 |
Figure 2. Characterization of teak genome sequence by gene ontology categories: (a) Biological process; (b) Molecular function; (c) Cellular component.
Characteristics of six types of SSRs in teak genome
| SSR type | Total counts | Total length (bp) | Average length (bp) | Frequency (loci/Mb) | Density (bp/Mb) |
|---|---|---|---|---|---|
| cd | 6,309 | 248,207 | 39.34 | 19.87 | 781.63 |
| cx | 227 | 13,835 | 60.95 | 0.71 | 43.57 |
| icd | 4,049 | 170,803 | 42.18 | 12.75 | 537.88 |
| icx | 670 | 43,668 | 65.18 | 2.11 | 137.51 |
| ip | 883 | 34,519 | 39.09 | 2.78 | 108.7 |
| p | 170,574 | 3,006,200 | 17.62 | 537.15 | 9,466.82 |
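
The derived columns in this table can be reproduced from the counts and total lengths. The sketch below assumes that frequency and density are normalised by the draft genome length of 317,551,182 bp reported in the assembly statistics, with 1 Mb taken as 10^6 bp; that choice of denominator is an inference from the numbers, not a statement in the source. The names `per_mb` and `ssr_summary` are illustrative.

```python
# Assumed denominator: total length of the draft genome (bp).
GENOME_SIZE_BP = 317_551_182


def per_mb(value, genome_size_bp=GENOME_SIZE_BP):
    """Normalise a count or a length to one megabase of genome."""
    return value / (genome_size_bp / 1_000_000)


def ssr_summary(count, total_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Derive average length, frequency and density for one SSR class."""
    return {
        "average_length_bp": total_length_bp / count,
        "frequency_loci_per_mb": per_mb(count, genome_size_bp),
        "density_bp_per_mb": per_mb(total_length_bp, genome_size_bp),
    }


# Perfect SSRs (row "p"): 170,574 loci spanning 3,006,200 bp.
perfect = ssr_summary(170_574, 3_006_200)
for name, value in perfect.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the perfect-SSR row rounds to the tabulated 17.62 bp, 537.15 loci/Mb and 9,466.82 bp/Mb.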
The number, length, frequency and density of six different types of SSRs
| Nucleotide | Total counts | Total length (bp) | Average length (bp) | Frequency (loci/Mb) | Density (bp/Mb) | SSRs in the whole genome (%) |
|---|---|---|---|---|---|---|
| Mononucleotide | 88,766 | 1,321,753 | 14.89 | 279.53 | 4,162.3 | 45.32 |
| Dinucleotide | 81,215 | 1,664,278 | 20.49 | 255.75 | 5,241 | 41.4 |
| Trinucleotide | 14,654 | 286,074 | 19.52 | 46.15 | 900.88 | 7.48 |
| Tetranucleotide | 8,086 | 146,728 | 18.15 | 25.46 | 462.06 | 4.13 |
| Pentanucleotide | 1,967 | 42,960 | 21.84 | 6.19 | 135.29 | 1 |
| Hexanucleotide | 1,161 | 29,724 | 25.6 | 3.66 | 93.604 | 0.59 |
cd, compound; cx, complex; icd, interrupted compound; icx, interrupted complex; ip, imperfect; p, perfect.
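
For illustration, a perfect SSR (an uninterrupted tandem repeat of a 1 to 6 bp motif) can be located with a back-referencing regular expression. The sketch below is not the tool used in the study, which this record does not name; the minimum repeat counts in `MIN_REPEATS` and the function name `find_perfect_ssrs` are hypothetical choices made only for the example.

```python
import re

# Hypothetical minimum repeat counts per motif length (not from the source).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif copy, then at least (min_rep - 1) further exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" reported as a dinucleotide instead of poly-A.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: an (AG)8 dinucleotide repeat followed by an (A)12 run.
toy = "TTC" + "AG" * 8 + "CCT" + "A" * 12 + "GC"
ssrs = find_perfect_ssrs(toy)
```

Because the scan is leftmost-first, the reported motif depends on the phase at which a repeat is first encountered (a run preceded by G may be reported as GA rather than AG). The compound, complex and interrupted classes listed above need additional merging rules that this sketch does not attempt.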
Read Statistics of the teak samples sequenced at low depth coverage for identification of polymorphic SSRs
| Accession ID | Sample code | Total raw reads | Total processed reads | Total reference covered (%) | Coverage (×) |
|---|---|---|---|---|---|
| 1 | NR | 22,503,220 | 21,315,714 | 90.1 | 9.6 |
| 3 | WR | 28,219,627 | 26,462,008 | 90.6 | 11.6 |
| 4 | DI | 19,670,649 | 18,335,390 | 87.6 | 7.8 |
| 5 | HI | 17,029,396 | 16,043,025 | 85.8 | 7.3 |
| 6 | TP | 18,838,068 | 18,451,073 | 94.3 | 8.2 |
| Average | | | | | 8.9 |
Figure 3. Bayesian tree generated by analysis of two plastid sequences psb and ycf2. Posterior probabilities are shown above the branches. Scale bar specifies mean branch length. Six major clades representing subfamilies of the Lamiaceae family are indicated.
Figure 4. Chronogram of Lamiaceae (genus Tectona) based on two plastid sequences psb and ycf2, estimated from secondary calibration strategies as implemented in BEAST. Calibration points are indicated with black dots. Node bar indicates 95% HPD interval for node ages. Geological time scale is given in Mya.