Huanlei Zhang, Jianjun Jin, Michael J Moore, Tingshuang Yi, Dezhu Li.
Abstract
Cannabaceae is an economically important family that includes ten genera and ca. 117 accepted species. To explore the structure and size variation of their plastomes, we sequenced ten plastomes representing all ten genera of Cannabaceae. Each plastome possessed the typical angiosperm quadripartite structure and contained a total of 128 genes. The Inverted Repeat (IR) regions in five plastomes had experienced small expansions (330-983 bp) into the Large Single-Copy (LSC) region. The plastome of Chaetachme aristata has experienced a 942-bp IR contraction and lost rpl22 and rps19 from its IRs. The substitution rates of rps19 and rpl22 decreased after they shifted from the LSC to the IR. A 270-bp inversion was detected in the Parasponia rugosa plastome, which might have been mediated by 18-bp inverted repeats. Repeat sequences, simple sequence repeats, and nucleotide substitution rates varied among these plastomes. Molecular markers with more than 13% variable sites and 5% parsimony-informative sites were identified, which may be useful for further phylogenetic analysis and species identification. Our results show strong support for a sister relationship between Gironniera and Lozanella (BS = 100). Celtis, Cannabis-Humulus, Chaetachme-Pteroceltis, and Trema-Parasponia formed a strongly supported clade, and their relationships were well resolved with strong support (BS = 100). The availability of these ten plastomes provides valuable genetic information for accurately identifying species, clarifying taxonomy, and reconstructing the intergeneric phylogeny of Cannabaceae.
Keywords: IR expansion/contraction; Phylogenomics; Plastome; Repeats; SSR; Sequence divergence
Year: 2018 PMID: 30175293 PMCID: PMC6114266 DOI: 10.1016/j.pld.2018.04.003
Source DB: PubMed Journal: Plant Divers ISSN: 2468-2659
Assembly statistics and genome features for newly sequenced Cannabaceae plastomes.
| Species | Total PE reads | Matched PE reads | Mean coverage (×) | Genome length (bp) | LSC length (bp) | SSC length (bp) | IR length (bp) | GC content (%) |
|---|---|---|---|---|---|---|---|---|
| | 1,695,716 | 374,611 | 583.7 | 157,687 | 86,135 | 19,442 | 26,015 | 36.4 |
| | 2,040,500 | 1,880,700 | 1351.8 | 153,910 | 84,059 | 17,829 | 26,011 | 36.7 |
| | 289,464 | 257,965 | 120.3 | 159,001 | 86,072 | 19,171 | 26,879 | 36.3 |
| | 1,142,608 | 1,045,891 | 1415.4 | 157,939 | 86,743 | 20,064 | 25,566 | 36.1 |
| | 396,352 | 374,583 | 583.6 | 157,807 | 86,215 | 18,942 | 26,325 | 36.3 |
| | 1,010,646 | 839,251 | 1436.6 | 153,776 | 83,885 | 17,751 | 26,070 | 36.9 |
| | 1,077,002 | 1,026,115 | 1573.4 | 156,711 | 85,928 | 19,133 | 25,825 | 36.6 |
| | 586,024 | 498,328 | 627.5 | 157,434 | 86,961 | 19,313 | 25,580 | 36.3 |
| | 1,051,832 | 992,380 | 1711.1 | 158,504 | 87,620 | 18,856 | 26,014 | 36.3 |
| | 4,807,452 | 4,346,229 | 2569.3 | 157,192 | 86,859 | 19,309 | 25,512 | 36.3 |
PE = paired-end; LSC = Large Single-Copy region; SSC = Small Single-Copy region; IR = Inverted Repeat region.
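In a quadripartite plastome the IR occurs twice, so the genome length should equal LSC + SSC + 2 × IR. The following is a minimal Python sketch of that consistency check, using the length columns exactly as they appear in the table above; the names `ROWS` and `length_discrepancy` are illustrative and not part of the original study.

```python
# Lengths in bp copied from the table above, in row order:
# (genome length, LSC length, SSC length, IR length).
ROWS = [
    (157_687, 86_135, 19_442, 26_015),
    (153_910, 84_059, 17_829, 26_011),
    (159_001, 86_072, 19_171, 26_879),
    (157_939, 86_743, 20_064, 25_566),
    (157_807, 86_215, 18_942, 26_325),
    (153_776, 83_885, 17_751, 26_070),
    (156_711, 85_928, 19_133, 25_825),
    (157_434, 86_961, 19_313, 25_580),
    (158_504, 87_620, 18_856, 26_014),
    (157_192, 86_859, 19_309, 25_512),
]


def length_discrepancy(genome, lsc, ssc, ir):
    """Return genome length minus (LSC + SSC + 2 * IR); 0 means consistent."""
    return genome - (lsc + ssc + 2 * ir)


discrepancies = [length_discrepancy(*row) for row in ROWS]

for row_number, diff in enumerate(discrepancies, start=1):
    status = "ok" if diff == 0 else f"off by {diff} bp"
    print(f"row {row_number}: {status}")
```

On the values as listed here, rows 2 to 10 sum exactly, while the first row's genome length is 80 bp larger than LSC + SSC + 2 × IR, so one of its values may be worth checking against the published table.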
Fig. 1 Gene maps of the plastome. Genes are indicated by boxes on the inside (clockwise transcription) and outside (counterclockwise transcription) of the outermost circle. The inner circle identifies the major structural components of the plastome (LSC, IR, and SSC). Genes belonging to different functional groups are color-coded. The dashed area in the inner circle indicates the GC content of the plastome. * marks a tRNA that contains an intron.
Gene content in Cannabaceae plastomes.
| Category | Gene groups | Name of genes |
|---|---|---|
| Self-replication | Large subunit of ribosomal proteins | |
| | Small subunit of ribosomal proteins | |
| | DNA-dependent RNA polymerase | |
| | Ribosomal RNA genes | |
| | Transfer RNA genes | |
| Photosynthesis | Photosystem I | |
| | Photosystem II | |
| | NADH dehydrogenase | |
| | Cytochrome b/f complex | |
| | ATP synthase | |
| | Rubisco large subunit | |
| Other genes | Maturase K | |
| | Envelope membrane protein | |
| | Subunit of acetyl-CoA carboxylase | |
| | c-type cytochrome synthesis gene | |
| | Protease | |
| | Proteins of unknown function | |
(×2) = gene present twice due to its position within the IR; a Contains two introns; b Contains one intron; c Exons separated and joined by trans-splicing; d Gene present in the IRs in the IR-expanded species; e Gene present in the IR of Celtis biondii; f Gene present in the IR of Chaetachme aristata.
Fig. 2 Comparison of IR/SC boundaries among Cannabaceae plastomes. JSB, JSA and JLA refer to junctions of SSC/IRB, SSC/IRA, and LSC/IRA, respectively. Ψ indicates a pseudogene copy of a gene partially duplicated in the IR.
Fig. 3 The best maximum likelihood (ML) tree based on RAxML analysis. Bootstrap support values are provided next to each node.
Fig. 4 mVISTA-based identity plot showing sequence identity among Cannabaceae plastomes. Humulus scandens is set as the reference. Coding and noncoding regions are colored in blue and red, respectively.
Fig. 5 Percentages of variable (blue, top line) and parsimony-informative (red, bottom line) sites across coding and noncoding loci. A Coding regions; B Noncoding regions. Regions are oriented according to their genome locations.
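The percentages of variable and parsimony-informative sites in Fig. 5 can be illustrated with a short Python sketch. It uses the standard definitions (a variable site has at least two different nucleotides; a parsimony-informative site has at least two nucleotides that each occur in at least two sequences). The function name `site_stats`, the decision to ignore gaps and ambiguity codes, and the use of the full alignment length as the denominator are assumptions made for illustration, not the authors' documented procedure.

```python
from collections import Counter

VALID_BASES = set("ACGT")  # gaps and ambiguity codes are ignored (assumption)


def site_stats(alignment):
    """Return (% variable sites, % parsimony-informative sites).

    `alignment` is a list of equal-length aligned sequences (strings).
    """
    if not alignment or not alignment[0]:
        raise ValueError("alignment must contain non-empty sequences")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(b for b in (c.upper() for c in column) if b in VALID_BASES)
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return 100.0 * variable / length, 100.0 * informative / length


# Toy alignment (invented data, for illustration only).
toy = ["ACGTAC", "ACGTTC", "ATGTTC", "ATGAAC"]
pct_variable, pct_informative = site_stats(toy)
print(f"variable: {pct_variable:.1f}%  parsimony-informative: {pct_informative:.1f}%")
```

In the toy alignment, columns 2, 4, and 5 vary, but column 4 has a nucleotide seen in only one sequence, so it counts as variable without being parsimony-informative.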
Fig. 6 Analyses of repeated sequences in Cannabaceae plastomes. A Numbers of the three dispersed repeat types; B Numbers of tandem repeats; C Frequency of dispersed repeats by length; D Frequency of tandem repeats by length; E The locations of repeats.
Fig. 7 The distribution of the simple sequence repeats (SSRs) in Cannabaceae plastomes.
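As a rough illustration of how perfect SSRs such as those summarized in Fig. 7 can be located, the Python sketch below scans a sequence for tandem copies of motifs 1 to 6 bp long. The minimum repeat counts in `MIN_REPEATS` are hypothetical placeholders, since this record does not state the thresholds or the software the authors used; `find_ssrs` and `is_primitive` are likewise illustrative names.

```python
# Hypothetical minimum copy numbers per motif length (NOT taken from the source).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}
VALID_BASES = set("ACGT")


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs (0-based start)."""
    seq = sequence.upper()
    hits = []
    for k, needed in sorted(min_repeats.items()):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            if not set(motif) <= VALID_BASES or not is_primitive(motif):
                i += 1
                continue
            # Extend over consecutive exact copies of the motif.
            j = i + k
            while seq[j:j + k] == motif:
                j += k
            copies = (j - i) // k
            if copies >= needed:
                hits.append((i, motif, copies))
                i = j  # resume after this SSR
            else:
                i += 1
    return sorted(hits)


# Invented test sequence: a 10-bp poly-A run and five copies of AT.
example = "GG" + "A" * 10 + "CC" + "AT" * 5 + "GG"
ssrs = find_ssrs(example)
print(ssrs)
```

Skipping non-primitive motifs keeps a poly-A run from also being reported as an "AA" or "AAA" repeat. Runs of different motif lengths can still share a boundary base in this sketch.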