Dhanushya Ramachandran, Cynthia D Huebner, Mark Daly, Jasmine Haimovitz, Thomas Swale, Craig F Barrett.
Abstract
The invasive Japanese stiltgrass (Microstegium vimineum) affects a wide range of ecosystems and threatens biodiversity across the eastern USA. However, the mechanisms underlying rapid adaptation, plasticity, and epigenetics in the invasive range are largely unknown. We present a chromosome-level assembly for M. vimineum to investigate genome dynamics, evolution, adaptation, and the genomics of phenotypic plasticity. We generated a 1.12-Gb genome with a scaffold N50 length of 53.44 Mb, taking a de novo assembly approach that combined PacBio and Dovetail Genomics Omni-C sequencing. The assembly contains 23 pseudochromosomes, representing 99.96% of the genome. BUSCO assessment indicated that 80.3% of Poales gene groups are present in the assembly. The genome is predicted to contain 39,604 protein-coding genes, of which 26,288 are functionally annotated. Furthermore, 66.68% of the genome is repetitive, of which unclassified (35.63%) and long-terminal repeat (LTR) retrotransposons (26.90%) are predominant. Similar to other grasses, Gypsy (41.07%) and Copia (32%) are the most abundant LTR-retrotransposon families. The majority of LTR-retrotransposons are derived from a significant expansion in the past 1-2 Myr, suggesting the presence of relatively young LTR-retrotransposon lineages. We find corroborating evidence from Ks plots for a stiltgrass-specific duplication event, distinct from the more ancient grass-specific duplication event. The assembly and annotation of M. vimineum will serve as an essential genomic resource facilitating studies of the invasion process and of the history and consequences of polyploidy in grasses, and will provide a crucial tool for natural resource managers.
Keywords: Poaceae; genome evolution; invasion genomics; long read sequencing; polyploidy; rapid adaptation; transposable elements
Year: 2021 PMID: 34718556 PMCID: PMC8598173 DOI: 10.1093/gbe/evab238
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
(A) Linkage density heatmap of the Microstegium vimineum genome. The x and y axes represent the mapping positions of the first and second read in a read pair, respectively. The diagonal lines from lower left to upper right in the plot represent each of the 23 M. vimineum pseudochromosomes. Dots (sequences) outside the diagonal are likely repetitive sequences that occur in multiple chromosomes. (B) Circos plot of M. vimineum genome assembly showing distributions of genes (green), Gypsy LTR-RTs (red), and Copia LTR-RTs (blue). (C) Insertion age estimates of LTR-retrotransposons in Ma based on a grass-specific LTR mutation rate (Ma and Bennetzen 2004). (D) BUSCO assessment results of orthologs among M. vimineum, closely related diploids (Sorghum bicolor, Coix lacryma-jobi, Zea mays), and polyploids (Miscanthus sinensis and Cenchrus purpureus). (E) Interchromosomal synteny with links representing syntenic blocks between M. vimineum chromosomes. (F) Macrosynteny dotplot of M. vimineum and S. bicolor chromosomes displaying large-scale duplications, inversions, and translocations. (G) The frequency distributions of synonymous substitution rates (Ks) of homologous gene pairs located in the collinearity blocks of M. vimineum. The Ks distribution for M. vimineum is shown in gray, with two WGD peaks indicated in blue and red. The vertical lines labeled “a” and “b” indicate the modes of these peaks, which are taken as Ks-based WGD age estimates. The numbered vertical lines represent rate-adjusted mode estimates of one-to-one ortholog Ks distributions between M. vimineum and closely related species, representing speciation events. (H) Distributions of gene duplicate origins across each chromosome in M. vimineum genome.
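Panel (C) dates LTR-retrotransposon insertions from a mutation rate. A common way to do this, sketched below, is to take the sequence divergence K between the two LTRs of one element and compute T = K / (2r), since both LTRs are identical at insertion and accumulate substitutions independently afterwards. This is only an illustrative sketch, not the authors' pipeline: the rate r = 1.3e-8 substitutions per site per year is assumed here to be the value attributed to Ma and Bennetzen (2004), the Jukes-Cantor correction is an assumed choice of distance model, and the function names are hypothetical.

```python
import math

# Assumed grass LTR substitution rate (substitutions per site per year);
# taken here to be the Ma and Bennetzen (2004) value, which the caption cites
# without stating the number.
ASSUMED_RATE = 1.3e-8


def jukes_cantor_distance(p: float) -> float:
    """Correct an observed proportion of differing sites (p) for multiple hits
    using the Jukes-Cantor (JC69) model. Valid for 0 <= p < 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def insertion_age_years(k: float, rate: float = ASSUMED_RATE) -> float:
    """Insertion age T = K / (2r), where K is the divergence between the
    5' and 3' LTRs of a single element and r is the substitution rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return k / (2.0 * rate)


if __name__ == "__main__":
    # Hypothetical element whose two LTRs differ at 2% of aligned sites.
    observed_p = 0.02
    k = jukes_cantor_distance(observed_p)
    age_myr = insertion_age_years(k) / 1e6
    print(f"corrected divergence K = {k:.5f}, insertion age = {age_myr:.2f} Myr")
```

Under these assumptions, a corrected divergence of about 0.026 to 0.052 corresponds to an insertion age of roughly 1 to 2 Myr, the window in which the abstract places the main LTR-retrotransposon expansion.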
Summary of the Genome Assembly and Annotation
| Category | Metric | Value |
|---|---|---|
| Genome assembly | Estimated genome size | 1.2 Gb |
| | N50 scaffold length | 53.04 Mb |
| | L50 | 10 |
| | N90 scaffold length | 33.01 Mb |
| | L90 | 20 |
| | Longest scaffold | 68.32 Mb |
| | No. of scaffolds | 463 |
| BUSCO | Complete | 3930 (80.2%) |
| | Duplicate | 1159 |
| | Fragmented | 108 |
| | Missing | 859 |
| | Total BUSCO groups searched | 489 |
| Transposable elements | LTR-retrotransposons | 25.77% |
| | LINEs | 1.13% |
| | DNA-transposons | 3.96% |
| | Rolling circles | 0.13% |
| | Unclassified/unknown | 35.63% |
| | Total | 66.48% |
| Protein-coding genes | No. of gene models | 39,604 |
| | Functionally annotated | 26,288 |
| | Mean gene length | 1,394 bp |
| | Mean no. of exons per gene | 5 |
| | Mean exon length | 256 bp |
| | Mean intron length | 679 bp |
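The N50/L50 and N90/L90 rows are standard contiguity statistics. With scaffolds sorted from longest to shortest, Nx is the length of the scaffold at which the running total first reaches x% of the assembly, and Lx is the number of scaffolds needed to get there. The sketch below shows that calculation. The function name `nx_lx` and the scaffold lengths are hypothetical and are not taken from the M. vimineum assembly.

```python
from typing import Iterable, Tuple


def nx_lx(lengths: Iterable[int], percent: int = 50) -> Tuple[int, int]:
    """Return (Nx, Lx) for a collection of scaffold lengths.

    Nx: length of the scaffold at which the cumulative length of scaffolds,
        sorted longest first, first reaches `percent` % of the total.
    Lx: how many scaffolds were needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length, count
    raise AssertionError("unreachable: the cumulative sum always reaches the total")


if __name__ == "__main__":
    # Hypothetical scaffold lengths in Mb, for illustration only.
    scaffolds = [80, 70, 50, 40, 30, 20, 10]
    n50, l50 = nx_lx(scaffolds, 50)
    n90, l90 = nx_lx(scaffolds, 90)
    print(f"N50 = {n50} Mb (L50 = {l50}); N90 = {n90} Mb (L90 = {l90})")
```

Read this way, the table's L50 of 10 and L90 of 20 mean that the 10 longest scaffolds hold at least half of the assembled sequence and the 20 longest hold at least 90%.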