Shane D Widanagama, Joanna R Freeland, Xinwei Xu, Aaron B A Shafer.
Abstract
Cattails (Typha species) comprise a genus of emergent wetland plants with a global distribution. Typha latifolia and Typha angustifolia are two of the most widespread species, and in areas of sympatry they can interbreed to produce the hybrid Typha × glauca. In some regions, the relatively high fitness of Typha × glauca allows it to outcompete and displace both parent species, while simultaneously reducing plant and invertebrate biodiversity and modifying nutrient and water cycling. We generated a high-quality whole-genome assembly of T. latifolia using PacBio long-read and high-coverage Illumina sequences that will facilitate evolutionary and ecological studies in this hybrid zone. The assembly was 287 Mb in size and consisted of 1158 scaffolds, with an N50 of 8.71 Mb; 43.84% of the genome was identified as repetitive elements. The assembly has a BUSCO score of 96.03%, and 27,432 genes and 2700 RNA sequences were putatively identified. Comparative analysis detected over 9000 orthologs shared with related taxa, and phylogenomic analysis supported T. latifolia as a divergent lineage within Poales. This high-quality scaffold-level reference genome will provide a useful resource for future population genomic analyses and improve our understanding of Typha hybrid dynamics.
Keywords: de novo; PacBio long-read sequencing; Typhaceae; broadleaf cattail; bulrush; hybrids; Illumina short-read
Year: 2022 PMID: 34871392 PMCID: PMC9210280 DOI: 10.1093/g3journal/jkab401
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.542
Figure 1. Broadleaf cattail (Typha latifolia). Photo by Joanna Freeland.
Genome assembly statistics of our four-step hybrid assembly
| Statistic | SR contig assembly (step 1) | DBG2OLC contig assembly (step 2) | LR contig assembly (step 3) | Merged polished scaffolds (step 4) |
|---|---|---|---|---|
| Genome size (Mb) | 263.74 | 193.16 | 287.63 | 286.77 |
| Contigs/scaffolds | 365,565 | 1,840 | 1,190 | 1,158 |
| N50/L50 | 11.43 Kb/5,314 | 132.07 Kb/412 | 8.71 Mb/13 | 8.71 Mb/13 |
| N90/L90 | 201 bp/193,744 | 52.95 Kb/1,358 | 58.92 Kb/530 | 58.94 Kb/523 |
| Max sequence length | 154.73 Kb | 934.40 Kb | 18.70 Mb | 18.70 Mb |
| Scaffolds > 10 Kb | 6,048 | 1,833 | 1,140 | 1,132 |
| Scaffolds > 25 Kb | 1,759 | 1,785 | 1,127 | 1,120 |
| Scaffolds > 50 Kb | 362 | 1,445 | 821 | 816 |
| % of scaffolds > 50 Kb | 9.46 | 92.34 | 95.54 | 95.59 |
| GC content (%) | 38.40 | 38.50 | 38.11 | 38.05 |
The short-read (SR) contig assembly consists of contigs assembled from 100% of the Illumina reads using the ABySS assembler. The DBG2OLC assembler combined these ABySS contigs with the PacBio long reads. The long-read (LR) contig assembly was generated from PacBio long reads only, using Canu. The merged polished scaffolds were produced by merging the DBG2OLC assembly with the LR contigs and polishing the result with Pilon.
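For readers unfamiliar with the N50/L50 and N90/L90 rows of the table, the following is a minimal sketch of how such statistics are conventionally computed from a list of contig or scaffold lengths. The function name `nx_lx` and the toy lengths are hypothetical and chosen for illustration only; they are not taken from the T. latifolia assembly, and this is not the tooling used by the authors.

```python
def nx_lx(lengths, percent=50):
    """Return (Nx, Lx) for a collection of sequence lengths.

    Nx is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least `percent` % of the
    total assembly length; Lx is the number of sequences in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")

    ordered = sorted(lengths, reverse=True)  # longest sequences first
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length, count


# Toy example (hypothetical lengths, in arbitrary units).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_lengths, 50)
n90, l90 = nx_lx(toy_lengths, 90)
print(f"N50/L50: {n50}/{l50}")
print(f"N90/L90: {n90}/{l90}")
```

Under this definition, a larger N50 paired with a smaller L50 indicates a more contiguous assembly, which is the trend across steps 1 to 4 in the table.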
Figure 2. (A) Percentages of repeat types masked in the T. latifolia genome. Types of repeats include DNA transposons, long interspersed nuclear elements (LINEs), long terminal repeats (LTRs), unclassified repeats, and simple repeats. (B) Venn diagram of orthologous gene clusters among the broadleaf cattail (Typha latifolia), pineapple (Ananas comosus), thale cress (Arabidopsis thaliana), stiff brome (Brachypodium distachyon), rice (Oryza sativa Japonica), and broom-corn (Sorghum bicolor). Only the numbers of ortholog clusters of adjacent species, those common to all species, and those unique to each species are labeled.
Summary of repeats masked in the Typha latifolia genome
| Repeat type | Length (bp) | Percentage of genome (%) |
|---|---|---|
| SINE | 0 | 0 |
| LINE | 3,499,426 | 1.22 |
| LTR elements | 44,172,983 | 15.35 |
| DNA elements | 3,735,882 | 1.30 |
| Unclassified | 68,736,281 | 23.88 |
| Small RNA | 0 | 0 |
| Satellites | 0 | 0 |
| Simple repeats | 5,689,134 | 1.98 |
| Low complexity | 670,871 | 0.23 |
| Total | 126,178,695 | 43.84 |
Figure 3. Phylogenetic tree with divergence times, based on the alignment of 1900 single-copy gene clusters. Ninety-five percent credible divergence times are shown as blue bars and were estimated using MCMCTree. Divergence times and bootstrap values are shown above and below the nodes, respectively. The yellow dot indicates the calibration point.