Nikolay Alabi1, Yihan Wu1, Oliver Bossdorf2, Loren H Rieseberg3, Robert I Colautti1.
Abstract
The emerging field of invasion genetics examines the genetic causes and consequences of biological invasions, but few study systems are available that integrate deep ecological knowledge with genomic tools. Here, we report on the de novo assembly and annotation of a genome for the biennial herb Alliaria petiolata (M. Bieb.) Cavara and Grande (Brassicaceae), which is widespread in Eurasia and invasive across much of temperate North America. Our goal was to sequence and annotate a genome to complement resources available from hundreds of published ecological studies, a global field survey, and hundreds of genetic lines maintained in Germany and Canada. We selected a genotype (EFCC3-3-20) collected from the native range near Venice, Italy, and sequenced paired-end and mate-pair libraries at ∼70× coverage. A de novo assembly resulted in a highly continuous draft genome (N50 = 121 Mb; L50 = 2) with 99.7% of the 1.1 Gb genome mapping to scaffolds of at least 50 kb in length. A total of 64,770 predicted genes in the annotated genome include 99% of plant BUSCO genes and 98% of transcriptome reads. Consistent with previous reports of (auto)hexaploidy in western Europe, we found that almost one-third of BUSCO genes (390/1440) mapped to two or more scaffolds despite <2% genome-wide average heterozygosity. The continuity and gene space quality of our draft assembly will enable molecular and functional genomic studies of A. petiolata to address questions relevant to invasion genetics and conservation strategies.
Keywords: Alliaria petiolata; EFCC3; Illumina; garlic mustard; invasion genetics; mate pairs
Year: 2021 PMID: 34599816 PMCID: PMC8664459 DOI: 10.1093/g3journal/jkab339
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Percentage of predicted single-copy plant genes from BUSCO that are found once (light blue) or more than once (dark blue), or are missing (red) or fragmented (yellow), in the annotated genome assembly of Alliaria petiolata.
Assembly statistics for the Alliaria petiolata genome
| Statistic | Value |
|---|---|
| N scaffolds (≥1,000 bp) | 694 |
| N scaffolds (≥50,000 bp) | 227 |
| Total length (bp) | 1,075,010,735 |
| Total length, scaffolds ≥50,000 bp (bp) | 1,071,536,925 |
| Largest scaffold (bp) | 485,611,451 |
| GC (%) | 37.2 |
| N50 (bp) | 121,941,980 |
| N75 (bp) | 40,840,077 |
| L50 | 2 |
| L75 | 5 |
| Mean sequence length (bp) | 1,549,006.82 |
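The N50, N75, L50, and L75 values above follow the usual assembly-statistics definitions: scaffolds are sorted from longest to shortest, and Nx is the length of the scaffold at which the running total first reaches x% of the assembly, while Lx is the number of scaffolds needed to get there. The following is a minimal sketch of that calculation, not the tool used by the authors; the function name `nx_lx` and the toy scaffold lengths are hypothetical and chosen only for illustration.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a collection of scaffold lengths.

    fraction=0.5 gives N50/L50; fraction=0.75 gives N75/L75.
    """
    # Walk scaffolds from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            # `length` is Nx, `count` is Lx.
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (not data from the paper).
toy = [50, 30, 10, 5, 5]
n50, l50 = nx_lx(toy)
n75, l75 = nx_lx(toy, 0.75)
print(n50, l50, n75, l75)
```

Under this definition, L50 = 2 in the table means the largest scaffold (485,611,451 bp, about 45% of the assembly) falls just short of half the total length on its own, and the second-largest scaffold, whose length is the N50, carries the running total past 50%.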
Summary statistics of genes annotated for the Alliaria petiolata genome assembly
| Statistic | Value |
|---|---|
| Number of genes | 64,770 |
| Number of exons | 408,155 |
| Number of introns | 343,385 |
| Overlapping genes | 9,669 |
| Contained genes | 1,464 |
| Total gene length (bp) | 210,804,785 |
| Total exon length (bp) | 102,316,121 |
| Total intron length (bp) | 109,175,434 |
| Total CDS length (bp) | 842,788 |
| % of genome covered by genes | 19.6 |
| % of genome covered by CDS | 7.8 |
| Mean mRNAs per gene | 1 |
| Mean exons per mRNA | 6 |
| Mean introns per mRNA | 5 |
| Mean gene length (bp) | 3,255 |
| Mean intron length (bp) | 318 |
| Mean exon length (bp) | 251 |
| Mean CDS length (bp) | 1,301 |
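Several rows in the annotation table can be derived from others, which gives a quick consistency check. The sketch below recomputes a few of them from the reported totals; it assumes one mRNA per gene (as the table states on average) and uses the total assembly length from the assembly statistics table as the denominator. The variable names are illustrative and are not taken from the authors' pipeline.

```python
# Values copied from the assembly and annotation tables.
assembly_length = 1_075_010_735
genes = 64_770
exons = 408_155
introns = 343_385
total_gene_length = 210_804_785
total_exon_length = 102_316_121
total_intron_length = 109_175_434

# With one mRNA per gene, each gene has (exons - 1) introns,
# so total introns = total exons - number of genes.
introns_expected = exons - genes

# Percentage of the assembly covered by annotated genes
# (assumption: overlapping genes are ignored in this rough check).
pct_genes = 100 * total_gene_length / assembly_length

# Mean feature lengths in bp.
mean_gene_length = total_gene_length / genes
mean_exon_length = total_exon_length / exons
mean_intron_length = total_intron_length / introns

print(introns_expected, round(pct_genes, 1))
print(round(mean_gene_length), round(mean_exon_length), round(mean_intron_length))
```

These recomputed values agree with the reported intron count, gene coverage, and mean gene, exon, and intron lengths after rounding.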
Figure 2. Dot plot showing blocks of synteny between the four largest scaffolds of the Alliaria petiolata assembly (x-axis) and the five chromosomes of the model plant Arabidopsis thaliana (y-axis). Blue lines show aligned sequences with up to 20% divergence. Vertical dotted lines mark the boundaries between the major scaffolds of the A. petiolata assembly.