Yupeng Geng, Yabin Guan, Shugang Lu, Miao An, M James C Crabbe, Ji Qi, Fangqing Zhao, Qin Qiao, Ticao Zhang.
Abstract
BACKGROUND: Understanding how organisms evolve and adapt to extreme habitats is of crucial importance in evolutionary ecology. Altitude gradients are an important determinant of the distribution pattern and range of organisms due to distinct climate conditions at different altitudes. High-altitude regions often provide extreme environments including low temperature and oxygen concentration, poor soil, and strong levels of ultraviolet radiation, leading to very few plant species being able to populate elevation ranges greater than 4000 m. Field pennycress (Thlaspi arvense) is a valuable oilseed crop and emerging model plant distributed across an elevation range of nearly 4500 m. Here, we generate an improved genome assembly to understand how this species adapts to such different environments.
Keywords: Adaptive evolution; FLOWERING LOCUS C; Population genomics; Qinghai-Tibet Plateau; Transposable elements
Year: 2021 PMID: 34294107 PMCID: PMC8296595 DOI: 10.1186/s12915-021-01079-0
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Genome assembly and annotation of field pennycress
| Genome features | Count |
|---|---|
| Illumina PE150 reads (Gb) | 76.62 |
| Nanopore reads (Gb) | 55.26 |
| Hi-C reads (Gb) | 60.97 |
| Total length of contigs (Mb) | 527.15 |
| Total number of contigs | 3790 |
| Longest length of contigs (Mb) | 22.18 |
| Length of contig N50 (Mb) | 4.18 |
| Number of contig N50 | 24 |
| Total assembly size (Mb) | 527.3 |
| Total anchored size (Mb) | 474.97 |
| Total number of scaffolds | 2298 |
| Longest length of scaffolds (Mb) | 75.83 |
| Length of scaffold N50 (Mb) | 70.79 |
| Number of scaffolds N50 | 4 |
| GC content (%) | 39.03 |
| Repeat content (%) | 70.19 |
| BUSCO assessment (%) | 95.9 |
| Number of predicted genes | 31,596 |
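The N50 rows in the table above follow the usual definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size, together with how many sequences were needed to get there. The sketch below illustrates that definition on made-up lengths; the function name `n50` and the toy values are illustrative assumptions, not the authors' pipeline.

```python
def n50(lengths):
    """Return (N50 length, number of sequences needed to reach half the total).

    Illustrative helper only; `lengths` is any iterable of sequence lengths.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no sequence lengths supplied")


# Toy contig lengths (arbitrary units), not values from the assembly.
toy_lengths = [10, 8, 6, 4, 2]
n50_length, n50_count = n50(toy_lengths)
print(n50_length, n50_count)
```

Read this way, a scaffold N50 of 70.79 Mb reached with 4 scaffolds means the four longest scaffolds already cover at least half of the assembly.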
Fig. 1. Comparative genomic analyses of field pennycress with relatives. a Hi-C interaction heatmap for pennycress genome showing interactions among seven chromosomes (Chr1–7). b Genomic features of pennycress (Ta) vs. Brassica rapa (Br). Tracks from outside to inside (a–g) are as follows: chromosomes, retrotransposon density, DNA transposon density, long terminal repeat retrotransposon (LTR) density, gene density, GC content, and collinearity between both genomes. c Maximum likelihood tree and estimation of divergence times in Brassicaceae. d Genome collinearity dot plot and Ks distribution between pennycress (Ta) and A. thaliana (At). e Age distribution of transversion substitutions at fourfold degenerate sites (4DTv) distance values between orthologs of pennycress and its relatives. f Insertion time distribution of LTR of pennycress and its relatives
Fig. 2. Population genetic structure and demographic history of field pennycress. a Sampling locations. For the four site names and two groups, see Table 2. b Population structure plots with the number of ancestral clusters (K) = 2. c Principal component analysis (PCA) plot of all samples in four populations. d Maximum likelihood (ML) tree of all samples based on high-quality single nucleotide polymorphisms (SNPs). e Effective population size inferred based on SNPs for four populations. f Linkage disequilibrium (LD) patterns for the four pennycress populations. X-axis: physical distance between two SNPs in kb; Y-axis: r², used to measure linkage disequilibrium
Sampling locations, bio-climatic characterization, and genetic differentiation of field pennycress populations
| Groups | Pop. | Lat. (N) | Lon. (E) | Alt. (m) | MTWQ (°C) | PWQ (mm) | Tajima’s D | θπ | θW | Fst |
|---|---|---|---|---|---|---|---|---|---|---|
| LG | HF | 31.924 | 117.139 | 35 | 27.25 | 394 | 0.6521 | 0.00042 | 0.00027 | 0.1818 |
| LG | XA | 34.022 | 109.116 | 705 | 24.45 | 286 | 0.7578 | | | |
| HG | MK | 29.658 | 98.565 | 3806 | 10.57 | 317 | 0.7188 | 0.00057 | 0.00045 | |
| HG | ZG | 29.956 | 97.412 | 4006 | 10.78 | 336 | 0.4953 | | | |
LG low-elevation group, HG high-elevation group, Pop populations, Lat latitude, Lon longitude, Alt altitude, MTWQ mean temperature of the warmest quarter, PWQ precipitation of the warmest quarter
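For readers unfamiliar with the diversity columns in the table above, the sketch below shows how θπ, θW, and Tajima's D relate under their standard textbook definitions (Tajima 1989). It is a minimal illustration on toy haplotype strings; the function names and inputs are assumptions for the example, and the record does not state which software the authors used to obtain the reported values.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(n, seg_sites, pi_total):
    """Tajima's D from sample size, segregating sites, and mean pairwise differences.

    Returns None when the statistic is undefined (fewer than 4 samples or no
    segregating sites).
    """
    if n < 4 or seg_sites == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi_total - seg_sites / a1) / sqrt(variance)


def diversity_stats(haplotypes):
    """Per-site theta_pi and theta_w plus Tajima's D for equal-length haplotypes."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    # A site is segregating if more than one allele is observed in the sample.
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    # Mean number of differences over all pairs of haplotypes.
    pair_diffs = [
        sum(x != y for x, y in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    ]
    pi_total = sum(pair_diffs) / len(pair_diffs)
    a1 = sum(1 / i for i in range(1, n))
    return {
        "theta_pi": pi_total / length,
        "theta_w": seg_sites / a1 / length,
        "tajimas_d": tajimas_d(n, seg_sites, pi_total),
    }


# Toy sample of four haplotypes, not data from the study.
print(diversity_stats(["AAAA", "AAAT", "AATT", "ATTT"]))
```

A positive D, as reported for all four populations, arises when θπ exceeds θW, that is, when there is an excess of intermediate-frequency variants relative to the neutral expectation.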
Fig. 3. Candidate positively selected genes (PSGs) in high-elevation groups. a The distribution of the Fst and θπ values in HG and LG. The vertical and horizontal dashed lines mark the top 3% (right tails) of the Fst and θπ value distributions, respectively. b Gene Ontology (GO) and KEGG functional classification of candidate selected genes in HG. c The composite likelihood ratio (CLR) result in the HG using the top 1% outlier threshold
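The two-threshold selection described for panel a can be sketched as follows: compute a cutoff at the top 3% of each per-window statistic and keep only the windows that exceed both. The sketch assumes one Fst value and one θπ-based value per genomic window (for example a ratio between groups, which is an assumption here, not something the caption specifies); the function names and toy data are hypothetical.

```python
def tail_threshold(values, tail_percent=3):
    """Smallest value that still belongs to the top `tail_percent` of `values`."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values supplied")
    # Number of windows in the right tail; keep at least one.
    k = max(1, len(ordered) * tail_percent // 100)
    return ordered[-k]


def candidate_windows(fst, pi_stat, tail_percent=3):
    """Indices of windows in the right tail of both distributions."""
    fst_cut = tail_threshold(fst, tail_percent)
    pi_cut = tail_threshold(pi_stat, tail_percent)
    return [
        i
        for i, (f, p) in enumerate(zip(fst, pi_stat))
        if f >= fst_cut and p >= pi_cut
    ]


# Toy per-window values, not data from the study.
toy_fst = [i / 100 for i in range(100)]
toy_pi_stat = [i / 100 for i in range(100)]
print(candidate_windows(toy_fst, toy_pi_stat))
```

Requiring both statistics to be outliers is stricter than either threshold alone, so the candidate set is usually much smaller than 3% of all windows.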
Fig. 4. Transitioning to early flowering in the high-elevation groups. a The Fst values between HG and LG in the genome region (Chr1:69,000–69,800 kb) which includes the FLC gene. b The FLC transcripts in the Kunming (KM) accession and corresponding gene model. The former fifth intron was transcribed (pale blue) and resulted in a new long transcript formed from the fifth to the sixth exons (blue). c Nucleic acid and amino acid sequence alignments of the fifth exon and partial fifth intron of the FLC gene from pennycress as well as A. thaliana and E. salsugineum. The mutations from “GT” to “CT” in the fifth intron are highlighted in red. HG, high-elevation group; MG, middle-elevation group; LG, low-elevation group; EF, early flowering; LF, late flowering