Ticao Zhang1,2, Qin Qiao3, Polina Yu Novikova4,5, Qia Wang1, Jipei Yue1, Yanlong Guan1, Shengping Ming6, Tianmeng Liu6, Ji De6, Yixuan Liu6, Ihsan A Al-Shehbaz7, Hang Sun2, Marc Van Montagu8,5, Jinling Huang9,10,11, Yves Van de Peer8,5,12.
Abstract
Crucihimalaya himalaica, a close relative of Arabidopsis and Capsella, grows on the Qinghai-Tibet Plateau (QTP) about 4,000 m above sea level and represents an attractive model system for studying speciation and ecological adaptation in extreme environments. We assembled a draft genome sequence of 234.72 Mb encoding 27,019 genes and investigated its origin and adaptive evolutionary mechanisms. Phylogenomic analyses based on 4,586 single-copy genes revealed that C. himalaica is most closely related to Capsella (estimated divergence 8.8 to 12.2 Mya), whereas both species form a sister clade to Arabidopsis thaliana and Arabidopsis lyrata, from which they diverged between 12.7 and 17.2 Mya. LTR retrotransposons in C. himalaica proliferated shortly after the dramatic uplift and climatic change of the Himalayas from the Late Pliocene to Pleistocene. Compared with closely related species, C. himalaica showed significant contraction and pseudogenization in gene families associated with disease resistance and also significant expansion in gene families associated with ubiquitin-mediated proteolysis and DNA repair. We identified hundreds of genes involved in DNA repair, ubiquitin-mediated proteolysis, and reproductive processes with signs of positive selection. Gene families showing dramatic changes in size and genes showing signs of positive selection are likely candidates for C. himalaica's adaptation to intense radiation, low temperature, and pathogen-depauperate environments in the QTP. Loss of function at the S-locus, the reason for the transition to self-fertilization of C. himalaica, might have enabled its QTP occupation. Overall, the genome sequence of C. himalaica provides insights into the mechanisms of plant adaptation to extreme environments.
Keywords: Qinghai–Tibet Plateau; S-locus; adaptive evolution; extreme environment; natural selection
Year: 2019 PMID: 30894495 PMCID: PMC6452661 DOI: 10.1073/pnas.1817580116
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Genome assembly of C. himalaica
| Genome features | Contigs | Scaffolds |
| Total length, bp | 230,905,116 | 234,722,603 |
| Total no. | 3,983 | 583 |
| Longest length, bp | 1,756,581 | 8,343,586 |
| No. of length ≥2,000 | 3,586 | 429 |
| Length of N50, bp | 136,392 | 2,088,603 |
| No. of N50 | 406 | 34 |
| Length of N90, bp | 32,421 | 470,087 |
| No. of N90 | 1,711 | 129 |
| GC content, % | 36.38 | — |
| No. of genes | — | 27,019 |
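The N50 and N90 rows above follow the usual assembly-statistics convention: sort sequences from longest to shortest and walk down the list until the running total reaches 50% (or 90%) of the assembly length. The length at that point is the "Length of N50" and the number of sequences used is the "No. of N50". A minimal Python sketch of that definition follows; the function name `nx_stats` and the toy lengths are illustrative assumptions, not part of the original analysis.

```python
def nx_stats(lengths, percent=50):
    """Return (Nx length, number of sequences needed) for an assembly.

    lengths: iterable of contig or scaffold lengths in bp
    percent: integer percentage of total length to cover (50 for N50, 90 for N90)
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length, count


# Toy data (hypothetical lengths, total 300 bp), not taken from the assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50_length, n50_count = nx_stats(toy_lengths, 50)
n90_length, n90_count = nx_stats(toy_lengths, 90)
```

Applied to the real contig or scaffold lengths, the two returned values would correspond to the "Length of N50" and "No. of N50" rows (and likewise for N90).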
Fig. 1. Comparative analyses of genomic features of C. himalaica vs. A. thaliana (A) and C. himalaica vs. C. rubella (B). Tracks from inside to outside are collinearity between both genomes, number of chromosomes/scaffolds, gene density, and TE density.
Fig. 2. Evolutionary analyses of the C. himalaica genome. (A) Age distribution of 4DTv distance values between orthologs of C. himalaica and A. thaliana, C. himalaica and A. lyrata, and C. himalaica and C. rubella. (B) Insertion time distribution of LTR retrotransposons. (C) Estimation of divergence times of nine species in the Brassicaceae. (D) Venn diagram showing unique and shared gene families between genomes of C. himalaica and four close relatives.
Functional annotation of the most significantly expanded and contracted gene families in C. himalaica
| Gene families | KEGG terms | Input no. | Background no. | P value |
| Expanded gene families | Ubiquitin-mediated proteolysis | 19 | 137 | 5.59E-29 |
| | Mismatch repair | 10 | 80 | 9.44E-12 |
| | Homologous recombination | 10 | 105 | 9.12E-11 |
| | DNA replication | 10 | 103 | 9.12E-11 |
| | Nucleotide excision repair | 10 | 137 | 9.31E-10 |
| | RNA polymerase | 4 | 32 | 1.59E-06 |
| | Pyrimidine metabolism | 4 | 105 | 9.53E-05 |
| | Purine metabolism | 4 | 176 | 5.66E-04 |
| Contracted gene families | NF-κB signaling pathway | 64 | 93 | 1.96E-106 |
| | Toll-like receptor signaling pathway | 64 | 106 | 5.32E-104 |
| | Retinol metabolism | 21 | 65 | 3.60E-29 |
| | Metabolism of xenobiotics by P450 | 18 | 73 | 2.89E-23 |
| | Tryptophan metabolism | 13 | 40 | 2.89E-18 |
| | Steroid hormone biosynthesis | 13 | 58 | 1.73E-16 |
| | Metabolic pathways | 31 | 1,243 | 1.03E-12 |
| | Arachidonic acid metabolism | 10 | 62 | 1.35E-11 |
| | Linoleic acid metabolism | 8 | 29 | 4.96E-11 |
| | Lysosome | 7 | 123 | 1.27E-05 |
| | Apoptosis | 4 | 33 | 9.75E-05 |
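Enrichment values of this kind are commonly obtained from a one-sided hypergeometric test on the input and background counts. The record does not say which test or multiple-testing correction was used, so the sketch below is only an illustration of that common approach. It assumes "Input no." is the number of genes in the expanded or contracted set annotated to a pathway and "Background no." is the number annotated in the reference set. The set sizes `n` and `N` are not given in the table and are hypothetical here, as is the function name.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: input genes annotated to the pathway ("Input no.", assumed meaning)
    n: total genes in the input set (hypothetical; not given in the table)
    K: background genes annotated to the pathway ("Background no.", assumed meaning)
    N: total genes in the background (hypothetical; not given in the table)
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    # Sum exact integer numerators, then divide once to limit rounding error.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Toy numbers only: 2 of 3 sampled genes hit a pathway that covers 4 of 10 genes.
toy_p = enrichment_p_value(k=2, n=3, K=4, N=10)
```

Reproducing a specific value from the table would require the actual input and background set sizes, which are not reported in this record.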
Fig. 3. Size of the NBS gene family (A) and number of pseudo-NBS genes (B) in the C. himalaica genome compared with those of related species.
Fig. 4. Transitioning to self-fertilization in C. himalaica. (A) Protein sequence alignment of S-locus SCR genes from C. himalaica, A. halleri, and A. lyrata. Cysteine residues are highlighted in red, showing eight conserved sites important for structural and functional integrity of the protein (66, 74). The C. himalaica SCR protein appears to have lost two out of eight conserved cysteines and therefore is probably nonfunctional. (B) Structure of the S-locus part of scaffold 29 in the C. himalaica genome assembly compared with the A. halleri S15 haplogroup (70). Colors of the lines between S-loci correspond to BLAST scores from highest (blue) to lowest (gray).