Cassandria G Tay Fernandez, Benjamin J Nestor, Monica F Danilevicz, Jacob I Marsh, Jakob Petereit, Philipp E Bayer, Jacqueline Batley, David Edwards.
Abstract
Pangenomes aim to represent the complete repertoire of the genome diversity present within a species or cohort of species, capturing the genomic structural variance between individuals. This genomic information coupled with phenotypic data can be applied to identify genes and alleles involved with abiotic stress tolerance, disease resistance, and other desirable traits. The characterisation of novel structural variants from pangenomes can support genome editing approaches such as Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR associated protein Cas (CRISPR-Cas), providing functional information on gene sequences and new target sites in variant-specific genes with increased efficiency. This review discusses the application of pangenomes in genome editing and crop improvement, focusing on the potential of pangenomes to accurately identify target genes for CRISPR-Cas editing of plant genomes while avoiding adverse off-target effects. We consider the limitations of applying CRISPR-Cas editing with pangenome references and potential solutions to overcome these limitations.
Keywords: CRISPR-Cas; gene editing; genomes; pangenomes; structural variations
Year: 2022 PMID: 35216392 PMCID: PMC8879065 DOI: 10.3390/ijms23042276
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Diagram of pangenome construction methods based on genome sequencing data. Genome sequencing reads for genomes A, B and C are shown at the top of the image; each colour represents a gene in the genome. The reads may be assembled into a pangenome using de novo, iterative or graph-based assembly, and the choice of method may influence the positioning of the assembled genes. The asterisk (*) indicates that genome A is used as the reference genome in the iterative assembly method.
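The iterative approach in the caption above can be illustrated with a toy sketch. It assumes each genome is reduced to an ordered list of gene identifiers, which is a simplification introduced here; real iterative assembly works on sequencing reads, mapping them to the reference and assembling what does not map. The function name `iterative_pangenome` and the gene labels are hypothetical.

```python
from typing import Iterable, List


def iterative_pangenome(reference: List[str], others: Iterable[List[str]]) -> List[str]:
    """Toy iterative construction: start from the reference gene list and
    append any gene from later genomes that is not yet in the pangenome."""
    pangenome = list(reference)   # the reference genome anchors the order
    seen = set(reference)
    for genome in others:
        for gene in genome:
            if gene not in seen:
                seen.add(gene)
                pangenome.append(gene)  # novel genes land after the reference
    return pangenome


# Hypothetical gene content for genomes A (reference), B and C.
genome_a = ["g1", "g2", "g3"]
genome_b = ["g2", "g4"]
genome_c = ["g1", "g5", "g4"]

pan = iterative_pangenome(genome_a, [genome_b, genome_c])
print(pan)
```

In this sketch the novel genes are appended after the reference genes regardless of where they sit in their own genome, which is one way the construction method can influence gene positioning.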
A summary of plant pangenome studies.
| Species Name | Accessions | Size (Individuals) | Analysis | Reference |
|---|---|---|---|---|
| | USDA collection | 1110 | PAV, GO, SNP discovery, and population genetics analysis | [ |
| | Chinese population | 26 | Synteny, SV, genetic variation, and gene expression analysis | [ |
| | USDA collection | 14,129 | GBS tagging, GWAS mapping, and PAV analysis | [ |
| | NCBI SRA database | 725 | SNP calling, QTL mapping, expression analysis, and PAV analysis | [ |
| | Chinese Kale/TO100 | 10 | Gene clustering, TE annotation, SNP calling, phylogenetic, PAV, and GO analysis | [ |
| | Diversity set | 8 | Phylogenetic, SNP, InDel, SV, PAV, and population analysis | [ |
| | Diversity set | 53 | Candidate identification, QTL, and SNP analysis | [ |
| | Diversity set | 54 | Pan-gene clustering, variant calling, TE, and InDel phylogenetic analysis | [ |
| | Diversity set | 20 | GWAS, inversion calling, SNP calling, QTL mapping, and PAV analysis | [ |
| | China National Rice Research Institute and National Institute of Genetics in Japan | 1529 | Evolutionary and PAV analysis | [ |
| | ICRISAT | 89 | SNP and PAV analysis | [ |
| | Diversity set | 3366 | SNP, SV, CNV, phylogenetic, GWAS analysis, and genome prediction | [ |
| | Plant Genetic Resources Unit | 91 | Gene prediction, comparative analysis, and PAV/variant calling | [ |
| | Diversity set | 5 | PAV and evolutionary analysis | [ |
| | USDA collection | 493 | SNP calling, genome positioning, and GO and GWAS analysis | [ |
| | Diversity set | 57 | Haplotype sampling and genomic prediction | [ |
| | Diversity set | 354 | PAV, SNP, GWAS, diversity, and population analysis | [ |
| | Chibas sorghum breeding program | 24 | Genotype prediction, haplotype sampling, and WGS | [ |
| | Chinese Spring | 18 | PAV and SNP analysis | [ |
| | MPI for Plant Breeding Research | 7 | Pangenomic, CNV, and synteny analysis | [ |
| | | 10 | PAV, GO, candidate gene, phylogenetic, and SNP analysis | [ |
| | Diversity set | 15 | Comparative genomic analysis, protein ortholog, diversity, and SV analysis | [ |
| | NCBI database | 1961 | InDel, population structure, LD, SV, CNV, PAV, and metagenome association analysis | [ |
| | Diversity set | 5 | PAV and GWAS analysis | [ |
Figure 2. (A) Representation of a pangenome assembly composed of genomes from six individuals sourced from two populations. The core and variable regions of the pangenome are highlighted; the genetic diversity observed in the variable region can be caused by chromosomal inversion or copy number variation (CNV). (B) Potential benefits of using a pangenome reference for genetic modification: genetic diversity analysis can be used to define target sites in variant alleles, identify CNVs that influence CRISPR-Cas mutation effectiveness, and discover novel target alleles.
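The two ideas in this caption, separating core from variable content and checking how a candidate target site behaves across individuals, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method from the review: gene presence is given as one set of gene IDs per individual, target sites are searched on the forward strand only, and the PAM is assumed to be the NGG motif used by SpCas9. The names `split_core_variable` and `count_target_sites`, the guide sequence, and all data are hypothetical.

```python
import re
from typing import Dict, Set, Tuple


def split_core_variable(pav: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Core genes are present in every individual; variable genes in only some."""
    gene_sets = list(pav.values())
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)
    variable = set.union(*gene_sets) - core
    return core, variable


def count_target_sites(guide: str, sequence: str) -> int:
    """Count forward-strand positions where the guide is directly followed
    by an NGG PAM (assumption: SpCas9-style PAM, exact matches only)."""
    # A lookahead is used so that overlapping sites are all counted.
    pattern = re.compile("(?=" + re.escape(guide) + "[ACGT]GG)")
    return sum(1 for _ in pattern.finditer(sequence))


# Hypothetical presence/absence data for three individuals.
pav = {
    "ind1": {"geneA", "geneB", "geneC"},
    "ind2": {"geneA", "geneB"},
    "ind3": {"geneA", "geneC", "geneD"},
}
core, variable = split_core_variable(pav)

# Hypothetical 20-nt guide and per-individual sequences.
guide = "ACGTTGCAAGCTTAGGCTAA"
sequences = {
    "ind1": "TT" + guide + "TGG" + "AA" + guide + "AGG",  # two copies (CNV)
    "ind2": guide + "TGA",                                 # no valid PAM
    "ind3": "C" + guide + "CGG",                           # single copy
}
site_counts = {name: count_target_sites(guide, seq) for name, seq in sequences.items()}
print(sorted(core), sorted(variable), site_counts)
```

A guide designed against a single reference would miss both the duplicated site in `ind1` and the PAM-disrupting variant in `ind2`; counting sites per individual is one simple way to surface such differences.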
Figure 3. Reversal of an inversion through CRISPR to allow crossing of inverted genes. A pangenome is used to identify a non-recombinant inversion in individual A compared to individuals B and C. CRISPR-Cas proteins are then used to induce double-stranded breaks at specific target sites flanking the inverted region, leading to re-inversion of the genomic segment and making the locked genes accessible for recombination. The previously inverted genes in individual A can then be crossed with other individuals in the population.
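At the sequence level, the re-inversion described above amounts to replacing the segment between the two break points with its reverse complement. The sketch below models only that string operation, assuming both cuts are clean and the segment rejoins in the opposite orientation; it does not model repair outcomes or editing efficiency. The function names, coordinates, and sequences are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def reinvert(seq: str, start: int, end: int) -> str:
    """Flip the segment seq[start:end], modelling two double-stranded breaks
    at `start` and `end` followed by rejoining in the opposite orientation."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


# Hypothetical collinear arrangement shared by individuals B and C.
collinear = "GGGATGCCATTT"

# Individual A carries the segment at positions 3..9 in inverted orientation.
individual_a = reinvert(collinear, 3, 9)

# Cutting at the same two break points restores the collinear arrangement.
restored = reinvert(individual_a, 3, 9)
print(individual_a, restored == collinear)
```

Because reverse complementation is its own inverse, applying the operation twice at the same coordinates returns the original sequence, which mirrors the idea that re-inversion restores collinearity with the rest of the population.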