Arnau Fiol, Federico Jurado-Ruiz, Elena López-Girona, Maria José Aranzana.
Abstract
BACKGROUND: Genome complexity is largely linked to diversification and crop innovation. Examples of regions with duplicated genes with relevant roles in agricultural traits are found in many crops. In both duplicated and non-duplicated genes, much of the variability in agronomic traits is caused by large as well as small- and medium-scale structural variants (SVs), which highlights the relevance of identifying and characterizing complex variability between genomes for plant breeding.
Keywords: Cas9 enrichment; Complex regions; Gene duplication; Long-read; MYB10; Targeted sequencing; Variability
Year: 2022 PMID: 36030243 PMCID: PMC9419362 DOI: 10.1186/s13007-022-00937-4
Source DB: PubMed Journal: Plant Methods ISSN: 1746-4811 Impact factor: 5.827
Fig. 1 a Dot plot comparison of the three LG3-MYB10 assemblies in the two published Japanese plum genomes: ‘Zhongli No. 6’ (with two assemblies: Zhongli-1 and Zhongli-2) and ‘Sanyueli’. Top right of the diagonal: homologous hits; inversions colored in red. Bottom left of the diagonal: homologous blocks colored by their score, from green to colder colors as the homology value decreases. Coordinates of each region are written in base pairs for both axes. b Distribution of the homologous sequences in the three compared genome regions. Grey areas correspond to sequences not found in the Zhongli-1 MYB10 region. Homologous regions are represented in the same yellow to purple colors
Fig. 2 Schematic representation of the CRISPR RNA (crRNA) design to specifically cleave and sequence a hypothetical LG3-MYB10 genomic region in Japanese plum. The guide RNAs (gRNAs) are formed by the interaction of trans-activating CRISPR RNAs (tracrRNAs) with crRNAs and are then assembled with the Cas9 enzyme into the Cas9-ribonucleoprotein (Cas9-RNP) complex. The Cas9-RNPs allow the specific cleavage of the LG3-MYB10 region, generating cuts on both DNA strands for subsequent sequencing in both directions, from the inside to the outside of the genes. The gene number and order, the size of the region, and the expected number of fragments were unknown at the time of the crRNA design
Fig. 3 a MYB10 amplicon bands before and after cleavage with a pool of the guide RNAs (gRNAs) targeting the genes, with their size annotated in base pairs (bp). Bands (a, b) correspond to the faint bands from the allele with a large intron 1, and (c) to the remaining undigested product. Wells M1: Lambda DNA HindIII-EcoRI digested molecular weight marker; M2: DNA Ladder 50 bp ready-to-use (GeneON). b Expected fragment sizes after MYB10.1 and MYB10.2 digestion with the Cas9-ribonucleoprotein (RNP) complex. gRNAs assembled from the trans-activating CRISPR RNA (tracrRNA) and crRNA-s (s1, s2 or s3) cut exon 2 of the gene; gRNAs with crRNA-a (a1, a2, a3 or a4) cut in intron 1. Fragment 6 was not obtained after cleavage; the bands of fragments 3 and 5 overlapped in the gel
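The expected fragment sizes in panel b follow from the positions of the Cas9 cut sites along the amplicon. The following is a minimal sketch of that arithmetic, assuming complete digestion of a linear amplicon; the function name `fragment_sizes`, the amplicon length and the cut positions are hypothetical illustrations and are not values from the study.

```python
def fragment_sizes(amplicon_length, cut_positions):
    """Return fragment lengths (bp), in 5'-to-3' order, after cutting a
    linear amplicon at every position in cut_positions.

    Assumes complete digestion; cut positions are offsets in bp from the
    5' end and must fall strictly inside the amplicon.
    """
    cuts = sorted(set(cut_positions))
    for cut in cuts:
        if cut <= 0 or cut >= amplicon_length:
            raise ValueError(f"cut position {cut} is outside the amplicon")
    # Fragment boundaries are the two amplicon ends plus every cut site.
    bounds = [0] + cuts + [amplicon_length]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 1500 bp amplicon with one cut in intron 1 and one in exon 2.
print(fragment_sizes(1500, [1100, 400]))  # [400, 700, 400]
```

Fragments of equal or similar length, such as the two 400 bp products in this toy case, would migrate together, which is one way bands can overlap in a gel.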
Sequencing statistics for each plum variety after read demultiplexing
| Variety | Barcode | Total reads | Yield (Mb) | Mean length (bp) | N50 | Mean Q |
|---|---|---|---|---|---|---|
| ‘Angeleno’ | NB01 | 4995 | 38.85 | 7777.9 | 19,049 | 11.0 |
| ‘Black Gold’ | NB02 | 11,106 | 45.64 | 4109.7 | 10,288 | 10.9 |
| ‘Fortune’ | NB03 | 5417 | 29.32 | 5412.2 | 11,786 | 10.8 |
| ‘Golden Japan’ | NB04 | 5063 | 34.98 | 6908.0 | 16,740 | 11.0 |
| ‘TC Sun’ | NB05 | 7213 | 45.27 | 6275.8 | 14,686 | 10.9 |
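The columns of this table are related by simple arithmetic: the yield is the total number of reads multiplied by the mean read length (for ‘Angeleno’, 4995 reads × 7777.9 bp ≈ 38.85 Mb). The sketch below shows one common way to compute these summary values, including N50, from a list of read lengths. It is an illustration only; the function name `read_stats` and the example lengths are hypothetical, and the study's own statistics may have been produced with a different tool or N50 convention.

```python
def read_stats(lengths):
    """Summarize read lengths (bp).

    Returns (total reads, yield in Mb, mean length in bp, N50 in bp).
    N50 is taken here as the length L such that reads of length >= L
    contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("no reads supplied")
    total_bases = sum(lengths)
    running = 0
    n50 = 0
    # Walk from the longest read down until half of the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total_bases:
            n50 = length
            break
    return len(lengths), total_bases / 1e6, total_bases / len(lengths), n50


# Hypothetical read lengths, not data from the study.
total_reads, yield_mb, mean_length, n50 = read_stats([2000, 3000, 5000, 10000])
print(total_reads, yield_mb, mean_length, n50)
```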
Alignment of each cultivar to the three Japanese plum MYB10 reference regions
| Sample | ‘Sanyueli’: on-target reads | ‘Sanyueli’: cov. bases | ‘Sanyueli’: mean depth | Zhongli-1: on-target reads | Zhongli-1: cov. bases | Zhongli-1: mean depth | Zhongli-2: on-target reads | Zhongli-2: cov. bases | Zhongli-2: mean depth |
|---|---|---|---|---|---|---|---|---|---|
| ‘Angeleno’ | 1,036 (5.63%) | 73.3 kb (54.49%) | 12.2x | 489 (4.17%) | 96.7 kb (35.7%) | 3.8x | 486 (4.15%) | 48.7 kb (54.34%) | 9.1x |
| ‘Black Gold’ | 1,300 (5.37%) | 76.8 kb (57.07%) | 19.1x | 572 (2.58%) | 123.0 kb (45.42%) | 5.5x | 611 (2.75%) | 77.7 kb (86.31%) | 14.1x |
| ‘Fortune’ | 1,692 (10.47%) | 78.1 kb (58.05%) | 23.2x | 841 (8.60%) | 165.1 kb (60.97%) | 9.1x | 616 (6.30%) | 72.7 kb (81.18%) | 18.5x |
| ‘Golden Japan’ | 168 (0.99%) | 71.1 kb (52.85%) | 2.0x | 67 (0.63%) | 133.8 kb (49.42%) | 1.2x | 45 (0.43%) | 31.6 kb (35.31%) | 1.0x |
| ‘TC Sun’ | 1,824 (7.04%) | 84.5 kb (62.75%) | 29.1x | 677 (6.23%) | 214.8 kb (79.33%) | 19.3x | 280 (2.58%) | 53.6 kb (59.84%) | 11.1x |
On-target reads: reads that aligned to the region and the percentage from the total reads aligned to the genome; Cov. bases: the bases covered with at least one read and the percentage of the region size; Mean depth: the mean depth of coverage for the whole region
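The three metrics defined above can be sketched as follows. This is a minimal illustration under simplifying assumptions: every alignment is a half-open `(start, end)` interval on the same reference sequence as the target region, and the list passed in holds all reads aligned to the genome. The function name `region_coverage` and the example coordinates are hypothetical; the study's values come from its own alignment pipeline.

```python
def region_coverage(region_start, region_end, alignments):
    """Compute on-target reads, covered bases and mean depth for a region.

    alignments: list of (start, end) half-open reference intervals, one per
    aligned read, all assumed to lie on the same sequence as the region.
    """
    size = region_end - region_start
    if size <= 0 or not alignments:
        raise ValueError("need a non-empty region and at least one alignment")
    depth = [0] * size  # per-base depth within the region
    on_target = 0
    for start, end in alignments:
        lo = max(start, region_start)
        hi = min(end, region_end)
        if lo >= hi:
            continue  # read does not overlap the region
        on_target += 1
        for pos in range(lo - region_start, hi - region_start):
            depth[pos] += 1
    covered = sum(1 for d in depth if d > 0)
    return {
        "on_target_reads": on_target,
        # percentage of all genome-aligned reads that hit the region
        "on_target_pct": 100.0 * on_target / len(alignments),
        "covered_bases": covered,
        # percentage of the region covered by at least one read
        "covered_pct": 100.0 * covered / size,
        # mean depth over the whole region, including uncovered bases
        "mean_depth": sum(depth) / size,
    }


# Hypothetical 100 bp region and three aligned reads, one of them off-target.
print(region_coverage(100, 200, [(90, 150), (120, 160), (300, 400)]))
```

Because the mean depth is averaged over the whole region, a region with low breadth of coverage can still report a moderate mean depth if the covered part is sequenced deeply.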
Fig. 4 a Venn diagram showing the number of SNPs identified in the MYB10 region from each of the five Japanese plum samples sequenced and the overlap between them. The number next to each cultivar is its total SNP count. b Classification and percentage of all the structural variants identified: breakends (BND), deletions (DEL), duplications (DUP) and inversions (INV)
Fig. 5 Visualization of the de novo sequences in the PsMYB10.1a promoter region containing a G-box motif (underlined) associated with the red skin color, SNPs and two large InDels. The de novo sequences had SNP-level resolution when compared with each other and with the Sanger sequences provided in Fiol et al. Heterozygous positions (shadowed bases) in the de novo sequences matched the nucleotide on the same haplotype in a different variety (red rectangles). Sequencing errors are marked with asterisks
Strengths, weaknesses, improvements and new opportunities of the CRISPR-Cas9 sequencing enrichment strategy described in this study
| Strengths | Weaknesses | Improvements | New opportunities |
|---|---|---|---|
| The design of crRNAs does not necessarily require a reference genome | The search for polymorphisms in the de novo sequences might require the manual isolation and comparison of contigs | Digesting the DNAs with sub-pools of crRNA might improve the region assembly, at the expense of simplicity and cost-effectiveness | Identification of methylation variants between the pooled samples in the target region |
| Pooling the crRNAs and barcoded DNAs results in a simple and cost-effective method | ONT technology is prone to sequence errors in homopolymer regions | The use of samples homozygous for the target region might increase the efficiency of polymorphism discovery and sequence scaffolding | Enrichment of genes scattered in more than one region of interest, such as the enrichment of specific gene or transposon families |
| Polymorphisms can be extracted and phased with high efficiency from genome alignments and/or de novo sequences | The Cas9 digestion with a single crRNA pool cannot produce overlapping fragments | | |
| The method is computationally inexpensive | | | |
Commercial varieties selected for CRISPR-Cas9 targeted sequencing of their MYB10 region, their fruit color and the MYB10 haplotype combination described in Fiol et al. [26]
| Variety | Skin color | Flesh color | MYB10 haplotypes |
|---|---|---|---|
| ‘Angeleno’ | Black | Yellow | H1/H3 |
| ‘Black Gold’ | Black | Red | H1/H2 |
| ‘Fortune’ | Red | Yellow | H3/H6 |
| ‘Golden Japan’ | Yellow | Yellow | H4/H9 |
| ‘TC Sun’ | Yellow | Yellow | H4/H5 |