Hanwen Zhang, Yuchen Zhang, Wenting Xu, Rong Li, Dabing Zhang, Litao Yang.
Abstract
Basic data for the safety assessment of a transgenic line include the molecular characterization of the exogenous DNA integration site, flanking sequences, copy number, and unintended plasmid backbone residues. However, performing a full molecular characterization remains challenging, especially for GMOs that carry complex exogenous DNA integrations. We established two whole-genome sequencing (WGS) strategies, paired-end (PE) and mate-pair (MP), to characterize the integration of an exogenous human serum albumin gene in rice line 114-7-2, and evaluated how each strategy performs in the molecular characterization of a transgenic line. The results showed two exogenous DNA insertion loci (on Chr 01 and Chr 04) with their corresponding flanking sequences, five copies of the exogenous rHSA gene, and unintended residual plasmid backbone sequences. The WGS-MP strategy, however, showed higher efficiency, lower cost, and lower background noise than the WGS-PE analysis, especially for identifying the exogenous DNA integration site.
Keywords: BHQ, black hole quencher; CTAB, Cetyltrimethyl ammonium bromide; FAM, 6-carboxyfluorescein; GM rice line 114-7-2; GMO, genetically modified organism; ISAAA, International Service for the Acquisition of Agri-Biotech Applications; MP, mate-pair; Mate pair; Molecular characterization; NGS, Next-generation sequencing; NOS, nopaline synthase; PE, paired-end; Paired-end; WGS, whole-genome sequencing; WT, Wild type; Whole-genome sequencing; ddPCR, Droplet digital polymerase chain reaction
Year: 2021 PMID: 35415698 PMCID: PMC8991703 DOI: 10.1016/j.fochms.2021.100061
Source DB: PubMed Journal: Food Chem (Oxf) ISSN: 2666-5662
Fig. 1. The bioinformatics pipeline used for molecular characterization.
Summary of sequencing and bioinformatics data.
| | WGS-PE, 114-7-2 | WGS-PE, WT | WGS-MP, 114-7-2 | WGS-MP, WT |
|---|---|---|---|---|
| Total trimmed read pairs | 44,205,804 | 43,894,757 | 11,596,364 | 10,101,309 |
| General sequencing depth (×) | 23.61 | 23.45 | 6.19 | 5.39 |
| Q20 (%) | 96.66 | 95.18 | 96.34 | 96.26 |
| GC (%) | 42.96 | 42.29 | 43.92 | 43.73 |
| Type C reads | 33,724 | 5346 | 9238 | 1338 |
| Type B, D & E reads | 11,795 | 2827 | 2036 | 403 |
| False-positive reads from | 8951 | 2654 | 1613 | 386 |
| Reads clustered around candidate insertion regions | 41 pairs for chr 1 site, 36 pairs for chr 4 site | / | 71 pairs for chr 1 site, 85 pairs for chr 4 site | / |
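
The efficiency claim can be checked against the table by normalizing the read pairs clustered around the two candidate insertion regions by sequencing depth. The sketch below is only an illustration: the metric "supporting pairs per 1× depth" and the names `SUMMARY`, `pairs_per_depth`, and `yield_ratio` are assumptions introduced here, not quantities defined by the authors.

```python
# Counts copied from the summary table (transgenic-line columns only).
SUMMARY = {
    "WGS-PE": {"depth": 23.61, "site_pairs": {"chr1": 41, "chr4": 36}},
    "WGS-MP": {"depth": 6.19, "site_pairs": {"chr1": 71, "chr4": 85}},
}


def pairs_per_depth(row):
    """Supporting read pairs at both sites per 1x of sequencing depth."""
    return sum(row["site_pairs"].values()) / row["depth"]


def yield_ratio(summary=SUMMARY):
    """How many times more supporting pairs MP gives than PE per 1x depth."""
    return pairs_per_depth(summary["WGS-MP"]) / pairs_per_depth(summary["WGS-PE"])


if __name__ == "__main__":
    for name, row in SUMMARY.items():
        print(f"{name}: {pairs_per_depth(row):.2f} supporting pairs per 1x depth")
    print(f"MP/PE yield ratio: {yield_ratio():.1f}")
```

With the table values, this gives about 3.3 supporting pairs per 1× for WGS-PE and about 25.2 for WGS-MP, which is consistent with the lower recommended depth for the mate-pair strategy.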
Fig. 2. Comparison of MP and PE resequencing data for transgenic rice line 114-7-2. (a) Visualization of candidate insertion site I. (b) Visualization of candidate insertion site II. (c) PCR results for the insertion site on chromosome 1. Lane GM: GM rice 114-7-2 event; Lane WT: non-GM rice TP309 line; Lane N: no-template control; Lane M: 1 kb Plus Opti-DNA Marker. (d) PCR results for the insertion site on chromosome 4. Lane GM: GM rice 114-7-2 event; Lane WT: non-GM rice TP309 line; Lane N: no-template control; Lane M: DL2000 Marker.
Fig. 3. PCR verification of candidate insertion sites and the structure of exogenous sequences. (a) Exogenous sequence structure of the insertion site on chromosome 1. (b) Exogenous sequence structure of the insertion site on chromosome 4.
Fig. 4. Inferred number of read pairs across the insertion sites under different input data volumes. Candidate read pairs from the PE sequencing data are uniquely mapped reads around the two insertion sites at sequencing coverages of 5×, 6.19×, 10×, 15×, and 20×. The number of mapped reads is the total of type B, D, and E read pairs that span the upstream and downstream flanking sequences of the two insertion sites: site I, chromosome 1, 39,215,852–39,215,854; site II, chromosome 4, 30,513,019–30,513,035.
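
One way to produce reduced input volumes like those in Fig. 4 is to down-sample the PE read pairs at random, keeping a fraction equal to the target coverage divided by the full coverage. The record does not state how the authors generated these data sets, so the following is a minimal sketch under that assumption; `keep_fraction` and `subsample_pairs` are hypothetical helper names, and the 23.61× full depth is taken from the summary table.

```python
import random

FULL_DEPTH = 23.61  # WGS-PE depth of the transgenic line, from the summary table
TARGET_DEPTHS = [5, 6.19, 10, 15, 20]  # coverages listed in the Fig. 4 caption


def keep_fraction(target_depth, full_depth=FULL_DEPTH):
    """Fraction of read pairs to keep so the expected depth is target_depth."""
    if not 0 < target_depth <= full_depth:
        raise ValueError("target depth must be in (0, full_depth]")
    return target_depth / full_depth


def subsample_pairs(pair_ids, fraction, seed=0):
    """Keep each read pair independently with probability `fraction`.

    Mates of a pair share one identifier, so both are kept or dropped together.
    A fixed seed makes the selection reproducible.
    """
    rng = random.Random(seed)
    return [pid for pid in pair_ids if rng.random() < fraction]


if __name__ == "__main__":
    for depth in TARGET_DEPTHS:
        print(f"{depth}x: keep {keep_fraction(depth):.3f} of read pairs")
```

Because selection is random, the retained depth only approximates the target; repeating the draw with several seeds would show how much the count of supporting pairs varies at each coverage.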
Properties of the mate-pair and paired-end strategies compared.
| Property | WGS-MP | WGS-PE |
|---|---|---|
| Recommended sequencing depth | About 5× genome size | Not less than 10× genome size |
| Flanking sequence information | Many and valuable | Little |
| Insertion sites and flanking sequence validation | Easy | Very easy |
| Number of false positives | Few | Extremely high |
| True positives in total results | High | Intermediate |
| Evaluation of copy number | Easy | Easy |
| Plasmid backbone residue analysis | Easy | Easy |