Xingkun Yang, Qinghua Zhou, Wanjun Zhou, Mei Zhong, Xiaoling Guo, Xiaofeng Wang, Xin Fan, Shanhuo Yan, Liyan Li, Yunli Lai, Yongli Wang, Jin Huang, Yuhua Ye, Huaping Zeng, Jun Chuan, Yuanping Du, Chouxian Ma, Peining Li, Zhuo Song, Xiangmin Xu.
Abstract
Noninvasive prenatal testing of common aneuploidies has become routine over the past decade, but testing of monogenic disorders remains a challenge in clinical implementation. Most recent studies have inherent limitations, such as complicated procedures, a lack of versatility, and the need for prior knowledge of parental genotypes or haplotypes. To overcome these limitations, a robust and versatile next-generation sequencing-based cell-free DNA (cfDNA) allelic molecule counting system termed cfDNA barcode-enabled single-molecule test (cfBEST) is developed for the noninvasive prenatal diagnosis (NIPD) of monogenic disorders. The accuracy of cfBEST is found to be comparable to that of droplet digital polymerase chain reaction (ddPCR) in detecting low-abundance mutations in cfDNA. The analytical validity of cfBEST is evidenced by a β-thalassemia assay, in which a blind validation study of 143 at-risk pregnancies reveals a sensitivity of 99.19% and a specificity of 99.92% on allele detection. Because the validated cfBEST method can be used to detect maternal-fetal genotype combinations in cfDNA precisely and quantitatively, it holds the potential for the NIPD of human monogenic disorders.
Keywords: cell‐free DNA barcode‐enabled single‐molecule test (cfBEST); molecule counting system; monogenic disorders; noninvasive prenatal diagnosis (NIPD); β‐thalassemia
Year: 2019 PMID: 31179213 PMCID: PMC6548944 DOI: 10.1002/advs.201802332
Source DB: PubMed Journal: Adv Sci (Weinh) ISSN: 2198-3844 Impact factor: 16.806
Figure 1. Schematic representation of the cfBEST method. Dark blue and red dots denote wild‐type and mutant sites of interest, respectively. A) The cfBEST protocol. Blue bars denote cfDNA fragments, and the differently colored bars adjacent to them denote the degenerate barcodes. A prelibrary was built (see the Experimental Section) by amplifying barcoded cfDNA fragments to generate sufficient template and was then split into two equal portions (referred to as “F” and “R”). Each portion underwent two PCR reactions: the first with a universal primer (U1) and a target‐specific primer (F1/R1) close to the site of interest; the second with the same universal primer (U1) and another primer (F2/R2) that contained both a target‐specific part, which bound closer to the site of interest than F1/R1, and a universal tail identical to U2. The two portions were then pooled, and a third PCR with U1 and U2 was performed for subsequent massively parallel paired‐end sequencing. Within each portion, the two target‐specific primers used in the first and second PCR reactions formed a “semi‐nested” PCR to increase specificity. Because both primers bound near the site of interest, the bias caused by fragment size differences was minimized. Each barcoded single allelic molecule was amplified and sequenced multiple times, and the reads sharing the same barcode and breakpoint were grouped to call one unique original allelic molecule; differences in PCR efficiency therefore did not bias the counts either. B) The strategy for eliminating sequences from pseudogenes or homologous genes (“noise” sequences). The regions flanking the site of interest were analyzed for primer design. A qualified primer was identical to the reference sequence and amplified the target region (blue lines) without producing noise sequences from other regions (green lines).
In most cases, variations in the primer‐binding region (orange dots) prevented amplification (case 1); in other cases, the primer‐binding region contained at most one variation, which resulted in either an amplified noise product (case 2) or low‐efficiency amplification (case 3). To count reads accurately, a filtering process was designed to eliminate noise sequences. In case 1, the PCR did not amplify any noise product. In cases 2 and 3, the unique variation patterns (blue dots) that distinguish the noise sequences from the reference sequence were used to filter them out during bioinformatic analysis. Sequencing or amplification errors caused by accidental mismatches, as well as SNPs (purple dot), at sites outside the variation patterns were tolerated. The F1/F2 primers on one side are shown as an example; the R1/R2 primers on the other side were treated in the same way.
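The read‐grouping step described in panel A can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout `(barcode, breakpoint, allele)`, the function names, and the majority‐vote consensus rule are all assumptions made here for illustration. The point it shows is that reads sharing a barcode and breakpoint count as one original molecule, so uneven PCR amplification does not change the allele ratio.

```python
from collections import Counter, defaultdict


def count_unique_molecules(reads):
    """Collapse reads that share (barcode, breakpoint) into one molecule.

    `reads` is an iterable of (barcode, breakpoint, allele) tuples.
    This layout is hypothetical and used only for illustration.
    """
    families = defaultdict(Counter)
    for barcode, breakpoint, allele in reads:
        families[(barcode, breakpoint)][allele] += 1

    molecules = Counter()
    for allele_counts in families.values():
        # Assumed consensus rule: the most frequent allele in the family wins.
        consensus, _ = allele_counts.most_common(1)[0]
        molecules[consensus] += 1
    return molecules


def mutant_ratio(molecules, mutant="mut", wild_type="wt"):
    """Fraction of unique molecules carrying the mutant allele."""
    total = molecules[mutant] + molecules[wild_type]
    return molecules[mutant] / total if total else float("nan")


# Toy input: seven reads that originate from four distinct molecules.
reads = [
    ("ACGT", 101, "wt"), ("ACGT", 101, "wt"), ("ACGT", 101, "wt"),
    ("TTGA", 101, "mut"), ("TTGA", 101, "mut"),
    ("ACGT", 140, "mut"),  # same barcode, different breakpoint: new molecule
    ("GGCA", 95, "wt"),
]
molecules = count_unique_molecules(reads)
print(dict(molecules), mutant_ratio(molecules))
```

In this toy input the wild‐type molecule with barcode `ACGT` at breakpoint 101 is read three times but is counted once, so the ratio reflects molecules, not reads.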
Figure 2. Overview of the study design. The study was conducted in three stages: a cfBEST proof‐of‐concept experiment, the development of cfBEST using β‐thalassemia as a model, and a blind clinical validation of the cfBEST method.
Figure 3. Development and optimization of the cfBEST system using β‐thalassemia as a model. A) Evaluation of gDNA as a reference sample. There was no significant difference between cfDNA and gDNA tested as reference material with three different types of heterozygous β‐thalassemia mutations. Data are means ± SD; n = 5; n.s., not significant (Student t‐test). B,C) Evaluation of the impact of eliminating noise sequences through primer design and bioinformatic analysis. The detected mutation ratios with and without noise sequences were compared using B) gDNA samples carrying three heterozygous and two homozygous β‐thalassemia mutations and C) cfDNA samples carrying three heterozygous β‐thalassemia mutations. Data are means ± SD; n = 5; * p < 0.05, n.s., not significant (Student t‐test). D) Evaluation of the correlation between the amount of starting DNA and the number of unique reads detected (Pearson correlation coefficient analysis). E) Determination of the minimal amount of starting DNA required for the cfBEST assay. Different amounts of gDNA and cfDNA from a heterozygous carrier of HBB:c.79G>A were tested as starting DNA. Data are means ± SD; * p < 0.05; n = 5; n.s., not significant (Student t‐test). All comparisons were made between the theoretical value of 50% (gray bar, denoted as a reference indicator, Ref.) and the detected ratios. F) Determination of the minimal number of single‐molecule sequencing reads required for the cfBEST assay. The gDNA and cfDNA of a heterozygous carrier of HBB:c.79G>A were tested using cfBEST, and different sequencing depths were analyzed. Data are means ± SD; * p < 0.05; n = 5; n.s., not significant (Student t‐test). All comparisons were made between the theoretical value of 50% (gray bar, denoted as a reference indicator, Ref.) and the detected ratios.
G) Determination of the minimal fetal DNA fraction in maternal plasma required for accurate quantitative genotyping of β‐thalassemia mutations, using mixtures of ultrasonically fragmented gDNA prepared at different ratios. Each mixture consisted of sonicated gDNA from a heterozygous mutation sample that mimicked the background maternal cfDNA (denoted as “AB”) and a “fetal” DNA sample (denoted by “aa” for wild‐type, “ab” for a heterozygote, or “bb” for a homozygote). Four different concentrations of ABaa, ABab, and ABbb gDNA samples, each in five replicates, were tested with cfBEST for mutation ratio detection. Data are means ± SD; n = 5. H) A total of 67 cfDNA samples with HBB:c.126_129delCTTT from the peripheral blood of pregnant women, comprising 27 cases of ABaa, 31 cases of ABab, and 9 cases of ABbb, were used to determine the lower limit of the fetal DNA fraction. ABaa, ABab, and ABbb cfDNA samples at different concentrations were tested with cfBEST for mutation ratio detection. Because there were too few ABbb samples for statistical analysis, individual points show the detected ratios. Green triangles denote ABbb, blue triangles denote ABab, and orange triangles denote ABaa in (G,H).
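Why the fetal DNA fraction limits genotyping in panels G and H can be seen from the expected mutant‐allele fraction of each maternal‐fetal combination. The sketch below assumes idealized mixing with no measurement noise, a heterozygous (AB) mother, and illustrative fetal fractions of 2%, 5%, and 10%; the function name and these values are hypothetical and are not taken from the study.

```python
def expected_mutant_fraction(fetal_fraction, fetal_genotype):
    """Expected mutant-allele fraction in plasma of a heterozygous (AB) mother.

    Idealized model (an assumption): maternal DNA contributes half mutant
    alleles, and fetal DNA contributes 0, 1/2 or 1 depending on genotype.
    """
    fetal_mutant_dosage = {"aa": 0.0, "ab": 0.5, "bb": 1.0}[fetal_genotype]
    maternal_part = (1.0 - fetal_fraction) * 0.5
    fetal_part = fetal_fraction * fetal_mutant_dosage
    return maternal_part + fetal_part


# Illustrative fetal fractions only; not the concentrations used in the study.
for f in (0.02, 0.05, 0.10):
    row = {g: round(expected_mutant_fraction(f, g), 4) for g in ("aa", "ab", "bb")}
    print(f"fetal fraction {f:.0%}: {row}")
```

Under this model the three combinations are separated by only half the fetal fraction (0.5 − f/2, 0.5, and 0.5 + f/2), which is why lower fetal fractions require more precise molecule counting to tell ABaa, ABab, and ABbb apart.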
The concordance of maternal/fetal genotype combinations determined by cfBEST with the gold standard IMD. A total of 1859 genotype combinations (13 common β‐thalassemia mutation sites in each of 143 blood samples of pregnant women) were obtained. A and a denote maternal and fetal wild‐type alleles, respectively; B and b denote maternal and fetal mutation alleles, respectively. Green highlights the cases in which cfBEST and IMD were concordant; red highlights the cases in which cfBEST was in error. The concordance rate is defined as the proportion of concordant cases (1855 of the 1859 detected genotype combinations); κ is Cohen's kappa coefficient.
Individual allele concordance of the cfBEST test on β‐thalassemia with the gold standard IMD. In the table, a and b denote fetal wild‐type and mutated alleles, respectively. Thirteen of the 16 mutations are listed because three mutations (HBB:c.‐78A>C, HBB:c.126_130delCTTT;insA, and HBB:c.216_217insT), each located at the same site as a listed mutation, were not detected.
| Mutation ID | cfBEST | IMD a | IMD b | Sensitivity | Specificity |
|---|---|---|---|---|---|
| | a | 286 | 0 | 100.00% | |
| | b | 0 | 0 | | |
| | a | 273 | 0 | 100.00% | 100.00% |
| | b | 0 | 13 | | |
| | a | 282 | 0 | 100.00% | 100.00% |
| | b | 0 | 4 | | |
| | a | 285 | 0 | 100.00% | 100.00% |
| | b | 0 | 1 | | |
| | a | 286 | 0 | 100.00% | |
| | b | 0 | 0 | | |
| | a | 269 | 0 | 100.00% | 99.63% |
| | b | 1 | 16 | | |
| | a | 285 | 0 | 100.00% | 100.00% |
| | b | 0 | 1 | | |
| | a | 286 | 0 | 100.00% | |
| | b | 0 | 0 | | |
| | a | 221 | 1 | 98.41% | 99.10% |
| | b | 2 | 62 | | |
| | a | 285 | 0 | 100.00% | 100.00% |
| | b | 0 | 1 | | |
| | a | 280 | 0 | 100.00% | 100.00% |
| | b | 0 | 6 | | |
| | a | 284 | 0 | 100.00% | 100.00% |
| | b | 0 | 2 | | |
| | a | 269 | 0 | 100.00% | 100.00% |
| | b | 0 | 17 | | |
| All | a | 3591 | 1 | 99.19% | 99.92% |
| | b | 3 | 123 | | |
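The summary figures in the “All” row follow from the four allele counts by the standard definitions of sensitivity and specificity. The short Python check below treats the mutated allele b as the positive class and IMD as the reference; the function name is hypothetical, and returning `None` when a denominator is zero is a choice made here for illustration.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity); None where the denominator is zero."""
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Counts from the "All" row, with IMD as the reference and b as positive:
#   cfBEST b / IMD b = 123 (true positives)
#   cfBEST a / IMD b = 1   (false negatives)
#   cfBEST b / IMD a = 3   (false positives)
#   cfBEST a / IMD a = 3591 (true negatives)
sens, spec = sensitivity_specificity(tp=123, fn=1, fp=3, tn=3591)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
```

The four counts sum to 3718 alleles, which equals 143 pregnancies × 2 fetal alleles × 13 sites.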