
A simple method using Pyrosequencing(TM) to identify de novo SNPs in pooled DNA samples.

Yeong-Shin Lin, Fu-Guo Robert Liu, Tzi-Yuan Wang, Cheng-Tsung Pan, Wei-Ting Chang, Wen-Hsiung Li.

Abstract

A practical way to reduce the cost of surveying single-nucleotide polymorphisms (SNPs) in a large number of individuals is to measure allele frequencies in pooled DNA samples. Pyrosequencing(TM) has frequently been used for this application because the signals it generates are proportional to the amount of DNA template. The Pyrosequencing(TM) pyrogram is determined by the dispensing order of dNTPs, which is usually designed based on known SNPs to avoid asynchronistic extensions of heterozygous sequences. Consequently, using pyrogram signals to identify de novo SNPs in DNA pools has never been undertaken. In this study, we developed an algorithm to address this issue. With the sequence and pyrogram of the wild-type allele known in advance, we could use the pyrogram obtained from the pooled DNA sample to predict the sequence of the unknown mutant allele (de novo SNP) and estimate its allele frequency. Both computational simulation and experimental Pyrosequencing(TM) test results suggested that our method performs well. The web interface of our method is available at http://life.nctu.edu.tw/∼yslin/PSM/.


Year:  2010        PMID: 21131285      PMCID: PMC3061071          DOI: 10.1093/nar/gkq1249

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

In human genomes, single-nucleotide polymorphisms (SNPs) account for the majority of genetic variation and may therefore largely determine the differences among individuals. SNPs among human populations have been extensively explored in this decade (1,2). Their abundance and high potential for automation make them a powerful tool for identifying genetic factors, especially those contributing to complex disease susceptibility. However, SNP genotyping in a large number of individuals is still expensive and time consuming (3), so an efficient, low-cost method is important for large-scale SNP scoring. Applying current genotyping platforms to pooled DNA samples might be a practical solution (3), because allele frequencies in a group of individuals can be measured using far fewer reactions (4). DNA pooling combined with whole-genome analysis is usually considered the first step in identifying potential genetic markers for subsequent genotyping of individuals (5–7). Several genotyping methods suitable for measuring SNP frequencies in DNA pools have been proposed in the literature (3,8).

Pyrosequencing(TM), which was first described in 1988 (9), might be one of the most successful non-Sanger methods developed in the past two decades (10). Instead of using 3′-modified dNTPs to terminate DNA polymerization, Pyrosequencing(TM) adds dNTPs one at a time in limiting amounts to control DNA synthesis. The dNTPs are dispensed in a specific order. DNA polymerase extends the primer when a complementary dNTP is added and pauses when it encounters a noncomplementary base. DNA synthesis is reinitiated when the next complementary dNTP is added (10). As a nonfluorescence technique, Pyrosequencing(TM) measures the release of inorganic pyrophosphate, which is proportionally converted into visible light by a cascade of enzymatic reactions (11,12). The generated light is recorded as a series of peaks called a pyrogram, which represents the order of complementary dNTPs and implies the underlying DNA sequence (10).

Because the light generated by the Pyrosequencing(TM) reactions is proportional to the amount of DNA template, this technique has frequently been used to measure allelic gene expression (13,14) or allele frequency, including in tumor tissue (15), in parasites or microbial communities (16,17) and in DNA pools (18–22). Pyrosequencing(TM) has been recommended for allele frequency studies because of its high reliability in detecting variations between populations (23,24). 'Next-generation' sequencing technology, including array-based pyrosequencing (the 454 sequencing platform), has recently been applied to high-throughput resequencing and SNP genotyping (8,25). Although this strategy is powerful, its expense makes it less applicable when the research interest focuses only on specific genes in specific populations. At present, most clinical laboratories use the low-throughput Pyrosequencing(TM) platform to identify known alleles (among organisms, strains or SNPs) (26). In this study, 'Pyrosequencing(TM)' refers to this core technology and not to the array-based 454 sequencing platform.

No study has applied Pyrosequencing(TM) to de novo SNP discovery (10). This is because base-calling for de novo SNPs is difficult and is still performed manually (27). The Pyrosequencing(TM) pyrogram is determined by the dispensing order of dNTPs. To avoid asynchronistic extensions of heterozygous sequences, the dispensing order is usually carefully designed (10). Current sequencing software cannot detect new polymorphisms in pooled DNA samples (27), including in applications of multiplex genotyping techniques (27–30). In this study, we developed an algorithm based on the normality test and dynamic programming to automatically read the pyrogram profile when unexpected mutations occur. The performance of our method was evaluated using both computational simulation and experimental Pyrosequencing(TM) assays.

MATERIALS AND METHODS

The objective of our method is to use the pyrogram of a pooled DNA sample to estimate the frequency of the mutant allele in the sample and to predict its sequence. The sequence and pyrogram of the wild-type allele must be known in advance. The flowchart is shown in Figure 1.
Figure 1.

The flowchart of the algorithm developed in this study.


The expected pyrogram

To illustrate our method, we used a DNA fragment, GATCGGTTCACGTC, as an example and assumed that it is the wild-type allele. The Pyrosequencing(TM) dispensing order of dNTPs, GATCGTCACGTC, was designed to complement this DNA fragment. Figure 2A shows the pyrogram profile, W, for this wild-type fragment. The signal intensity for the nth dispensed dNTP in W is represented as $w_n$. To simulate real experiments, we defined the coefficient of variation (CV) as the standard deviation divided by the mean, and therefore obtained $w_n$: CV reflects the degree of precision of the Pyrosequencing(TM) experiments. In this example, we let CV = 0.5%.
Figure 2.

(A) The hypothetical pyrogram profile, W, for the wild-type DNA fragment, GATCGGTTCACGTC; (B) the hypothetical pyrogram profile, M, for the mutant allele, GAGCGGTTCACGTC; (C) the expected pyrogram profile, S, for the pooled DNA sample with 95% wild-type allele and 5% mutant allele (95% black bars + 5% white bars). All three pyrogram profiles were simulated under the same Pyrosequencing(TM) dispensing order of dNTPs, GATCGTCACGTC, with CV = 0.5%.

For a mutant allele with a thymine-to-guanine substitution at the third nucleotide, GAGCGGTTCACGTC, asynchronistic extensions would occur under the dispensing order of dNTPs described above. Figure 2B displays the pyrogram profile, M, for this mutant allele. Similarly, we could also obtain $m_n$: In this circumstance, for a pooled DNA sample with 95% wild-type allele and 5% mutant allele, the expected pyrogram profile, S, would be nonsynchronistic, as shown in Figure 2C. The pyrogram could be predicted using the equation

$$s_n = a\,w_n + (1 - a)\,m_n$$

where $s_n$ is the signal intensity at the nth dispensing site for S, and a represents the proportion of the wild-type allele in the DNA sample. In this example, a = 0.95.
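The example above can be reproduced with a short, noise-free sketch. It assumes that sequences are written in the orientation of the synthesized strand (so a dispensed base is incorporated when it equals the next base of the sequence) and that each incorporated base contributes one unit of signal; the function names `ideal_pyrogram` and `pooled_pyrogram` are illustrative and do not come from the original software.

```python
def ideal_pyrogram(sequence, dispensing_order):
    """Return the ideal (noise-free) peak height for each dispensed dNTP.

    A dispensed base extends the strand through every consecutive copy of
    itself at the current position; otherwise the peak is 0 and synthesis
    pauses until a matching base is dispensed.
    """
    peaks = []
    pos = 0  # index of the next base to be synthesized
    for base in dispensing_order:
        run = 0
        while pos < len(sequence) and sequence[pos] == base:
            run += 1
            pos += 1
        peaks.append(run)
    return peaks


def pooled_pyrogram(w, m, a):
    """Expected pooled signal s_n = a * w_n + (1 - a) * m_n."""
    return [a * wn + (1 - a) * mn for wn, mn in zip(w, m)]


ORDER = "GATCGTCACGTC"
W = ideal_pyrogram("GATCGGTTCACGTC", ORDER)  # wild-type allele
M = ideal_pyrogram("GAGCGGTTCACGTC", ORDER)  # T-to-G mutant at the third base
S = pooled_pyrogram(W, M, a=0.95)

print(W)  # [1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1]
print(M)  # [1, 1, 0, 0, 1, 0, 1, 0, 0, 2, 2, 1]
```

The zeros in `M` fall at dispensing sites 3, 4, 6, 8 and 9, which are the sites where the mutant strand pauses while the wild-type strand is extended.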

The pyrogram to be tested

Assume that we have two unknown pooled DNA samples to be tested: one is actually composed of 95% wild-type allele and 5% mutant allele, as in Figure 2C, while the other is composed of 100% wild-type allele, as in Figure 2A. Their pyrograms, $S_{blue}$ and $S_{red}$, respectively, were simulated with CV = 0.5% and are shown in Figure 3A. To distinguish $S_{blue}$ from $S_{red}$, we calculated the ratio profile, R:

$$r_n = \frac{s_n}{w_n}$$

The obtained $R_{blue}$ and $R_{red}$ are shown in Figure 3B. Note that pyrogram $S_{blue}$ has nonsynchronistic extensions. Therefore, when the nucleotide added during Pyrosequencing(TM) is not complementary to the mutant allele (for $S_{blue}$, n = 3, 4, 6, 8 and 9), a decreased signal is detected. For these dispensing sites, $r_n \approx a$; while for the other sites, $r_n > a$ because $m_n > 0$. As a result, the values of $R_{blue}$ would not be normally distributed. By contrast, the distribution of the values of $R_{red}$ should be normal, and $r_n \approx 1$. We performed the Shapiro–Wilk test (31) on the normality of R, and sorted the values of R in ascending order to obtain another profile, Q, with elements $q_1 \le q_2 \le \cdots \le q_x$, where x is the number of dispensing sites. The relative cumulative frequencies of $Q_{blue}$ and $Q_{red}$ are shown in Figure 3C.

When the normality of R is rejected, possible nonsynchronistic extensions are implied. We therefore constructed an expected cumulative normal distribution, E, with the same mean and standard deviation as Q, and compared Q with E. In our example, the blue circles and the blue crosses represent $Q_{blue}$ and $E_{blue}$, respectively (Figure 3C). As described above, for certain dispensing sites, $r_n \approx a$, which corresponds to a group of the smallest values of $Q_{blue}$. To estimate the value of $a_{blue}$, we looked for a variable i that can maximize $e_i - q_i$, and then found another variable j that can minimize . We then speculated that $a \approx q_j$. In our example, i = 5, j = 4, and (Figure 3C).
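The role of the ratio profile can be checked on the noise-free version of the example. The sketch below assumes the ratio is taken site by site as r_n = s_n / w_n, hard-codes the ideal peak heights of the two example alleles, and skips the Shapiro–Wilk test and the comparison with E entirely; it only shows that the smallest ratios sit at the true wild-type proportion and mark the paused sites.

```python
import math

# Ideal peak heights for the worked example (dispensing order GATCGTCACGTC).
W = [1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1]  # wild type GATCGGTTCACGTC
M = [1, 1, 0, 0, 1, 0, 1, 0, 0, 2, 2, 1]  # mutant GAGCGGTTCACGTC
A_TRUE = 0.95  # proportion of the wild-type allele in the pool

# Noise-free pooled pyrogram.
S = [A_TRUE * w + (1 - A_TRUE) * m for w, m in zip(W, M)]

# Ratio profile (assumed form): r_n = s_n / w_n.
R = [s / w for s, w in zip(S, W)]

# Sorted profile Q; without noise its smallest value equals a.
Q = sorted(R)
a_estimate = Q[0]

# Dispensing sites (1-based) whose ratio equals the smallest value: the sites
# where the dispensed dNTP did not extend the mutant allele.
paused_sites = [n for n, r in enumerate(R, start=1)
                if math.isclose(r, a_estimate)]

print(paused_sites)          # [3, 4, 6, 8, 9]
print(round(a_estimate, 4))  # 0.95
```

With experimental noise the smallest ratios scatter around a instead of coinciding with it, which is why the text estimates a from the sorted profile Q rather than from a single minimum.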
Figure 3.

(A) The blue bars represent the pyrogram, $S_{blue}$, of a pooled DNA sample composed of 95% wild-type allele and 5% mutant allele as in Figure 2C. The red bars represent the pyrogram, $S_{red}$, of a DNA sample composed of 100% wild-type allele. The two pyrogram profiles were simulated with CV = 0.5%. (B) The ratio profiles $R_{blue}$ and $R_{red}$. (C) The relative cumulative frequencies of profiles $Q_{blue}$ (blue circles) and $Q_{red}$ (red triangles). The blue crosses represent the expected cumulative normal distribution, $E_{blue}$, which has the same mean and standard deviation as $Q_{blue}$. See the main text for the details.

The sequence of the mutant allele

Because $a \approx q_j$, we used $q_j$ to construct another profile, T:

$$t_n = s_n - q_j\,w_n$$

The obtained $T_{blue}$ is shown in Figure 4A. T is basically proportional to M and can be used to infer it. However, it is inappropriate to read the sequence of the mutant allele directly from profile T, because its values are highly influenced by the coefficient of variation. Since profiles W and M can be perfectly aligned by adding gaps to W (Figure 2A and B), we used T in place of the unknown profile M and used dynamic programming to align W and T (Figure 4). The obtained alignment was then used to infer the sequence of the mutant allele.
Figure 4.

The alignment between (A) the profile $T_{blue}$, which is basically proportional to the unknown profile M, and (B) the profile W. See the main text for the details.

Before performing the dynamic programming, it is worth emphasizing several ad hoc properties of Pyrosequencing(TM):

We can only add gaps to profile W, because the dispensing order was designed to complement the wild-type DNA fragment.

The implied sequence of the mutant allele is the set of nucleotides in T that are aligned to nucleotides in W (skipping nucleotides in T that are aligned to the added gaps). In our example, the implied sequence is GAGCGGTTC according to the alignment result in Figure 4.

When one gap is added to W, the corresponding nucleotide in T is taken to be a dNTP that was added during Pyrosequencing(TM) but was noncomplementary to the mutant allele, so that the extension paused at that step. In our example (Figure 4), the third and fourth nucleotides in T (thymine and cytosine) are aligned to the gap in W. This alignment implies that neither thymine nor cytosine is complementary to the third nucleotide of the mutant allele.

When the gap added to W is elongated, the set of corresponding nucleotides in T cannot include all four dNTPs; otherwise, all four dNTPs would be noncomplementary to the next base of the mutant allele. In our example (Figure 4), for the first gap in W, only two dNTPs, thymine and cytosine, are included in the set of corresponding nucleotides in T.

When the extension is reinitiated, the added dNTP (the nucleotide in T that is aligned to the current nucleotide in W) must be complementary, and therefore cannot be one of the noncomplementary dNTPs that appear in the positions of T corresponding to the immediately preceding gap of W. In our example (Figure 4), when the extension is reinitiated following the first gap in W, the added complementary dNTP is guanine. This dNTP cannot be thymine or cytosine.

For the two sites flanking a gap, the corresponding nucleotides in T cannot be the same, because the second added dNTP should be noncomplementary to the first nucleotide. In our example (Figure 4), for the two sites flanking the first gap in W, the corresponding nucleotides in T are adenine and guanine.

It should be noted that the dynamic programming is performed only when the normality of profile R has been rejected, which implies possible nonsynchronistic extensions. The nonsynchronistic extensions could result from either substitutions or insertions in the mutant allele. On the other hand, mutations are rare: we do not expect to frequently discover a mutant allele with more than one de novo SNP in the short fragment. Therefore, the scoring scheme for the dynamic programming used in this study is defined as follows:

The match score: ; and are used to even the values of the two profiles.
The mismatch score: −∞
The gap penalty for profile W:
The gap penalty for profile T: −∞
One mismatch site with score or one gap inserted to profile W with penalty 0 is allowed.
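How the implied sequence and the gap sites relate to profile T can be illustrated on the noise-free example. This is a deliberate simplification and not the dynamic-programming alignment described above: it assumes T takes the form t_n = s_n − q·w_n, uses the true proportion 0.95 in place of the estimate q, and simply rounds each t_n to a whole number of incorporated bases. The helper `read_mutant` is hypothetical.

```python
ORDER = "GATCGTCACGTC"                     # dispensing order from the example
W = [1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1]  # ideal wild-type peaks
M = [1, 1, 0, 0, 1, 0, 1, 0, 0, 2, 2, 1]  # ideal mutant peaks (unknown in practice)
A = 0.95                                   # wild-type proportion; stands in for q

# Noise-free pooled pyrogram and the residual profile T (assumed form).
S = [A * w + (1 - A) * m for w, m in zip(W, M)]
T = [s - A * w for s, w in zip(S, W)]


def read_mutant(t, order, unit):
    """Return (implied sequence, 1-based paused sites) from a noise-free T.

    `unit` is the signal contributed by one incorporated mutant base.
    """
    sequence, paused = [], []
    for n, (tn, base) in enumerate(zip(t, order), start=1):
        copies = round(tn / unit)  # bases incorporated at this dispensing site
        if copies == 0:
            paused.append(n)       # site that would align to a gap in W
        sequence.append(base * copies)
    return "".join(sequence), paused


implied, paused = read_mutant(T, ORDER, unit=1 - A)
print(implied)  # GAGCGGTTC
print(paused)   # [3, 4, 6, 8, 9]
```

Rounding works here only because the signal is noise-free. With a realistic CV and a 5% mutant fraction the per-site values of T are dominated by noise, which is the reason the text aligns T to W under the constraints listed above instead of reading T directly.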

The estimated proportion of the wild-type allele in the pooled DNA sample

In the previous example, we assumed that the DNA quantities used for the pyrograms W and S are the same. However, this may not always hold. We therefore introduced another parameter, c, to represent the DNA quantity ratio: Similar to previous sections, we speculated that . We could also obtain two equations: Although is unknown, we could use the alignment result to infer it. Assume that there are x elements in the pyrogram W, and y of them are aligned to profile T, which suggests that there are (x – y) gap sites in the alignment. We could speculate that . Therefore, the proportion of the wild-type allele in the pooled DNA sample was estimated as Considering that in some cases the predicted mutant alleles may be derived from insertions, for example, an insertion at site z, we modified the equation as follows for these alleles:

The position of the mutation site

It should be noted that the value of i, which maximizes $e_i - q_i$, depends on the position of the mutation site. When the mutation site is located close to the end of the pyrogram, the value of i (and the ratio of i to x) is small. In this circumstance, the normality of profile R may not be rejected, because the signals of nonsynchronistic extensions are likely to be diluted. To overcome this problem, we tested the normality in a sliding window, with the window size set to 30 in our study. As the window slides, if normality is rejected for a certain window, we use this window and its downstream pyrogram to derive the profile Q and the variables i, j and q.

Performance testing by computational simulation

We used simulation tests to evaluate the performance of our algorithm. The tested DNA fragments are listed below:

ACACCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCAGCCTCAC: the wild-type allele;
ACGCCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCAGCCTCAC: the mutant allele with an adenosine-to-guanine substitution at the third nucleotide;
ACAGCCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCAGCCTCAC: the mutant allele with a guanine inserted between the third and fourth nucleotides;
ACACCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCATCCTCAC: the mutant allele with a guanine-to-thymine substitution at the 37th nucleotide; and
ACACCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCAGCCACAC: the mutant allele with a thymine-to-adenosine substitution at the 40th nucleotide.

The Pyrosequencing(TM) dispensing order of dNTPs, ACACAGTCGTGTCACAGTGCTAGTCGCAGCTCAC, was designed to complement the wild-type allele. The tested DNA pools contained 0%, 1%, 2%, 4%, 8%, 16%, 32% or 64% mutant allele. The pyrograms of these pooled DNA samples were simulated with different degrees of experimental precision (CV = 0.01%, 0.02%, 0.04%, 0.08%, 0.16%, 0.32%, 0.64%, 1.28%, 2.56%, 5.12%, 10.24% and 20.48%). When the normality of profile R was rejected (P < 0.01, Shapiro–Wilk test), dynamic programming was performed to infer the sequence of the mutant allele; otherwise, no mutant allele was inferred. If the inferred sequence of the mutant allele was identical to the wild type (except for the last few nucleotides, which may not be well aligned when CV is high), no mutant allele was inferred either. The simulation tests were repeated 10 000 times. If our method positively identified a mutant allele, we estimated the proportion of the wild-type allele in the DNA pool, regardless of whether the inferred sequence was correct. The mean and standard deviation of the estimated proportion of the wild-type allele in the DNA pool were then calculated.
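The dispensing order quoted for the simulation is consistent with collapsing every homopolymer run of the wild-type allele into a single dispensation. The sketch below checks this for both wild-type fragments used in the text; `dispensing_order` is an illustrative helper, and the rule it implements is inferred from the two examples rather than stated by the authors.

```python
from itertools import groupby


def dispensing_order(wild_type):
    """Collapse each run of identical bases into one dispensed dNTP."""
    return "".join(base for base, _run in groupby(wild_type))


SIMULATED_WILD_TYPE = "ACACCAAGTCGTGTTCACAGTGGCTAAGTTCCGCCAGCCTCAC"
EXAMPLE_WILD_TYPE = "GATCGGTTCACGTC"

print(dispensing_order(SIMULATED_WILD_TYPE))
# ACACAGTCGTGTCACAGTGCTAGTCGCAGCTCAC
print(dispensing_order(EXAMPLE_WILD_TYPE))
# GATCGTCACGTC
```

Under such an order every dispensation extends the wild-type strand, so any dispensing site with a reduced pooled signal points to a template that has fallen out of phase.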

Performance testing by real Pyrosequencing(TM)

We first used a real Pyrosequencing(TM) assay as an example. The DNA samples were obtained from the mitochondrial cytochrome b gene of Pseudorasbora parva specimens. The test region was amplified using a specific primer pair: forward – GTGTGAAGTTGTCGGGGTCT; reverse – CCGCAACGGTTATCCATCTT. A biotin tag was attached to the reverse primer. Polymerase chain reaction (PCR) was conducted using Taq DNA polymerase (Biokit Biotechnology, Taiwan) in a reaction mixture containing 25 ng of DNA template, 100 nM of biotin-labeled reverse primer and 100 nM of the forward primer. The PCR cycling program consisted of denaturation at 94°C for 1 min; followed by 40 cycles of denaturation at 94°C for 20 s, annealing at 60°C for 20 s and extension at 72°C for 15 s; and a final extension at 72°C for 7 min. PCR products were purified with a PCR clean-up kit (Biokit Biotechnology). The pooled DNA sample contained 90% PCR products of one allele (CCTAACAGGTTAGGGGAAAATAGCGCTAGAGATGTAAGGGCCAACAATATTAATACAAAGCCAAGAAGGTCTTTGT for the first 76 bases) as the wild type and 10% PCR products of another allele with a cytosine-to-thymine substitution at the 6th nucleotide (CCTAATAGGTTAGGGGAAAATAGCGCT for the first 27 bases) as the mutant allele. The concentrations of the DNA samples were measured using an ND-1000 (Nanodrop Technologies, Wilmington, DE, USA) at OD260. Biotinylated single-stranded DNA in 40 µl PCR solution containing 600 ng pooled DNA samples and the forward primer were used for the Pyrosequencing(TM) reaction, which was performed in accordance with the manufacturer's instructions (www.pyrosequencing.com) using Pyro Gold SQA Reagents (Qiagen, Hilden, Germany) on a PyroMark ID instrument (Biotage AB, Uppsala, Sweden).

To reveal how practical our method is in real experiments, another large-scale Pyrosequencing(TM) assay was conducted. A partial region of the YBR114W gene was amplified from two yeast strains, BY4741 (BY, a laboratory strain) and RM11-1a (RM, a wild strain), with a specific primer pair: forward – AAGCAAAGTATTGTTAGCCGTCTA; reverse – ATCCAGCTCTTTTCAATCTCC. A biotin tag was again attached to the reverse primer. Another forward sequencing primer, GCCGTCTAAACATGAGT, was used for the Pyrosequencing(TM) reaction. The sequences to be read in the Pyrosequencing(TM) reactions for BY and RM are GGCAAGTGGCAATCATCAACGAAAATCGAAGCACT and GGTAAGTGGCAATCATCAACGAAAATCGAAGCACT, respectively; they differ by a cytosine-to-thymine substitution at the third nucleotide. We prepared the wild-type sample using 100% RM and the unknown pooled DNA sample using 90% RM + 10% BY. Each sample was run 12 times, so 144 sample pairs could be obtained. The derived pyrograms are presented in Supplementary Data.

RESULTS AND DISCUSSION

The simulation results are listed in Tables 1 and 2. When the variation in the pyrogram signals was limited (the level of precision was high), e.g. CV < 0.1%, our method could in most cases perfectly predict the DNA sequence of the mutant allele, whether a substitution or an insertion, and its proportion in the DNA pools. However, when the signal variation was high (the level of precision was low), the predictive power of our method decreased as the proportion of the mutant allele in the DNA pool decreased. For example, in Table 1, when CV = 2.56%, we accurately estimated the proportion of the mutant allele (with one substitution at the third nucleotide) when its real proportion was 16% (estimated as 16.00 ± 2.87%); however, when the real proportion decreased to 1%, our method tended to overestimate it (3.32 ± 2.69%). Similarly, in Table 2, when CV = 2.56%, we accurately predicted the sequence of the mutant allele (with one substitution at the third nucleotide) in all 10 000 repeats when its proportion in the DNA pool was 32%; however, when the real proportion decreased to 1%, we identified a mutant allele in only 507 of the 10 000 repeats, and only nine of these had their sequence accurately predicted. Note that the standard deviation of the estimated allele frequencies also increased with CV (Table 1). These results suggest that the performance of our method depends strongly on the variation in the pyrogram signals (the level of experimental precision) and on the proportion of the mutant allele in the DNA pool. We also examined the possibility of falsely predicting the existence of a mutant allele in a DNA pool consisting of 100% wild-type allele. The false positive ratio was <5% when CV < 5% (Table 2). Moreover, even in these cases, the estimated proportion of the wild-type allele in the DNA pool did not deviate much from 100% when the signal variation was limited (Table 1).
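The percentages quoted in this paragraph follow directly from the table entries: Table 1 reports the wild-type proportion, so the mutant proportion is its complement, and Table 2 reports counts out of 10 000 repeats. A small arithmetic check, with illustrative helper names, is:

```python
def mutant_percent(wild_type_mean, wild_type_sd):
    """Convert a Table 1 entry (wild-type proportion) to mutant percent."""
    mean = round((1 - wild_type_mean) * 100, 2)
    sd = round(wild_type_sd * 100, 2)
    return mean, sd


def false_positive_percent(calls, repeats=10_000):
    """Share of pure wild-type pools in which a mutant allele was called."""
    return round(calls / repeats * 100, 2)


# Table 1, substitution at the third nucleotide, CV = 2.56%.
print(mutant_percent(0.8400, 0.0287))  # (16.0, 2.87), true mutant share 16%
print(mutant_percent(0.9668, 0.0269))  # (3.32, 2.69), true mutant share 1%

# Table 2, same allele, column a = 1.00.
print(false_positive_percent(445))  # 4.45 at CV = 2.56%
print(false_positive_percent(541))  # 5.41 at CV = 5.12%
```

The last two values bracket the 5% level mentioned in the text: the false positive share stays below 5% up to CV = 2.56% and exceeds it at CV = 5.12%.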
Table 1.

The estimated proportion of the wild-type allele in the DNA pool under various simulated conditions

Values are the mean ± standard deviation of the estimated proportion of the wild-type allele in the DNA pool.

CV | a = 1.00 | a = 0.99 | a = 0.98 | a = 0.96 | a = 0.92 | a = 0.84 | a = 0.68 | a = 0.36

Mutant allele with an adenosine-to-guanine substitution at the third nucleotide
0.01% | 0.9999 ± 0.0001 | 0.9900 ± 0.0001 | 0.9800 ± 0.0001 | 0.9600 ± 0.0001 | 0.9200 ± 0.0001 | 0.8400 ± 0.0001 | 0.6800 ± 0.0001 | 0.3600 ± 0.0001
0.02% | 0.9997 ± 0.0002 | 0.9900 ± 0.0003 | 0.9800 ± 0.0003 | 0.9600 ± 0.0003 | 0.9200 ± 0.0002 | 0.8400 ± 0.0002 | 0.6800 ± 0.0002 | 0.3600 ± 0.0001
0.04% | 0.9994 ± 0.0004 | 0.9900 ± 0.0006 | 0.9800 ± 0.0006 | 0.9600 ± 0.0005 | 0.9200 ± 0.0005 | 0.8400 ± 0.0004 | 0.6800 ± 0.0003 | 0.3600 ± 0.0002
0.08% | 0.9988 ± 0.0009 | 0.9900 ± 0.0011 | 0.9800 ± 0.0011 | 0.9600 ± 0.0011 | 0.9200 ± 0.0010 | 0.8400 ± 0.0009 | 0.6800 ± 0.0007 | 0.3600 ± 0.0004
0.16% | 0.9977 ± 0.0018 | 0.9899 ± 0.0024 | 0.9800 ± 0.0022 | 0.9600 ± 0.0021 | 0.9200 ± 0.0020 | 0.8400 ± 0.0018 | 0.6800 ± 0.0013 | 0.3600 ± 0.0008
0.32% | 0.9954 ± 0.0036 | 0.9898 ± 0.0059 | 0.9798 ± 0.0049 | 0.9599 ± 0.0042 | 0.9200 ± 0.0040 | 0.8400 ± 0.0035 | 0.6800 ± 0.0026 | 0.3600 ± 0.0017
0.64% | 0.9910 ± 0.0070 | 0.9881 ± 0.0086 | 0.9796 ± 0.0117 | 0.9598 ± 0.0092 | 0.9199 ± 0.0079 | 0.8399 ± 0.0069 | 0.6799 ± 0.0052 | 0.3600 ± 0.0033
1.28% | 0.9822 ± 0.0138 | 0.9830 ± 0.0142 | 0.9763 ± 0.0170 | 0.9593 ± 0.0229 | 0.9196 ± 0.0167 | 0.8401 ± 0.0138 | 0.6799 ± 0.0105 | 0.3599 ± 0.0066
2.56% | 0.9654 ± 0.0275 | 0.9668 ± 0.0269 | 0.9652 ± 0.0283 | 0.9541 ± 0.0340 | 0.9206 ± 0.0449 | 0.8400 ± 0.0287 | 0.6802 ± 0.0213 | 0.3599 ± 0.0132
5.12% | 0.9352 ± 0.0546 | 0.9336 ± 0.0528 | 0.9363 ± 0.0532 | 0.9335 ± 0.0565 | 0.9111 ± 0.0675 | 0.8437 ± 0.0837 | 0.6801 ± 0.0457 | 0.3597 ± 0.0267
10.24% | 0.8807 ± 0.1069 | 0.8871 ± 0.1073 | 0.8862 ± 0.1071 | 0.8848 ± 0.1076 | 0.8784 ± 0.1126 | 0.8395 ± 0.1387 | 0.6919 ± 0.1505 | 0.3603 ± 0.0526
20.48% | 0.8315 ± 0.2466 | 0.8314 ± 0.2546 | 0.8280 ± 0.2627 | 0.8282 ± 0.2467 | 0.8283 ± 0.2743 | 0.8130 ± 0.2543 | 0.7490 ± 0.3271 | 0.3464 ± 4.3826

Mutant allele with a guanine inserted between the third and fourth nucleotides
0.01% | 0.9999 ± 0.0001 | 0.9900 ± 0.0001 | 0.9800 ± 0.0001 | 0.9600 ± 0.0001 | 0.9200 ± 0.0001 | 0.8400 ± 0.0001 | 0.6800 ± 0.0001 | 0.3600 ± 0.0001
0.02% | 0.9998 ± 0.0002 | 0.9900 ± 0.0003 | 0.9800 ± 0.0003 | 0.9600 ± 0.0003 | 0.9200 ± 0.0002 | 0.8400 ± 0.0002 | 0.6800 ± 0.0002 | 0.3600 ± 0.0001
0.04% | 0.9997 ± 0.0005 | 0.9900 ± 0.0005 | 0.9800 ± 0.0005 | 0.9600 ± 0.0005 | 0.9200 ± 0.0005 | 0.8400 ± 0.0004 | 0.6800 ± 0.0003 | 0.3600 ± 0.0002
0.08% | 0.9994 ± 0.0010 | 0.9900 ± 0.0011 | 0.9800 ± 0.0010 | 0.9600 ± 0.0010 | 0.9200 ± 0.0010 | 0.8400 ± 0.0008 | 0.6800 ± 0.0006 | 0.3600 ± 0.0004
0.16% | 0.9986 ± 0.0019 | 0.9899 ± 0.0023 | 0.9801 ± 0.0021 | 0.9600 ± 0.0020 | 0.9200 ± 0.0019 | 0.8400 ± 0.0017 | 0.6800 ± 0.0013 | 0.3600 ± 0.0008
0.32% | 0.9974 ± 0.0037 | 0.9903 ± 0.0059 | 0.9798 ± 0.0044 | 0.9601 ± 0.0041 | 0.9200 ± 0.0038 | 0.8399 ± 0.0034 | 0.6800 ± 0.0026 | 0.3600 ± 0.0016
0.64% | 0.9947 ± 0.0075 | 0.9888 ± 0.0089 | 0.9811 ± 0.0115 | 0.9597 ± 0.0093 | 0.9199 ± 0.0076 | 0.8399 ± 0.0067 | 0.6800 ± 0.0051 | 0.3600 ± 0.0032
1.28% | 0.9892 ± 0.0157 | 0.9867 ± 0.0159 | 0.9789 ± 0.0176 | 0.9615 ± 0.0220 | 0.9195 ± 0.0169 | 0.8400 ± 0.0134 | 0.6799 ± 0.0102 | 0.3600 ± 0.0063
2.56% | 0.9808 ± 0.0296 | 0.9756 ± 0.0289 | 0.9697 ± 0.0284 | 0.9580 ± 0.0342 | 0.9251 ± 0.0453 | 0.8393 ± 0.0291 | 0.6797 ± 0.0205 | 0.3601 ± 0.0128
5.12% | 0.9502 ± 0.0547 | 0.9538 ± 0.0560 | 0.9508 ± 0.0544 | 0.9413 ± 0.0606 | 0.9254 ± 0.0773 | 0.8492 ± 0.0863 | 0.6801 ± 0.0477 | 0.3599 ± 0.0257
10.24% | 0.9170 ± 0.1206 | 0.9188 ± 0.1193 | 0.9223 ± 0.1269 | 0.9144 ± 0.1222 | 0.9006 ± 0.1369 | 0.8573 ± 0.1565 | 0.7043 ± 0.1812 | 0.3556 ± 0.5902
20.48% | 0.8898 ± 0.3129 | 0.8969 ± 0.3699 | 0.9005 ± 0.5321 | 0.8804 ± 0.3253 | 0.8773 ± 0.3134 | 0.8779 ± 0.5953 | 0.7953 ± 1.6143 | 0.3580 ± 3.9309

Mutant allele with a guanine-to-thymine substitution at the 37th nucleotide
0.01% | 0.9999 ± 0.0001 | 0.9900 ± 0.0001 | 0.9800 ± 0.0001 | 0.9600 ± 0.0001 | 0.9200 ± 0.0001 | 0.8400 ± 0.0001 | 0.6800 ± 0.0001 | 0.3600 ± 0.0000
0.02% | 0.9998 ± 0.0002 | 0.9900 ± 0.0002 | 0.9800 ± 0.0002 | 0.9600 ± 0.0002 | 0.9200 ± 0.0002 | 0.8400 ± 0.0002 | 0.6800 ± 0.0001 | 0.3600 ± 0.0001
0.04% | 0.9997 ± 0.0005 | 0.9900 ± 0.0004 | 0.9800 ± 0.0004 | 0.9600 ± 0.0003 | 0.9200 ± 0.0003 | 0.8400 ± 0.0003 | 0.6800 ± 0.0003 | 0.3600 ± 0.0002
0.08% | 0.9993 ± 0.0009 | 0.9900 ± 0.0007 | 0.9800 ± 0.0007 | 0.9600 ± 0.0007 | 0.9200 ± 0.0007 | 0.8400 ± 0.0006 | 0.6800 ± 0.0005 | 0.3600 ± 0.0004
0.16% | 0.9985 ± 0.0018 | 0.9900 ± 0.0015 | 0.9800 ± 0.0015 | 0.9600 ± 0.0014 | 0.9200 ± 0.0014 | 0.8400 ± 0.0012 | 0.6800 ± 0.0010 | 0.3600 ± 0.0008
0.32% | 0.9975 ± 0.0037 | 0.9900 ± 0.0047 | 0.9800 ± 0.0031 | 0.9601 ± 0.0028 | 0.9200 ± 0.0027 | 0.8400 ± 0.0024 | 0.6800 ± 0.0020 | 0.3600 ± 0.0016
0.64% | 0.9948 ± 0.0076 | 0.9916 ± 0.0085 | 0.9803 ± 0.0096 | 0.9598 ± 0.0058 | 0.9201 ± 0.0054 | 0.8400 ± 0.0049 | 0.6801 ± 0.0041 | 0.3600 ± 0.0032
1.28% | 0.9885 ± 0.0144 | 0.9881 ± 0.0146 | 0.9853 ± 0.0170 | 0.9595 ± 0.0172 | 0.9198 ± 0.0111 | 0.8401 ± 0.0098 | 0.6799 ± 0.0083 | 0.3600 ± 0.0065
2.56% | 0.9774 ± 0.0296 | 0.9774 ± 0.0276 | 0.9741 ± 0.0277 | 0.9700 ± 0.0342 | 0.9173 ± 0.0304 | 0.8395 ± 0.0198 | 0.6799 ± 0.0164 | 0.3601 ± 0.0129
5.12% | 0.9531 ± 0.0535 | 0.9590 ± 0.0577 | 0.9552 ± 0.0542 | 0.9533 ± 0.0567 | 0.9367 ± 0.0640 | 0.8463 ± 0.0670 | 0.6798 ± 0.0329 | 0.3600 ± 0.0255
10.24% | 0.9108 ± 0.1205 | 0.9112 ± 0.1175 | 0.9142 ± 0.1161 | 0.9162 ± 0.1199 | 0.9059 ± 0.1164 | 0.8901 ± 0.1305 | 0.6954 ± 0.1258 | 0.3595 ± 0.0516
20.48% | 0.8806 ± 0.3044 | 0.8952 ± 0.3788 | 0.8890 ± 0.3120 | 0.8957 ± 0.3257 | 0.8863 ± 0.3063 | 0.8816 ± 0.3128 | 0.8718 ± 0.3127 | 0.5032 ± 0.4090

Mutant allele with a thymine-to-adenosine substitution at the 40th nucleotide
0.01% | 0.9999 ± 0.0001 | 1.0008 ± 0.0003 | 1.0015 ± 0.0005 | 1.0031 ± 0.0011 | 1.0061 ± 0.0021 | 1.0124 ± 0.0043 | 1.0250 ± 0.0088 | 1.0518 ± 0.0187
0.02% | 0.9998 ± 0.0002 | 1.0008 ± 0.0003 | 1.0015 ± 0.0006 | 1.0030 ± 0.0011 | 1.0061 ± 0.0021 | 1.0123 ± 0.0043 | 1.0248 ± 0.0088 | 1.0511 ± 0.0185
0.04% | 0.9997 ± 0.0005 | 1.0008 ± 0.0006 | 1.0015 ± 0.0007 | 1.0031 ± 0.0012 | 1.0061 ± 0.0022 | 1.0124 ± 0.0044 | 1.0250 ± 0.0088 | 1.0515 ± 0.0186
0.08% | 0.9992 ± 0.0009 | 1.0011 ± 0.0013 | 1.0016 ± 0.0012 | 1.0031 ± 0.0014 | 1.0061 ± 0.0023 | 1.0124 ± 0.0044 | 1.0251 ± 0.0089 | 1.0515 ± 0.0187
0.16% | 0.9985 ± 0.0018 | 1.0003 ± 0.0035 | 1.0022 ± 0.0025 | 1.0033 ± 0.0025 | 1.0062 ± 0.0027 | 1.0124 ± 0.0047 | 1.0252 ± 0.0091 | 1.0516 ± 0.0188
0.32% | 0.9974 ± 0.0038 | 0.9974 ± 0.0061 | 1.0004 ± 0.0070 | 1.0044 ± 0.0051 | 1.0066 ± 0.0047 | 1.0124 ± 0.0055 | 1.0250 ± 0.0095 | 1.0517 ± 0.0191
0.64% | 0.9947 ± 0.0073 | 0.9939 ± 0.0090 | 0.9948 ± 0.0124 | 1.0010 ± 0.0139 | 1.0089 ± 0.0098 | 1.0132 ± 0.0093 | 1.0249 ± 0.0109 | 1.0513 ± 0.0201
1.28% | 0.9881 ± 0.0141 | 0.9873 ± 0.0154 | 0.9869 ± 0.0187 | 0.9908 ± 0.0246 | 1.0037 ± 0.0269 | 1.0178 ± 0.0199 | 1.0263 ± 0.0177 | 1.0517 ± 0.0233
2.56% | 0.9757 ± 0.0298 | 0.9751 ± 0.0299 | 0.9751 ± 0.0285 | 0.9710 ± 0.0331 | 0.9808 ± 0.0487 | 1.0100 ± 0.0497 | 1.0358 ± 0.0380 | 1.0547 ± 0.0376
5.12% | 0.9578 ± 0.0584 | 0.9507 ± 0.0559 | 0.9510 ± 0.0544 | 0.9506 ± 0.0580 | 0.9489 ± 0.0625 | 0.9658 ± 0.0896 | 1.0293 ± 0.0910 | 1.0742 ± 0.0820
10.24% | 0.9179 ± 0.1261 | 0.9043 ± 0.1042 | 0.9236 ± 0.1298 | 0.9050 ± 0.1095 | 0.9180 ± 0.1220 | 0.9013 ± 0.1217 | 0.9420 ± 0.1700 | 1.0736 ± 0.1752
20.48% | 0.8923 ± 0.3268 | 0.8918 ± 0.3525 | 0.8920 ± 0.3403 | 0.8880 ± 0.3378 | 0.8931 ± 0.3038 | 0.8825 ± 0.2935 | 0.8715 ± 0.2986 | 0.9362 ± 0.3909

a indicates the real proportion of the wild-type allele in the DNA pool.

Table 2.

The accuracy of the mutant allele identification in the DNA pool under various simulated conditions

True positive / positive
CV | a = 1.00 | a = 0.99 | a = 0.98 | a = 0.96 | a = 0.92 | a = 0.84 | a = 0.68 | a = 0.36
Mutant allele with an adenosine-to-guanine substitution at the third nucleotide
0.01% | – / 387 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.02% | – / 370 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.04% | – / 380 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.08% | – / 401 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.16% | – / 371 | 9242 / 9504 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.32% | – / 363 | 1932 / 3057 | 9304 / 9568 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.64% | – / 363 | 209 / 1006 | 1978 / 3087 | 9436 / 9676 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
1.28% | – / 401 | 31 / 609 | 234 / 1084 | 2224 / 3311 | 9616 / 9801 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
2.56% | – / 445 | 9 / 507 | 38 / 718 | 285 / 1236 | 2742 / 3922 | 9818 / 9939 | 10 000 / 10 000 | 10 000 / 10 000
5.12% | – / 541 | 5 / 634 | 11 / 690 | 67 / 995 | 452 / 1697 | 3801 / 5166 | 9932 / 9997 | 10 000 / 10 000
10.24% | – / 1099 | 3 / 1158 | 10 / 1205 | 35 / 1391 | 117 / 1758 | 820 / 2856 | 6132 / 7503 | 9969 / 10 000
20.48% | – / 3287 | 15 / 3303 | 27 / 3400 | 45 / 3565 | 95 / 3909 | 361 / 4430 | 2058 / 5874 | 8086 / 9374
Mutant allele with a guanine inserted between the third and fourth nucleotides
0.01% | – / 366 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.02% | – / 368 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.04% | – / 354 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.08% | – / 378 | 9995 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.16% | – / 366 | 8604 / 8882 | 9998 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.32% | – / 381 | 1351 / 2389 | 8667 / 8936 | 9994 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.64% | – / 384 | 166 / 946 | 1425 / 2385 | 8894 / 9138 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
1.28% | – / 358 | 19 / 604 | 181 / 1042 | 1571 / 2610 | 9184 / 9396 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
2.56% | – / 424 | 6 / 547 | 21 / 698 | 219 / 1188 | 1971 / 3104 | 9600 / 9793 | 10 000 / 10 000 | 10 000 / 10 000
5.12% | – / 573 | 0 / 663 | 7 / 699 | 31 / 903 | 260 / 1499 | 2829 / 4125 | 9856 / 9987 | 10 000 / 10 000
10.24% | – / 1032 | 1 / 1079 | 3 / 1221 | 15 / 1389 | 79 / 1795 | 621 / 2765 | 5134 / 6494 | 9934 / 10 000
20.48% | – / 3231 | 4 / 3288 | 6 / 3432 | 16 / 3553 | 32 / 3875 | 208 / 4513 | 1613 / 5737 | 7559 / 8853
Mutant allele with a guanine-to-thymine substitution at the 37th nucleotide
0.01% | – / 402 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.02% | – / 382 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.04% | – / 394 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.08% | – / 393 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.16% | – / 385 | 8867 / 8944 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.32% | – / 427 | 918 / 1152 | 9005 / 9060 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
0.64% | – / 392 | 77 / 386 | 855 / 1091 | 9050 / 9107 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
1.28% | – / 394 | 24 / 397 | 76 / 401 | 869 / 1056 | 9220 / 9256 | 10 000 / 10 000 | 10 000 / 10 000 | 10 000 / 10 000
2.56% | – / 452 | 3 / 443 | 13 / 429 | 60 / 385 | 868 / 1026 | 9480 / 9497 | 10 000 / 10 000 | 10 000 / 10 000
5.12% | – / 552 | 3 / 529 | 3 / 576 | 17 / 521 | 99 / 504 | 947 / 1112 | 9767 / 9775 | 10 000 / 10 000
10.24% | – / 1048 | 2 / 1058 | 5 / 1054 | 11 / 1052 | 43 / 1111 | 160 / 963 | 1145 / 1332 | 9820 / 9824
20.48% | – / 3227 | 8 / 3282 | 9 / 3461 | 10 / 3260 | 31 / 3439 | 99 / 3428 | 480 / 3043 | 1901 / 2481
Mutant allele with a thymine-to-adenosine substitution at the 40th nucleotide
0.01% | – / 367 | 111 / 10 000 | 111 / 10 000 | 95 / 10 000 | 82 / 10 000 | 96 / 10 000 | 116 / 10 000 | 97 / 10 000
0.02% | – / 372 | 100 / 10 000 | 112 / 10 000 | 99 / 10 000 | 98 / 10 000 | 96 / 10 000 | 126 / 10 000 | 122 / 10 000
0.04% | – / 379 | 110 / 10 000 | 109 / 10 000 | 118 / 10 000 | 100 / 10 000 | 117 / 10 000 | 91 / 10 000 | 110 / 10 000
0.08% | – / 384 | 115 / 10 000 | 121 / 10 000 | 123 / 10 000 | 99 / 10 000 | 109 / 10 000 | 92 / 10 000 | 118 / 10 000
0.16% | – / 379 | 934 / 7882 | 126 / 10 000 | 111 / 10 000 | 120 / 10 000 | 112 / 10 000 | 114 / 10 000 | 103 / 10 000
0.32% | – / 358 | 386 / 1438 | 921 / 7837 | 100 / 10 000 | 105 / 10 000 | 117 / 10 000 | 88 / 10 000 | 111 / 10 000
0.64% | – / 382 | 99 / 566 | 423 / 1465 | 929 / 7948 | 106 / 10 000 | 120 / 10 000 | 110 / 10 000 | 117 / 10 000
1.28% | – / 401 | 37 / 418 | 105 / 541 | 373 / 1460 | 806 / 8201 | 96 / 10 000 | 99 / 10 000 | 104 / 10 000
2.56% | – / 403 | 21 / 468 | 50 / 484 | 109 / 566 | 421 / 1523 | 663 / 8525 | 105 / 10 000 | 87 / 10 000
5.12% | – / 566 | 29 / 572 | 35 / 564 | 43 / 559 | 122 / 677 | 400 / 1592 | 479 / 9017 | 121 / 10 000
10.24% | – / 1083 | 30 / 1100 | 40 / 1063 | 58 / 1109 | 81 / 1092 | 184 / 1182 | 424 / 2003 | 253 / 9156
20.48% | – / 3238 | 62 / 3344 | 64 / 3271 | 86 / 3374 | 102 / 3374 | 183 / 3520 | 403 / 3644 | 550 / 4127

Positive: the total number of simulation repeats that positively identified a mutant allele in the DNA pool.

True positive: the number of simulation repeats that correctly identified the mutant allele.

a indicates the real proportion of the wild-type allele in the DNA pool.
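Each cell of Table 2 can be turned into two rates, as in the minimal sketch below. The names `precision` and `sensitivity` are labels chosen here for illustration and are not used in the paper, and the figure of 10 000 simulation repeats per condition is an assumption inferred from the largest counts in the table.

```python
import math


def detection_metrics(true_positive: int, positive: int, repeats: int = 10_000):
    """Summarise one 'true positive / positive' cell of Table 2.

    repeats is assumed to be 10 000 (inferred from the table, not stated here).
    """
    # Share of positive calls that identified the correct mutant allele.
    precision = true_positive / positive if positive > 0 else math.nan
    # Share of all simulation repeats in which the mutant allele was recovered.
    sensitivity = true_positive / repeats
    return precision, sensitivity


# CV = 0.32%, a = 0.99, adenosine-to-guanine substitution at the third nucleotide.
precision, sensitivity = detection_metrics(1932, 3057)

# Same CV with a = 1.00 (no mutant present): every positive call is a false call.
false_call_rate = 363 / 10_000

print(f"precision={precision:.3f} sensitivity={sensitivity:.4f} "
      f"false_call_rate={false_call_rate:.4f}")
```

Read this way, the a = 1.00 column gives the rate of false calls when no mutant allele is present, while the other columns show how quickly recovery of the mutant degrades as the CV grows.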

Since sufficient signals of asynchronistic extensions are crucial for our algorithm, one might argue that it would be difficult to identify a mutant allele whose mutant site is located close to the end of the pyrogram. Our simulation revealed that, when the substitution was located at the 40th nucleotide, our algorithm had almost no identification power (Tables 1 and 2) because the generated profile R had only two sites with . In this circumstance, it was difficult to obtain a reasonable i, and likewise the variables j and q. We were therefore unable to align the profiles correctly and predict the mutant sequence. However, when the substitution was located at the 37th nucleotide instead (with four sites ), our algorithm performed almost as well as when the substitution was located at the third nucleotide (Tables 1 and 2). This result suggests that our method should be widely applicable.

We also performed real Pyrosequencing™ assays to show how our algorithm works. In our first example (Figure 5), the mitochondrial cytochrome b gene of P. parva was used. Figure 5A and B display the pyrograms for the wild-type DNA fragment and the pooled DNA sample containing 10% mutant allele, respectively. Although these two pyrograms are not easy to distinguish by eye, our algorithm successfully identified the sequence of the mutant allele (Figure 5D and E) and estimated its proportion in the DNA pool as 12.0%. The deviation of this estimate from the true 10% is likely due to variation in the pyrogram signals. This variation can be seen in the constructed profile T in Figure 5D. According to the Pyrosequencing™ dispensing order of dNTPs and the sequence of the mutant allele, the 29th–39th and 42nd–45th sites were expected to show no signal; however, unexpectedly high values (due to the signal variation) appeared at some of these sites (Figure 5D). Our dynamic programming overcame this difficulty by considering the ad hoc nature of Pyrosequencing™. We were therefore able to align the profiles T and W correctly and to predict the sequence of the mutant allele (Figure 5D and E).
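The link between a dispensing order, a sequence and the sites expected to carry no signal can be illustrated with an idealised, noise-free model. The sketch below is not the authors' algorithm: the function names, the six-base sequences and the dispensing order are hypothetical, and the pooled signal is assumed to be a simple weighted sum of the two allele pyrograms.

```python
from typing import List


def pyrogram(read: str, dispensing_order: str) -> List[int]:
    """Ideal pyrogram: number of bases incorporated at each dispensation.

    `read` is the strand being synthesised (hypothetical simplification).
    """
    position = 0
    signals = []
    for nucleotide in dispensing_order:
        run_length = 0
        # A dispensed nucleotide is incorporated along a whole homopolymer run.
        while position < len(read) and read[position] == nucleotide:
            run_length += 1
            position += 1
        signals.append(run_length)
    return signals


def pooled_pyrogram(wild_type: str, mutant: str,
                    dispensing_order: str, a: float) -> List[float]:
    """Signal of a pool with proportion a of wild type and 1 - a of mutant."""
    w = pyrogram(wild_type, dispensing_order)
    m = pyrogram(mutant, dispensing_order)
    return [a * w_i + (1 - a) * m_i for w_i, m_i in zip(w, m)]


ORDER = "ACGT" * 3      # hypothetical dispensing order
WILD_TYPE = "ACGTAC"    # hypothetical wild-type read
MUTANT = "ACATAC"       # hypothetical G-to-A substitution at the third base

wild_signal = pyrogram(WILD_TYPE, ORDER)
mutant_signal = pyrogram(MUTANT, ORDER)
pool_signal = pooled_pyrogram(WILD_TYPE, MUTANT, ORDER, a=0.9)

print(wild_signal)
print(mutant_signal)
print([round(x, 2) for x in pool_signal])
```

In this toy case the mutant falls out of phase with the wild type after the substitution, so the pool carries weak signal at dispensations where the wild type alone gives none. In real data such sites also pick up measurement noise, which is the variation discussed for profile T.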
Figure 5.

The real Pyrosequencing™ examination of the mitochondrial cytochrome b gene of P. parva: (A) the pyrogram of the wild-type DNA fragment, W; (B) the pyrogram of a pooled DNA sample containing 10% mutant DNA, S; (C) the profile R; (D) the profile T; (E) the profile W, which is aligned to profile T. See the main text for details.

Given that the performance of our algorithm depends heavily on experimental precision, as described above, it is worth knowing the reproducibility of typical Pyrosequencing™ reactions. Previous studies indicated that, when the same PCR products were sequenced several times, the standard deviation of the signals ranged over 0.006–0.024 (32) and 0.008–0.031 (15). Doostzadeh et al. (22) further suggested that the standard deviation can be reduced to 0.0003–0.0018 if the signal intensity is measured appropriately. If the coefficient of variation were within this range, our method could easily be used to detect rare mutant alleles (Tables 1 and 2).

It should be emphasized that the purpose of our study was not to improve the quality of Pyrosequencing™ reactions, and our experiments were not performed by experienced technicians. Nevertheless, the result of our large-scale assay indicates that the proposed algorithm still performs well for such routine Pyrosequencing™ tests (Table 3). Among all 144 sample pairs, only one failed to satisfy the criterion (Shapiro–Wilk test, P < 0.05). Moreover, we accurately predicted the sequence of the BY strain (the unknown allele) for 141 of the remaining 143 pairs. The proportion of the BY strain in the pooled DNA sample was estimated as 12.82 ± 3.81%. We also tested the false-positive ratio using the 12 repeats with 100% RM as both the wild-type sample and the pooled DNA sample. Of the 132 possible sample pairs, only three were incorrectly predicted to contain a mutant allele: W3/W6, W5/W8 and W6/W3 (wild-type sample/pooled sample). These results are consistent with our computational simulation results.
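The pair counts quoted above follow from simple combinatorics, sketched below. The reading that the 132 pairs are ordered pairs of distinct wild-type repeats is an assumption, though it is consistent with W3/W6 and W6/W3 being listed separately; the variable names are illustrative.

```python
WILD_TYPE_REPEATS = 12   # 100% RM samples, W1-W12
POOLED_REPEATS = 12      # 90% RM + 10% BY samples, S1-S12

# Every wild-type repeat is paired with every pooled repeat.
total_pairs = WILD_TYPE_REPEATS * POOLED_REPEATS            # 144
usable_pairs = total_pairs - 1                              # one pair failed the criterion
identification_rate = 141 / usable_pairs

# Assumed: ordered pairs of two different wild-type repeats (Wi as reference, Wj as "pool").
control_pairs = WILD_TYPE_REPEATS * (WILD_TYPE_REPEATS - 1)  # 132
false_positive_ratio = 3 / control_pairs

print(f"{identification_rate:.1%} correct, {false_positive_ratio:.1%} false positives")
```

Under these assumptions the assay identified the unknown allele in about 98.6% of usable pairs, with a false-positive ratio of about 2.3%.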
Table 3.

The estimated proportion of the BY strain (the unknown allele) in the pooled DNA samples in our large-scale Pyrosequencing™ assay

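    | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12
W1 | 0.0954 | 0.1851 | 0.0977 | 0.1434 | 0.0916 | 0.1443 | 0.0875 | 0.1662 | 0.1045 | 0.1387 | 0.1199 | 0.1587
W2 | 0.1147 | 0.1387 | 0.1049 | 0.1380 | 0.1705 | 0.1417 | 0.1158 | 0.0755 | 0.1299 | 0.1026 | 0.1445 | 0.1651
W3 | 0.1185 | 0.1469 | 0.1480 | 0.1312 | 0.0894 | 0.1361 | 0.1623 | 0.1638 | 0.1484 | 0.1349 | 0.1130 | 0.1665
W4 | 0.1032 | 0.1774 | 0.1273 | 0.1281 | 0.1394 | 0.1655 | 0.1569 | 0.1593 | 0.1735 | 0.1668 | 0.1500 | 0.1456
W5 | 0.1252 | 0.1979 | 0.1424 | 0.1154 | 0.1980 | 0.1651 | 0.1448 | 0.1583 | 0.1451 | 0.1165 | 0.1433 | 0.0787
W6 | 0.0618 | 0.0412 | 0.0829 | 0.1335 | 0.1304 | 0.0330 | 0.2073 | 0.0708 | 0.2122* | 0.0909 | 0.1078
W7 | 0.1084 | 0.1065 | 0.1460 | 0.1055 | 0.1099 | 0.1579 | 0.1553 | 0.1383 | 0.0756 | 0.1667 | 0.0901 | 0.1161
W8 | 0.0702 | 0.1562 | 0.1979 | 0.1360 | 0.0464 | 0.1264 | 0.1452 | 0.0779 | 0.2001 | 0.1095 | 0.0779 | 0.1028
W9 | 0.0704 | 0.0874 | 0.1002 | 0.1596 | 0.1459 | 0.1398 | −0.0616 | 0.1846 | 0.1994 | 0.1871 | 0.1551 | 0.0929
W10 | 0.1262 | 0.1078 | 0.0904 | 0.1550 | 0.1237 | 0.1111 | 0.1274 | 0.0721 | 0.1249 | 0.1502* | 0.0993 | 0.1038
W11 | 0.1183 | 0.1043 | 0.1240 | 0.1320 | 0.1162 | 0.1601 | 0.1049 | 0.1372 | 0.1226 | 0.1703 | 0.1256 | 0.1482
W12 | 0.1136 | 0.1300 | 0.1408 | 0.1078 | 0.0983 | 0.1164 | 0.1195 | 0.1472 | 0.1366 | 0.1760 | 0.1414 | 0.1352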

The 12 wild-type samples (100% RM) are denoted as W1–W12, while the 12 pooled DNA samples (90% RM + 10% BY) are denoted as S1–S12.

The sample pair that failed to satisfy the criterion (Shapiro–Wilk test, P < 0.05) is marked with (–), and the two pairs for which we failed to identify the correct sequence of the unknown allele are marked with (*).
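As a worked example of how the estimates in Table 3 can be summarised, the sketch below computes the mean and standard deviation for the W1 row only. The use of the sample standard deviation is an assumption, since the text does not say which estimator underlies the reported 12.82 ± 3.81%, and the variable names are illustrative.

```python
from statistics import mean, stdev

# Estimated BY proportions for wild-type sample W1 against S1-S12 (Table 3).
w1_estimates = [
    0.0954, 0.1851, 0.0977, 0.1434, 0.0916, 0.1443,
    0.0875, 0.1662, 0.1045, 0.1387, 0.1199, 0.1587,
]

row_mean = mean(w1_estimates)
row_sd = stdev(w1_estimates)   # sample standard deviation (assumed estimator)

print(f"W1: {row_mean:.2%} +/- {row_sd:.2%} (true proportion 10%)")
```

The same calculation over all usable pairs would give the overall figure reported in the text; a single row is shown here only to keep the example short.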

A limitation of our algorithm is that it might fail if the pooled DNA sample contains more than one unexpected mutant allele (de novo SNP). Combining more than two pyrograms into one would make the derived pyrogram too complicated to decompose. Fortunately, a specific dispensing order of dNTPs can be designed for all the known haplotypes, so our method only has to deal with de novo SNPs. It is unlikely that two or more de novo SNPs would frequently be found in a short Pyrosequencing™ read. The other difficulty is that one haplotype might include more than one mutant site. Modifying the scoring scheme of our dynamic programming (e.g. reducing the penalty for the second mismatch site) might help to identify some of these haplotypes. This is especially true if the mutant sites are located close to the start of the pyrogram, because sufficient signals of asynchronistic extensions would then be available to overcome the penalty of the mismatch sites. However, this kind of modification would also increase the false-positive ratio and decrease the specificity of our prediction. Therefore, our method focuses only on haplotypes with one mutant site, since mutations are expected to be rare.

In recent years, Pyrosequencing™ has frequently been used to estimate the frequencies or expression levels of known alleles (13–24,26). Because the dispensing order of dNTPs was designed based on the known SNPs, de novo SNPs were probably overlooked, especially if their frequencies were not high enough to generate obvious signals of asynchronistic extensions. For studies of this kind, our method could easily be applied to check for unexpected mutant alleles in the DNA samples by comparing the obtained pyrograms. This is a simple and economical strategy for SNP genotyping surveys.

Our algorithm also has the potential to be applied to high-throughput Pyrosequencing™ (454 platform) data. An appropriate DNA-to-bead ratio is essential for the 454 platform because only beads carrying a single type of amplified template can generate readable signals (flowgrams) (33–35). The mixed signals generated either from wells containing multiple beads or from beads carrying multiple amplified templates are usually filtered out. In some of these cases, asynchronistic extensions may occur, and our algorithm could be modified to identify these mixed DNA templates, so that more information could be obtained. In other words, the method proposed in this study not only creates a new application for the low-throughput Pyrosequencing™ platform, but also provides a possible strategy for improving the high-throughput Pyrosequencing™ platform in the future.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

National Science Council, Taiwan (NSC 97-2621-B-009-001 and 98-2621-B-009-001-MY3); NCTU under the grant from MoE ATU Plan. Funding for open access charge: National Science Council, Taiwan.

Conflict of interest statement. None declared.
  34 in total

1.  SNPs, microarrays and pooled DNA: identification of four loci associated with mild mental impairment in a sample of 6000 children.

Authors:  Lee M Butcher; Emma Meaburn; Jo Knight; Pak C Sham; Leonard C Schalkwyk; Ian W Craig; Robert Plomin
Journal:  Hum Mol Genet       Date:  2005-03-30       Impact factor: 6.150

Review 2.  Genetic variation analyses by Pyrosequencing.

Authors:  Taimour Langaee; Mostafa Ronaghi
Journal:  Mutat Res       Date:  2005-06-03       Impact factor: 2.433

Review 3.  Emerging technologies in DNA sequencing.

Authors:  Michael L Metzker
Journal:  Genome Res       Date:  2005-12       Impact factor: 9.043

4.  High-density single-nucleotide polymorphism maps of the human genome.

Authors:  Raymond D Miller; Michael S Phillips; Inho Jo; Miriam A Donaldson; Joel F Studebaker; Nicholas Addleman; Steven V Alfisi; Wendy M Ankener; Hamid A Bhatti; Chad E Callahan; Benjamin J Carey; Cheryl L Conley; Justin M Cyr; Vram Derohannessian; Rachel A Donaldson; Carolina Elosua; Stacey E Ford; Angela M Forman; Craig A Gelfand; Nicole M Grecco; Susan M Gutendorf; Cricket R Hock; Mark J Hozza; Soyoung Hur; Sun Mi In; Diana L Jackson; Sangmee Ahn Jo; Sung-Chul Jung; Sook Kim; Kuchan Kimm; Ellen F Kloss; Daniel C Koboldt; Jennifer M Kuebler; Feng-Shen Kuo; Jessica A Lathrop; Jong-Keuk Lee; Kathy L Leis; Stephanie A Livingston; Elizabeth G Lovins; Maria L Lundy; Sima Maggan; Matthew Minton; Michael A Mockler; David W Morris; Eric P Nachtman; Bermseok Oh; Chan Park; Chang-Wook Park; Nicholas Pavelka; Adrienne B Perkins; Stephanie L Restine; Ravi Sachidanandam; Andrew J Reinhart; Kathryn E Scott; Gira J Shah; Jatana M Tate; Shobha A Varde; Amy Walters; J Rebecca White; Yeon-Kyeong Yoo; Jong-Eun Lee; Michael T Boyce-Jacino; Pui-Yan Kwok
Journal:  Genomics       Date:  2005-08       Impact factor: 5.736

Review 5.  The relative power of family-based and case-control designs for linkage disequilibrium studies of complex human diseases I. DNA pooling.

Authors:  N Risch; J Teng
Journal:  Genome Res       Date:  1998-12       Impact factor: 9.043

6.  Real-time DNA sequencing using detection of pyrophosphate release.

Authors:  M Ronaghi; S Karamohamed; B Pettersson; M Uhlén; P Nyrén
Journal:  Anal Biochem       Date:  1996-11-01       Impact factor: 3.365

7.  Sensitive sequencing method for KRAS mutation detection by Pyrosequencing.

Authors:  Shuji Ogino; Takako Kawasaki; Mohan Brahmandam; Liying Yan; Mami Cantor; Chungdak Namgyal; Mari Mino-Kenudson; Gregory Y Lauwers; Massimo Loda; Charles S Fuchs
Journal:  J Mol Diagn       Date:  2005-08       Impact factor: 5.568

8.  A new method of sequencing DNA.

Authors:  E D Hyman
Journal:  Anal Biochem       Date:  1988-11-01       Impact factor: 3.365

9.  The relative power of family-based and case-control designs for linkage disequilibrium studies of complex human diseases. II. Individual genotyping.

Authors:  J Teng; N Risch
Journal:  Genome Res       Date:  1999-03       Impact factor: 9.043

10.  Multiplex SNP typing by bioluminometric assay coupled with terminator incorporation (BATI).

Authors:  Guo-Hua Zhou; Mari Gotou; Tomoharu Kajiyama; Hideki Kambara
Journal:  Nucleic Acids Res       Date:  2005-09-01       Impact factor: 16.971

