Adjust quality scores from alignment and improve sequencing accuracy.

Abstract

In shotgun sequencing, statistical reconstruction of a consensus from alignment requires a model of measurement error. Churchill and Waterman proposed one such model and an expectation-maximization (EM) algorithm to estimate sequencing error rates for each assembly matrix. Ewing and Green defined Phred quality scores for base-calling from sequencing traces by training a model on a large amount of data. However, sample preparations and sequencing machines may work under different conditions in practice and therefore quality scores need to be adjusted. Moreover, the information given by quality scores is incomplete in the sense that they do not describe error patterns. We observe that each nucleotide base has its specific error pattern that varies across the range of quality values. We develop models of measurement error for shotgun sequencing by combining the two perspectives above. We propose a logistic model taking quality scores as covariates. The model is trained by a procedure combining an EM algorithm and model selection techniques. The training results in calibration of quality values and leads to a more accurate construction of consensus. Besides Phred scores obtained from ABI sequencers, we apply the same technique to calibrate quality values that come along with Beckman sequencers.
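To make the calibration idea concrete, here is a minimal sketch in Python. It is not the authors' procedure: it assumes the true error status of each base call is already known, so no EM step is needed, and it fits a binary logistic model with the quality score as the only covariate, with no per-base error patterns and no model selection. The function names and the counts in `groups` are hypothetical, made up for illustration.

```python
import math


def phred_to_error_prob(q):
    """Nominal error probability implied by a Phred score: p = 10^(-q/10)."""
    return 10.0 ** (-q / 10.0)


def error_prob_to_phred(p):
    """Inverse mapping: q = -10 * log10(p)."""
    return -10.0 * math.log10(p)


def fit_logistic_calibration(groups, iterations=25):
    """Fit logit P(error | q) = b0 + b1 * (q / 10) by Newton's method.

    groups: list of (q, n_calls, n_errors), one tuple per quality value.
    Assumes the errors are known (e.g. from a trusted reference), which is
    a simplification; the abstract describes an EM algorithm instead.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for q, n, k in groups:
            x = q / 10.0  # rescale the covariate for better conditioning
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            r = k - n * p          # observed minus expected errors
            w = n * p * (1.0 - p)  # binomial variance weight
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 Newton system (information matrix times step = gradient).
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def calibrated_error_prob(q, b0, b1):
    """Error probability predicted by the fitted model at quality value q."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * q / 10.0)))


# Hypothetical counts: (quality value, base calls, observed errors).
groups = [(10, 1000, 150), (20, 1000, 20), (30, 1000, 3)]
b0, b1 = fit_logistic_calibration(groups)
for q, _, _ in groups:
    p = calibrated_error_prob(q, b0, b1)
    print(q, phred_to_error_prob(q), p, error_prob_to_phred(p))
```

In these made-up counts, calls at a reported quality of 10 are wrong 15% of the time, while the nominal Phred error probability is 10%. Passing the fitted probability back through `error_prob_to_phred` therefore gives an adjusted score below the reported one.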

Year: 2004 PMID: 15459287 PMCID: PMC521663 DOI: 10.1093/nar/gkh850

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

References (5 in total)

1. The accuracy of DNA sequences: estimating sequence quality.

Authors: G A Churchill; M S Waterman
Journal: Genomics Date: 1992-09 Impact factor: 5.736

2. Assessing the quality of the DNA sequence from the Human Genome Project. (Review)

Authors: A Felsenfeld; J Peterson; J Schloss; M Guyer
Journal: Genome Res Date: 1999-01 Impact factor: 9.043

3. Base-calling of automated sequencer traces using phred. I. Accuracy assessment.

Authors: B Ewing; L Hillier; M C Wendl; P Green
Journal: Genome Res Date: 1998-03 Impact factor: 9.043

4. Base-calling of automated sequencer traces using phred. II. Error probabilities.

Authors: B Ewing; P Green
Journal: Genome Res Date: 1998-03 Impact factor: 9.043

5. The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences.

Authors: J Parkhill; B W Wren; K Mungall; J M Ketley; C Churcher; D Basham; T Chillingworth; R M Davies; T Feltwell; S Holroyd; K Jagels; A V Karlyshev; S Moule; M J Pallen; C W Penn; M A Quail; M A Rajandream; K M Rutherford; A H van Vliet; S Whitehead; B G Barrell
Journal: Nature Date: 2000-02-10 Impact factor: 49.962

Cited by (15 in total)

1. Inference of population genetic parameters in metagenomics: a clean look at messy data.

Authors: Philip L F Johnson; Montgomery Slatkin
Journal: Genome Res Date: 2006-09-05 Impact factor: 9.043

2. Efficient frequency-based de novo short-read clustering for error trimming in next-generation sequencing.

Authors: Wei Qu; Shin-Ichi Hashimoto; Shinichi Morishita
Journal: Genome Res Date: 2009-05-13 Impact factor: 9.043

3. SEME: a fast mapper of Illumina sequencing reads with statistical evaluation.

Authors: Shijian Chen; Anqi Wang; Lei M Li
Journal: J Comput Biol Date: 2013-11 Impact factor: 1.479

4. ComB: SNP calling and mapping analysis for color and nucleotide space platforms.

Authors: Tade Souaiaia; Zach Frazier; Ting Chen
Journal: J Comput Biol Date: 2011-05-12 Impact factor: 1.479

5. PhredEM: a phred-score-informed genotype-calling approach for next-generation sequencing studies.

Authors: Peizhou Liao; Glen A Satten; Yi-Juan Hu
Journal: Genet Epidemiol Date: 2017-05-31 Impact factor: 2.135

6. Next generation sequencing technologies and the changing landscape of phage genomics.

Authors: Jochen Klumpp; Derrick E Fouts; Shanmuga Sozhamannan
Journal: Bacteriophage Date: 2012-07-01

7. A framework for variation discovery and genotyping using next-generation DNA sequencing data.

Authors: Mark A DePristo; Eric Banks; Ryan Poplin; Kiran V Garimella; Jared R Maguire; Christopher Hartl; Anthony A Philippakis; Guillermo del Angel; Manuel A Rivas; Matt Hanna; Aaron McKenna; Tim J Fennell; Andrew M Kernytsky; Andrey Y Sivachenko; Kristian Cibulskis; Stacey B Gabriel; David Altshuler; Mark J Daly
Journal: Nat Genet Date: 2011-04-10 Impact factor: 38.330

8. Error and error mitigation in low-coverage genome assemblies.

Authors: Melissa J Hubisz; Michael F Lin; Manolis Kellis; Adam Siepel
Journal: PLoS One Date: 2011-02-14 Impact factor: 3.240

9. Next generation sequence analysis and computational genomics using graphical pipeline workflows.

Authors: Federica Torri; Ivo D Dinov; Alen Zamanyan; Sam Hobel; Alex Genco; Petros Petrosyan; Andrew P Clark; Zhizhong Liu; Paul Eggert; Jonathan Pierce; James A Knowles; Joseph Ames; Carl Kesselman; Arthur W Toga; Steven G Potkin; Marquis P Vawter; Fabio Macciardi
Journal: Genes (Basel) Date: 2012-08-30 Impact factor: 4.096

10. FadE: whole genome methylation analysis for multiple sequencing platforms.

Authors: Tade Souaiaia; Zheng Zhang; Ting Chen
Journal: Nucleic Acids Res Date: 2012-09-10 Impact factor: 16.971