MEC: Misassembly Error Correction in contigs based on distribution of paired-end reads and statistics of GC-contents.

Binbin Wu, Min Li, Xingyu Liao, Junwei Luo, Fangxiang Wu, Yi Pan, Jianxin Wang.   

Abstract

De novo assembly tools aim to reconstruct genomes from next-generation sequencing (NGS) data. However, these tools usually generate a large number of contigs containing many misassemblies, which are caused by repetitive regions, chimeric reads and sequencing errors. Detecting and correcting misassemblies in contigs is appealing because it can improve the accuracy of assembly results, yet it remains challenging. In this study, a novel method, called MEC, is proposed to identify and correct misassemblies in contigs. Based on the insert size distribution of paired-end reads and the statistical analysis of GC-contents, MEC can identify more misassemblies accurately. We evaluate MEC with the metrics NA50 and NGA50 on four datasets, compare it with most of the available misassembly correction tools, and carry out experiments to analyze the influence of MEC on scaffolding results. The results show that MEC can reduce misassemblies effectively and yields quantitative improvements in scaffolding quality. MEC is publicly available at https://github.com/bioinfomaticsCSU/MEC.
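
The abstract names NA50 and NGA50 without defining them. As commonly defined by assembly evaluation tools, both are N50-style statistics computed over aligned blocks (contigs broken at misassembly breakpoints) rather than whole contigs, and NGA50 measures coverage against the reference genome length. The sketch below is only an illustration of that idea and is not taken from MEC. The function name `nx50`, the example block lengths, and the choice to use the sum of the blocks as the default denominator are assumptions; evaluation tools may use the full assembly length instead.

```python
from typing import Iterable, Optional


def nx50(block_lengths: Iterable[int], total_length: Optional[int] = None) -> int:
    """Return an N50-style statistic over a collection of block lengths.

    block_lengths: lengths of aligned blocks (assumed to be already
        obtained by breaking contigs at misassembly breakpoints).
    total_length: if None, half of the summed block length is the target
        (NA50-style, a simplification); if given, half of this value is
        the target (NGA50-style, e.g. the reference genome length).

    Returns 0 when the blocks never reach the target.
    """
    lengths = sorted(block_lengths, reverse=True)
    target = (sum(lengths) if total_length is None else total_length) / 2
    covered = 0
    for length in lengths:
        covered += length
        # The first block at which cumulative coverage reaches the
        # target defines the statistic.
        if covered >= target:
            return length
    return 0


# Hypothetical aligned block lengths, for illustration only.
aligned_blocks = [80, 70, 50, 40, 30, 20]
na50_like = nx50(aligned_blocks)
nga50_like = nx50(aligned_blocks, total_length=400)
```

Breaking a contig at a true misassembly shortens the aligned blocks it contributes, so these statistics penalize long but incorrect contigs in a way that plain N50 does not.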

Year:  2018        PMID: 30334805     DOI: 10.1109/TCBB.2018.2876855

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  3 in total

1.  RepAHR: an improved approach for de novo repeat identification by assembly of the high-frequency reads.

Authors:  Xingyu Liao; Xin Gao; Xiankai Zhang; Fang-Xiang Wu; Jianxin Wang
Journal:  BMC Bioinformatics       Date:  2020-10-19       Impact factor: 3.169

2.  A Sequence-Based Novel Approach for Quality Evaluation of Third-Generation Sequencing Reads.

Authors:  Wenjing Zhang; Neng Huang; Jiantao Zheng; Xingyu Liao; Jianxin Wang; Hong-Dong Li
Journal:  Genes (Basel)       Date:  2019-01-14       Impact factor: 4.096

3.  MAC: Merging Assemblies by Using Adjacency Algebraic Model and Classification.

Authors:  Li Tang; Min Li; Fang-Xiang Wu; Yi Pan; Jianxin Wang
Journal:  Front Genet       Date:  2020-01-31       Impact factor: 4.599

