SQUAT: a Sequencing Quality Assessment Tool for data quality assessments of genome assemblies.

Li-An Yang, Yu-Jung Chang, Shu-Hwa Chen, Chung-Yen Lin, Jan-Ming Ho.

Abstract

BACKGROUND: With the rapid increase in genome sequencing projects for non-model organisms, numerous genome assemblies are in progress or available only as drafts rather than as satisfactory, usable genomes. Data quality assessment of genome assemblies is gaining importance not only for those who perform assembly and re-assembly, but also for those who use assemblies as maps in downstream analyses. Recent studies on quality control and quality assessment of genome assemblies have focused either on quality control of reads before assembly or on evaluation of assemblies with respect to their contiguity and correctness. However, correctness assessment depends on a reference and is therefore not applicable to de novo assembly projects. Hence, methods that provide both pre-assembly and post-assembly quality assessment reports, for examining the quality and correctness of de novo assemblies and of the input reads, are worth developing.
RESULTS: We present SQUAT, an efficient tool for both pre-assembly and post-assembly quality assessment of de novo genome assemblies. The pre-assembly module of SQUAT computes quality statistics of reads and presents them in a portable HTML report whose interface visualizes the distribution of high- and poor-quality reads. The post-assembly module of SQUAT provides read mapping analytics in HTML format. We categorized reads into several groups, including uniquely mapped, multiply mapped, and unmapped reads; uniquely mapped reads were further categorized as perfectly matched, with substitutions, containing clips, or others. We carefully defined several groups of poorly mapped (PM) reads to prevent underestimation of unmapped reads; a high PM% is a sign of a poor assembly that requires further examination or improvement before the assembly is used. Finally, we evaluated SQUAT with six datasets, including the genome assemblies for eel, worm, mushroom, and three bacteria. The results show that SQUAT reports provide useful, detailed information for assessing the quality of assemblies and reads.
AVAILABILITY: The SQUAT software, with links to both its Docker image and the online manual, is freely available at https://github.com/luke831215/SQUAT.
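
The read categories named in the RESULTS paragraph can be illustrated with a minimal Python sketch. This is not SQUAT's implementation: the function names, the category labels, the order in which the rules are applied, and the choice of which categories count toward PM% are all assumptions made for illustration. The abstract does not give the exact definitions. The sketch assumes each read is described by standard SAM fields (the FLAG value, the CIGAR string, and the NM edit distance) plus a hypothetical count of reported alignments.

```python
import re
from collections import Counter

# Splits a SAM CIGAR string such as "20S80M" into (length, operation) pairs.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def classify_read(flag, cigar, edit_distance, n_hits):
    """Assign one read to a coarse mapping category (illustrative rules only).

    flag          -- SAM FLAG field; bit 0x4 set means the read is unmapped
    cigar         -- SAM CIGAR string
    edit_distance -- value of the SAM NM tag
    n_hits        -- number of alignments reported for the read (hypothetical input)
    """
    if flag & 0x4:
        return "unmapped"
    if n_hits > 1:
        return "multiply_mapped"
    ops = {op for _, op in CIGAR_RE.findall(cigar)}
    if not ops:
        # Mapped read without a usable CIGAR string: leave it in the catch-all group.
        return "unique_other"
    if ops & {"S", "H"}:
        # Assumption: clipping takes priority over substitutions.
        return "unique_clipped"
    if ops <= {"M", "=", "X"}:
        if edit_distance == 0 and "X" not in ops:
            return "unique_perfect"
        return "unique_substitutions"
    # Anything else, for example alignments with insertions or deletions.
    return "unique_other"


def poorly_mapped_percent(categories, pm_labels):
    """Percentage of reads whose category is in pm_labels (caller-defined)."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * sum(counts[label] for label in pm_labels) / total


# Toy input: (flag, cigar, edit_distance, n_hits) for six made-up reads.
reads = [
    (0, "100M", 0, 1),
    (0, "100M", 2, 1),
    (16, "20S80M", 0, 1),
    (0, "50M2D50M", 2, 1),
    (0, "100M", 0, 3),
    (4, "*", 0, 0),
]
categories = [classify_read(*read) for read in reads]

# Assumed PM definition for this example only; SQUAT's own grouping may differ.
pm_labels = {"unmapped", "unique_clipped", "unique_other"}
print(Counter(categories))
print(poorly_mapped_percent(categories, pm_labels))
```

Passing the PM labels as a parameter keeps the sketch neutral about which groups a given tool treats as poorly mapped.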

Keywords:  Data quality assessment; Genome assembly; Genome sequencing; Non-model organisms

Year:  2019        PMID: 30999844     DOI: 10.1186/s12864-019-5445-3

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


  8 in total

1.  Leishmania guyanensis M4147 as a new LRV1-bearing model parasite: Phosphatidate phosphatase 2-like protein controls cell cycle progression and intracellular lipid content.

Authors:  Alexandra Zakharova; Amanda T S Albanaz; Fred R Opperdoes; Ingrid Škodová-Sveráková; Diana Zagirova; Andreu Saura; Ľubomíra Chmelová; Evgeny S Gerasimov; Tereza Leštinová; Tomáš Bečvář; Jovana Sádlová; Petr Volf; Julius Lukeš; Anton Horváth; Anzhelika Butenko; Vyacheslav Yurchenko
Journal:  PLoS Negl Trop Dis       Date:  2022-06-24

2.  European maize genomes highlight intraspecies variation in repeat and gene content.

Authors:  Georg Haberer; Nadia Kamal; Eva Bauer; Heidrun Gundlach; Iris Fischer; Michael A Seidel; Manuel Spannagl; Caroline Marcon; Alevtina Ruban; Claude Urbany; Adnane Nemri; Frank Hochholdinger; Milena Ouzunova; Andreas Houben; Chris-Carolin Schön; Klaus F X Mayer
Journal:  Nat Genet       Date:  2020-07-27       Impact factor: 38.330

3.  Twelve quick steps for genome assembly and annotation in the classroom.

Authors:  Hyungtaek Jung; Tomer Ventura; J Sook Chung; Woo-Jin Kim; Bo-Hye Nam; Hee Jeong Kong; Young-Ok Kim; Min-Seung Jeon; Seong-Il Eyun
Journal:  PLoS Comput Biol       Date:  2020-11-12       Impact factor: 4.475

4.  Comparative Genomics Supports That Brazilian Bioethanol Saccharomyces cerevisiae Comprise a Unified Group of Domesticated Strains Related to Cachaça Spirit Yeasts.

Authors:  Ana Paula Jacobus; Timothy G Stephens; Pierre Youssef; Raul González-Pech; Michael M Ciccotosto-Camp; Katherine E Dougan; Yibi Chen; Luiz Carlos Basso; Jeverson Frazzon; Cheong Xin Chan; Jeferson Gross
Journal:  Front Microbiol       Date:  2021-04-15       Impact factor: 5.640

5.  Impact of short-read sequencing on the misassembly of a plant genome.

Authors:  Peipei Wang; Fanrui Meng; Bethany M Moore; Shin-Han Shiu
Journal:  BMC Genomics       Date:  2021-02-02       Impact factor: 3.969

6.  The genome of the Paleogene relic tree Bretschneidera sinensis: insights into trade-offs in gene family evolution, demographic history, and adaptive SNPs.

Authors:  Hai-Lin Liu; A J Harris; Zheng-Feng Wang; Hong-Feng Chen; Zhi-An Li; Xiao Wei
Journal:  DNA Res       Date:  2022-01-28       Impact factor: 4.477

7.  The genome of the forest insect pest Pissodes strobi reveals genome expansion and evidence of a Wolbachia endosymbiont.

Authors:  Kristina K Gagalova; Justin G A Whitehill; Luka Culibrk; Diana Lin; Véronique Lévesque-Tremblay; Christopher I Keeling; Lauren Coombe; Macaire M S Yuen; Inanç Birol; Jörg Bohlmann; Steven J M Jones
Journal:  G3 (Bethesda)       Date:  2022-04-04       Impact factor: 3.154

8.  The genome of an apodid holothuroid (Chiridota heheva) provides insights into its adaptation to a deep-sea reducing environment.

Authors:  Long Zhang; Jian He; Peipei Tan; Zhen Gong; Shiyu Qian; Yuanyuan Miao; Han-Yu Zhang; Guangxian Tu; Qi Chen; Qiqi Zhong; Guanzhu Han; Jianguo He; Muhua Wang
Journal:  Commun Biol       Date:  2022-03-10