A new statistic for efficient detection of repetitive sequences.

Sijie Chen, Yixin Chen, Fengzhu Sun, Michael S Waterman, Xuegong Zhang.

Abstract

MOTIVATION: Detecting sequences containing repetitive regions is a basic bioinformatics task with many applications. Several methods have been developed for various types of repeat detection tasks, but an efficient generic method for detecting most types of repetitive sequences is still desirable. Inspired by the excellent properties and successful applications of the D2 family of statistics in comparative analyses of genomic sequences, we developed a new statistic, D2R, that can efficiently discriminate sequences that contain repetitive regions from those that do not.
RESULTS: Using the statistic, we developed an algorithm with linear time and space complexity for detecting most types of repetitive sequences in multiple scenarios, including finding candidate clustered regularly interspaced short palindromic repeats (CRISPR) regions in bacterial genomic or metagenomic sequences. Simulation and real data experiments show that the method works well on both assembled sequences and unassembled short reads.
AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/XuegongLab/D2R_codes under the GPL 3.0 license.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
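
The abstract does not define D2R, so the sketch below does not attempt to reproduce it. As background only, it illustrates the classical D2 statistic that the abstract names as the inspiration: the sum, over all k-mers w, of the product of the counts of w in two sequences. The function names `kmer_counts` and `d2` and the choice of k are illustrative assumptions and are not taken from the authors' code.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq in one left-to-right pass."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_x: str, seq_y: str, k: int) -> int:
    """Classical D2: sum over k-mers w of count_x(w) * count_y(w).

    This is the background statistic only, NOT the D2R statistic of the
    paper, whose definition is not given in the abstract.
    """
    cx = kmer_counts(seq_x, k)
    cy = kmer_counts(seq_y, k)
    # Iterate over the smaller table; k-mers missing from the other
    # table contribute zero (Counter returns 0 for absent keys).
    if len(cx) > len(cy):
        cx, cy = cy, cx
    return sum(n * cy[w] for w, n in cx.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "ACGT", 2))  # prints 6
```

For a fixed k, the counting pass touches each position once, which gives a rough feel for why k-mer count statistics of this kind can be computed in time linear in the sequence length.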

Year:  2019        PMID: 30993316      PMCID: PMC7963086          DOI: 10.1093/bioinformatics/btz262

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


