Seqpare: a novel metric of similarity between genomic interval sets.

Selena C Feng, Nathan C Sheffield, Jianglin Feng.

Abstract

Searching genomic interval sets produced by sequencing methods is now routine; however, existing metrics for quantifying similarity among interval sets are inconsistent. Here we introduce Seqpare, a self-consistent and effective similarity metric and tool for comparing sequences based on their interval sets. With this metric, the similarity of two interval sets is quantified by a single index, the ratio of their effective overlap to their union: an index of zero indicates unrelated interval sets, and an index of one indicates identical interval sets. Analysis and tests confirm the effectiveness and self-consistency of the Seqpare metric.

Copyright: © 2021 Feng SC et al.
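
The abstract does not spell out how the "effective overlap" is computed, so the following Python sketch is only one plausible reading of the index, not the authors' implementation. It assumes that each pair of intervals is scored by overlap length over union length, that only mutually best-matching pairs contribute to the effective overlap O, and that the index is O / (N1 + N2 - O), where N1 and N2 are the numbers of intervals in the two sets. The names `pair_score` and `similarity_index` are hypothetical, intervals are taken to be half-open and on a single chromosome, and the brute-force pairing is chosen for clarity rather than speed.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed half-open [start, end) on one chromosome


def pair_score(a: Interval, b: Interval) -> float:
    """Score two intervals as overlap length / union length (0 if disjoint)."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    union = max(a[1], b[1]) - min(a[0], b[0])
    return overlap / union


def similarity_index(set1: List[Interval], set2: List[Interval]) -> float:
    """Assumed index: effective overlap O divided by (N1 + N2 - O)."""
    if not set1 or not set2:
        return 0.0  # convention chosen for this sketch

    # Best partner in set2 for every interval in set1, and vice versa.
    best1 = [max(range(len(set2)), key=lambda j: pair_score(a, set2[j]))
             for a in set1]
    best2 = [max(range(len(set1)), key=lambda i: pair_score(set1[i], b))
             for b in set2]

    # Only mutually best-matching pairs add to the effective overlap.
    effective_overlap = 0.0
    for i, j in enumerate(best1):
        if best2[j] == i:
            effective_overlap += pair_score(set1[i], set2[j])

    return effective_overlap / (len(set1) + len(set2) - effective_overlap)


identical = similarity_index([(0, 10), (20, 30)], [(0, 10), (20, 30)])
unrelated = similarity_index([(0, 10)], [(50, 60)])
partial = similarity_index([(0, 10)], [(5, 15)])
print(identical, unrelated, partial)
```

Under these assumptions, identical sets give an index of one and non-overlapping sets give zero, matching the two limiting cases the abstract describes; partially overlapping sets fall in between.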

Keywords:  Genome analysis; algorithm; interval set; sequence comparison; similarity metric

Year:  2020        PMID: 33500773      PMCID: PMC7808057          DOI: 10.12688/f1000research.23390.2

Source DB:  PubMed          Journal:  F1000Res        ISSN: 2046-1402


  10 in total

1.  The human genome browser at UCSC.

Authors:  W James Kent; Charles W Sugnet; Terrence S Furey; Krishna M Roskin; Tom H Pringle; Alan M Zahler; David Haussler
Journal:  Genome Res       Date:  2002-06       Impact factor: 9.043

2.  Tabix: fast retrieval of sequence features from generic TAB-delimited files.

Authors:  Heng Li
Journal:  Bioinformatics       Date:  2011-01-05       Impact factor: 6.937

3.  Galaxy: a platform for interactive large-scale genome analysis.

Authors:  Belinda Giardine; Cathy Riemer; Ross C Hardison; Richard Burhans; Laura Elnitski; Prachi Shah; Yi Zhang; Daniel Blankenberg; Istvan Albert; James Taylor; Webb Miller; W James Kent; Anton Nekrutenko
Journal:  Genome Res       Date:  2005-09-16       Impact factor: 9.043

4.  Nested Containment List (NCList): a new algorithm for accelerating interval query of genome alignment and interval databases.

Authors:  Alexander V Alekseyenko; Christopher J Lee
Journal:  Bioinformatics       Date:  2007-01-18       Impact factor: 6.937

5.  fjoin: simple and efficient computation of feature overlaps.

Authors:  Joel E Richardson
Journal:  J Comput Biol       Date:  2006-10       Impact factor: 1.479

6.  BEDOPS: high-performance genomic feature operations.

Authors:  Shane Neph; M Scott Kuehn; Alex P Reynolds; Eric Haugen; Robert E Thurman; Audra K Johnson; Eric Rynes; Matthew T Maurano; Jeff Vierstra; Sean Thomas; Richard Sandstrom; Richard Humbert; John A Stamatoyannopoulos
Journal:  Bioinformatics       Date:  2012-05-09       Impact factor: 6.937

7.  GIGGLE: a search engine for large-scale integrated genome analysis.

Authors:  Ryan M Layer; Brent S Pedersen; Tonya DiSera; Gabor T Marth; Jason Gertz; Aaron R Quinlan
Journal:  Nat Methods       Date:  2018-01-08       Impact factor: 28.547

8.  BEDTools: a flexible suite of utilities for comparing genomic features.

Authors:  Aaron R Quinlan; Ira M Hall
Journal:  Bioinformatics       Date:  2010-01-28       Impact factor: 6.937

9.  Augmented Interval List: a novel data structure for efficient genomic interval search.

Authors:  Jianglin Feng; Aakrosh Ratan; Nathan C Sheffield
Journal:  Bioinformatics       Date:  2019-12-01       Impact factor: 6.937

10.  LOLA: enrichment analysis for genomic region sets and regulatory elements in R and Bioconductor.

Authors:  Nathan C Sheffield; Christoph Bock
Journal:  Bioinformatics       Date:  2015-10-27       Impact factor: 6.937

  1 in total

1.  Bedshift: perturbation of genomic interval sets.

Authors:  Aaron Gu; Hyun Jae Cho; Nathan C Sheffield
Journal:  Genome Biol       Date:  2021-08-20       Impact factor: 13.583
