The Terabase Search Engine: a large-scale relational database of short-read sequences.

Richard Wilton, Sarah J Wheelan, Alexander S Szalay, Steven L Salzberg.

Abstract

MOTIVATION: DNA sequencing archives have grown to enormous scales in recent years, and thousands of human genomes have already been sequenced. The size of these data sets has made searching the raw read data infeasible without high-performance data-query technology. Additionally, it is challenging to search a repository of short-read data using relational logic and to apply that logic across multiple whole-genome sequencing samples.
RESULTS: We have built a compact, efficiently indexed database that contains the raw read data for over 250 human genomes, encompassing trillions of bases of DNA, and that allows users to search these data in real time. The Terabase Search Engine retrieves all the reads for any genomic location from this database in a matter of seconds. Users can search by a range of positions or by a specific sequence, which is aligned to the genome on the fly.
AVAILABILITY AND IMPLEMENTATION: Public access to the Terabase Search Engine database is available at http://tse.idies.jhu.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
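
The position-range search mentioned in RESULTS can be illustrated with a minimal sketch. It is not the Terabase Search Engine's schema or implementation, which this record does not describe: the `reads` table, its column names, the index name, the sample rows, and the use of SQLite are all assumptions made for illustration. It only shows how a relational index on alignment position lets one query return the reads from several samples that overlap a genomic interval.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of aligned reads (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # Positions are treated as 1-based and inclusive in this sketch.
    conn.execute(
        """
        CREATE TABLE reads (
            sample_id TEXT NOT NULL,
            chrom     TEXT NOT NULL,
            start_pos INTEGER NOT NULL,
            end_pos   INTEGER NOT NULL,
            seq       TEXT NOT NULL
        )
        """
    )
    # Index on (chromosome, start position) so range lookups avoid a full scan.
    conn.execute("CREATE INDEX idx_reads_pos ON reads (chrom, start_pos)")
    rows = [  # made-up reads from two made-up samples
        ("S1", "chr1", 100, 109, "ACGTACGTAC"),
        ("S1", "chr1", 205, 214, "TTGACCAGTA"),
        ("S2", "chr1", 103, 112, "TACGTACGGA"),
        ("S2", "chr2", 100, 109, "GGGCATTACA"),
    ]
    conn.executemany("INSERT INTO reads VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def reads_overlapping(conn, chrom, lo, hi):
    """Return (sample_id, start_pos, seq) for reads overlapping [lo, hi]."""
    cur = conn.execute(
        """
        SELECT sample_id, start_pos, seq
        FROM reads
        WHERE chrom = ? AND start_pos <= ? AND end_pos >= ?
        ORDER BY sample_id, start_pos
        """,
        (chrom, hi, lo),
    )
    return cur.fetchall()


conn = build_demo_db()
hits = reads_overlapping(conn, "chr1", 105, 110)
for sample_id, start_pos, seq in hits:
    print(sample_id, start_pos, seq)
```

Because the sample identifier is an ordinary column, the same query spans every sample at once; adding `AND sample_id = ?` to the `WHERE` clause would restrict it to one genome.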

Year:  2019        PMID: 30052772      PMCID: PMC6379032          DOI: 10.1093/bioinformatics/bty657

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  3 in total

1.  Data-Rich Spatial Profiling of Cancer Tissue: Astronomy Informs Pathology.

Authors:  Alexander S Szalay; Janis M Taube
Journal:  Clin Cancer Res       Date:  2022-08-15       Impact factor: 13.801

2.  Bio-Strings: A Relational Database Data-Type for Dealing with Large Biosequences.

Authors:  Sergio Lifschitz; Edward H Haeusler; Marcos Catanho; Antonio B de Miranda; Elvismary Molina de Armas; Alexandre Heine; Sergio G M P Moreira; Cristian Tristão
Journal:  BioTech (Basel)       Date:  2022-07-30

3.  PIDS: A User-Friendly Plant DNA Fingerprint Database Management System.

Authors:  Bin Jiang; Yikun Zhao; Hongmei Yi; Yongxue Huo; Haotian Wu; Jie Ren; Jianrong Ge; Jiuran Zhao; Fengge Wang
Journal:  Genes (Basel)       Date:  2020-03-30       Impact factor: 4.096
