
Can we recover a sequence, just knowing all its subsequences of given length?

A Guénoche.

Abstract

The problem tackled here concerns the feasibility of DNA sequencing using hybridization methods. We establish algorithms for, and computational limitations to, the reconstruction of a sequence from all its subsequences of the same length: in other words, the building of a string that contains all the words of a given set, and only these. In general, several strings are possible. We draw on graph theory and propose an algorithm to enumerate all the strings that are solutions. We then carried out simulations using real DNA sequences. These provided some necessary conditions and some upper bounds on the length of the sequence that can be recovered, in relation to the length of the oligonucleotides. To avoid limiting ourselves to problems that admit a unique solution, we introduce another algorithm that produces a signature for each solution string. Each signature can be tested to determine which one belongs to the correct sequence.
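
The enumeration problem described in the abstract can be illustrated with a small sketch. The abstract does not specify the algorithm, so the following is only one plausible reading, not the author's method: each word of length k is treated as an edge from its (k-1)-letter prefix to its (k-1)-letter suffix, and every path that uses each word exactly once is listed by backtracking. The function names `spectrum` and `reconstruct_all` are hypothetical, and the sketch assumes that no word occurs more than once in the target sequence.

```python
from collections import defaultdict


def spectrum(seq, k):
    """Return the set of all length-k words (contiguous substrings) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct_all(words):
    """Enumerate every string that uses each word in `words` exactly once.

    Assumption (not from the source): each word occurs once in the target,
    so a solution is a path that traverses every word-edge exactly once.
    Backtracking is exponential in the worst case; this is for small inputs.
    """
    words = sorted(set(words))
    if not words:
        return []
    k = len(words[0])

    # Index words by their (k-1)-letter prefix so successors are easy to find.
    by_prefix = defaultdict(list)
    for w in words:
        by_prefix[w[:-1]].append(w)

    results = set()

    def extend(text, used):
        if len(used) == len(words):
            results.add(text)
            return
        # The last k-1 letters of the string built so far are the current node.
        node = text[len(text) - (k - 1):]
        for w in by_prefix.get(node, ()):
            if w not in used:
                used.add(w)
                extend(text + w[-1], used)
                used.remove(w)

    # Try every word as the first one; dead ends are discarded by backtracking.
    for w in words:
        extend(w, {w})
    return sorted(results)


if __name__ == "__main__":
    for s in reconstruct_all(spectrum("ATGGCGTGCA", 3)):
        print(s)
```

With the 3-letter words of `ATGGCGTGCA`, the sketch finds two candidate strings, the original and `ATGCGTGGCA`, which illustrates the abstract's remark that there are generally several possible strings.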


Year:  1992        PMID: 1468013     DOI: 10.1093/bioinformatics/8.6.569

Source DB:  PubMed          Journal:  Comput Appl Biosci        ISSN: 0266-7061


  1 in total

1.  Assembly complexity of prokaryotic genomes using short reads.

Authors:  Carl Kingsford; Michael C Schatz; Mihai Pop
Journal:  BMC Bioinformatics       Date:  2010-01-12       Impact factor: 3.169

