Querying the public databases for sequences using complex keywords contained in the feature lines.

Olivier Croce, Michaël Lamarre, Richard Christen.

Abstract

BACKGROUND: High-throughput technologies often require the retrieval of large sets of sequences. Retrieving EMBL or GenBank entries by keyword is easy with tools such as ACNUC, Entrez or SRS, but these tools have limitations, in particular when querying with complex keywords.
RESULTS: We show that Entrez has severe limitations with respect to retrieving subsequences. SRS works well with simple keywords but not with keywords composed of several terms, and it has problems with complex queries. ACNUC works well, but does not allow precise queries in the feature qualifiers. We developed specific Perl scripts to precisely retrieve subsequences as defined by complex descriptors in the feature qualifiers of EMBL entries. We improved parts of the BioPerl library to allow parsing of large data files, and we embedded these scripts in a user-friendly, OS-independent interface for easy use.
CONCLUSION: Although not as fast as the public tools that use prebuilt indexes, parsing the complete entries with a script is often necessary to retrieve exactly the data sought. Embedding the scripts in a user-friendly interface allows biologists to use them, and bioinformaticians can easily modify them for unforeseen needs.
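The approach described in the abstract (parse every entry in full, match a multi-word descriptor against the feature qualifiers, then cut out the subsequence given by the feature location) can be illustrated with a minimal sketch. This is not the authors' Perl/BioPerl code: it is a Python illustration, the record `SAMPLE` is invented, and the names `read_entries`, `extract` and `find_subsequences` are hypothetical. It assumes EMBL-style `ID`, `FT`, `SQ` and `//` lines and only handles simple `start..end` and `complement(start..end)` locations.

```python
import io
import re

# Invented miniature EMBL-style record, for illustration only.
SAMPLE = '''\
ID   DEMO0001; SV 1; linear; genomic DNA; STD; PRO; 40 BP.
FH   Key             Location/Qualifiers
FT   source          1..40
FT                   /organism="Demo organism"
FT   rRNA            1..10
FT                   /product="16S ribosomal RNA"
FT   misc_feature    11..30
FT                   /note="16S-23S internal transcribed spacer"
FT   rRNA            31..40
FT                   /product="23S ribosomal RNA"
SQ   Sequence 40 BP;
     aaaaaaaaaa cccccccccc gggggggggg tttttttttt        40
//
'''

# Only two location forms are supported in this sketch.
SIMPLE_LOCATION = re.compile(
    r"^(?:(\d+)\.\.(\d+)|complement\((\d+)\.\.(\d+)\))$")
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def read_entries(handle):
    """Yield (entry_id, features, sequence) one entry at a time, so a
    large flat file is never held in memory as a whole."""
    entry_id, features, seq_parts, in_seq = "", [], [], False
    for line in handle:
        if line.startswith("ID"):
            entry_id = line[5:].split(";")[0].strip()
        elif line.startswith("FT"):
            body = line[5:].rstrip("\n")
            if body[:1].strip():  # a feature key starts right after "FT   "
                key, location = body.split(None, 1)
                features.append({"key": key, "location": location.strip(),
                                 "qualifiers": ""})
            elif features:  # continuation line of the current feature
                text, feat = body.strip(), features[-1]
                if text.startswith("/") or feat["qualifiers"]:
                    feat["qualifiers"] += " " + text
                else:
                    feat["location"] += text
        elif line.startswith("SQ"):
            in_seq = True
        elif line.startswith("//"):
            yield entry_id, features, "".join(seq_parts)
            entry_id, features, seq_parts, in_seq = "", [], [], False
        elif in_seq:
            # Drop the spacing and the trailing position counter.
            seq_parts.append(re.sub(r"[^A-Za-z]", "", line))


def extract(sequence, location):
    """Return the subsequence for a simple location, or None when the
    location form (join, fuzzy bounds, remote entry) is not handled."""
    match = SIMPLE_LOCATION.match(location)
    if match is None:
        return None
    if match.group(1):
        start, end = int(match.group(1)), int(match.group(2))
        return sequence[start - 1:end]
    start, end = int(match.group(3)), int(match.group(4))
    return sequence[start - 1:end].translate(COMPLEMENT)[::-1]


def find_subsequences(handle, phrase, feature_key=None):
    """Collect subsequences whose qualifiers contain the whole phrase."""
    wanted, hits = phrase.lower(), []
    for entry_id, features, sequence in read_entries(handle):
        for feat in features:
            if feature_key and feat["key"] != feature_key:
                continue
            if wanted not in feat["qualifiers"].lower():
                continue
            sub = extract(sequence, feat["location"])
            if sub is not None:
                hits.append((entry_id, feat["key"], feat["location"], sub))
    return hits


hits = find_subsequences(io.StringIO(SAMPLE), "internal transcribed spacer")
for hit in hits:
    print(hit)
```

Matching the whole phrase against the qualifier text, rather than against index terms, is what lets a descriptor made of several words select one feature; the price is a full scan of every entry, which is the speed trade-off the conclusion mentions.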

Substances: Proteins

Year: 2006 PMID: 16441875 PMCID: PMC1403806 DOI: 10.1186/1471-2105-7-45

Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169

19 in total

1. GeneRecords: a relational database for GenBank flat file parsing and data manipulation in personal computers.

Authors: P D'Addabbo; L Lenzi; F Facchin; R Casadei; S Canaider; L Vitale; F Frabetti; P Carinci; M Zannotti; P Strippoli
Journal: Bioinformatics Date: 2004-05-14 Impact factor: 6.937

2. Phylogenetic screening of ribosomal RNA gene-containing clones in Bacterial Artificial Chromosome (BAC) libraries from different depths in Monterey Bay.

Authors: M T Suzuki; C M Preston; O Béjà; J R de la Torre; G F Steward; E F DeLong
Journal: Microb Ecol Date: 2004-10-14 Impact factor: 4.552

3. Use of PCR targeting of internal transcribed spacer regions and single-stranded conformation polymorphism analysis of sequence variation in different regions of rRNA genes in fungi for rapid diagnosis of mycotic keratitis.

Authors: Manish Kumar; P K Shukla
Journal: J Clin Microbiol Date: 2005-02 Impact factor: 5.948

4. Identification of staphylococci by 16S internal transcribed spacer rRNA gene restriction fragment length polymorphism.

Authors: Mert Sudagidan; A Fazil Yenidunya; Hatice Gunes
Journal: J Med Microbiol Date: 2005-09 Impact factor: 2.472

5. Entrez: molecular biology database and retrieval system.

Authors: G D Schuler; J A Epstein; H Ohkawa; J A Kans
Journal: Methods Enzymol Date: 1996 Impact factor: 1.600

6. Fast protocols for the 5S rDNA and ITS-2 based identification of Oenococcus oeni.

Authors: Steffen Hirschhäuser; Jürgen Fröhlich; Armin Gneipel; Inge Schönig; Helmut König
Journal: FEMS Microbiol Lett Date: 2005-03-01 Impact factor: 2.742

7. SRS--an indexing and retrieval tool for flat file data libraries.

Authors: T Etzold; P Argos
Journal: Comput Appl Biosci Date: 1993-02

8. ACNUC--a portable retrieval system for nucleic acid sequence databases: logical and physical designs and usage.

Authors: M Gouy; C Gautier; M Attimonelli; C Lanave; G di Paola
Journal: Comput Appl Biosci Date: 1985-09

9. Fungal diversity in rock beneath a crustose lichen as revealed by molecular markers.

Authors: Torbjørg Bjelland; Stefan Ekman
Journal: Microb Ecol Date: 2005-07-29 Impact factor: 4.552

10. Oligonucleotide microarray for identification of Bacillus anthracis based on intergenic transcribed spacers in ribosomal DNA.

Authors: Ulrich Nübel; Peter M Schmidt; Edda Reiss; Frank Bier; Wolfgang Beyer; Dieter Naumann
Journal: FEMS Microbiol Lett Date: 2004-11-15 Impact factor: 2.742

3 in total

1. PseudoMLSA: a database for multigenic sequence analysis of Pseudomonas species. (Review)

Authors: Antoni Bennasar; Magdalena Mulet; Jorge Lalucat; Elena García-Valdés
Journal: BMC Microbiol Date: 2010-04-21 Impact factor: 3.605

2. Preliminary analysis of length and GC content variation in the ribosomal first internal transcribed spacer (ITS1) of marine animals.

Authors: S Chow; Y Ueno; M Toyokawa; I Oohara; H Takeyama
Journal: Mar Biotechnol (NY) Date: 2008-10-21 Impact factor: 3.619

3. UbiProt: a database of ubiquitylated proteins.

Authors: Alexander L Chernorudskiy; Alejandro Garcia; Eugene V Eremin; Anastasia S Shorina; Ekaterina V Kondratieva; Murat R Gainullin
Journal: BMC Bioinformatics Date: 2007-04-18 Impact factor: 3.169
