Peter J A Cock, John M Chilton, Björn Grüning, James E Johnson, Nicola Soranzo.
Abstract
BACKGROUND: The NCBI BLAST suite has become ubiquitous in modern molecular biology. It is used for tasks ranging from small jobs, such as checking capillary sequencing results of single PCR products, to genome annotation and even larger-scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST into Galaxy was a natural step for sequence comparison workflows.
Keywords: Accessibility; Annotation; BLAST; Galaxy; Pipeline; Reproducibility; Sequence analysis; Workflow
Year: 2015 PMID: 26336600 PMCID: PMC4557756 DOI: 10.1186/s13742-015-0080-7
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
NCBI BLAST+ Galaxy tools
| Galaxy tool name | Description | Reference(s) |
|---|---|---|
| NCBI BLAST+ blastp | Protein vs protein | [ |
| NCBI BLAST+ blastn | Nucleotide vs nucleotide | [ |
| NCBI BLAST+ blastx | Translated nucleotide vs protein | [ |
| NCBI BLAST+ tblastn | Protein vs translated nucleotide | [ |
| NCBI BLAST+ tblastx | Translated nucleotide vs translated nucleotide | [ |
| NCBI BLAST+ makeblastdb | Make BLAST nucleotide or protein database | [ |
| NCBI BLAST+ makeprofiledb | Make BLAST protein domain database | [ |
| NCBI BLAST+ blastdbcmd entry(s) | Extract sequence(s) from BLAST database | [ |
| NCBI BLAST+ blastdbcmd info | Show BLAST database information | [ |
| NCBI BLAST+ dustmasker | Nucleotide masking using the DUST algorithm | [ |
| NCBI BLAST+ segmasker | Protein masking using the SEG algorithm | [ |
| NCBI BLAST+ windowmasker | Window-based sequence masker | [ |
| NCBI BLAST+ convert2blastmask | Lowercase masking | [ |
| NCBI BLAST+ rpsblast | Protein vs protein domain | [ |
| NCBI BLAST+ rpstblastn | Translated nucleotide vs protein domain | [ |
Each row lists a separate Galaxy tool, all available from https://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus/ on the Galaxy Tool Shed [9]. There is one Galaxy tool for each underlying NCBI BLAST+ command line tool, except for blastdbcmd, whose two main functions are represented as two separate Galaxy tools. We intend to add further wrappers later, including for the command line tools psiblast and deltablast.
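To make the relationship between the Galaxy tools and the underlying command line tools more concrete, the following is a minimal sketch of the kind of BLAST+ invocations such wrappers assemble. It is not the wrappers' own code: the helper names `makeblastdb_cmd` and `blast_cmd`, the file names, and the default e-value are assumptions made for illustration. Only the standard BLAST+ options `-in`, `-dbtype`, `-out`, `-query`, `-db`, `-evalue` and `-outfmt` are used.

```python
from typing import List

# The five search programs listed in the table above.
SEARCH_PROGRAMS = {"blastp", "blastn", "blastx", "tblastn", "tblastx"}


def makeblastdb_cmd(fasta: str, dbtype: str, out: str) -> List[str]:
    """Build (but do not run) a makeblastdb command line.

    dbtype is 'nucl' for a nucleotide database or 'prot' for a protein one.
    """
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    return ["makeblastdb", "-in", fasta, "-dbtype", dbtype, "-out", out]


def blast_cmd(program: str, query: str, db: str, out: str,
              evalue: float = 0.001) -> List[str]:
    """Build (but do not run) a BLAST+ search command with tabular output."""
    if program not in SEARCH_PROGRAMS:
        raise ValueError(f"unsupported program: {program}")
    return [program, "-query", query, "-db", db,
            "-evalue", str(evalue),
            "-outfmt", "6",  # tab-separated output, 12 standard columns
            "-out", out]


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(makeblastdb_cmd("genome.fasta", "nucl", "genome_db")))
    print(" ".join(blast_cmd("tblastn", "proteins.fasta", "genome_db", "hits.tsv")))
```

On a machine with BLAST+ installed, each returned list could be passed to `subprocess.run(cmd, check=True)`. Within Galaxy, the tool wrapper and the job runner take care of this step.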
Additional Galaxy tools using NCBI BLAST+
| Galaxy tool name and URL | Description | Reference(s) |
|---|---|---|
| BLAST XML to tabular ( | Convert BLAST XML output into tabular output | [ |
| BLAST Reciprocal Best Hits (RBH) ( | Takes two FASTA inputs, returns table | This paper |
Each row lists a separate Galaxy tool, all available from the Galaxy Tool Shed [9].
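The idea behind reciprocal best hits can be sketched in a few lines. This is an illustrative sketch, not the Galaxy tool's implementation. It assumes the hits from each search direction have already been reduced to `(query, subject, bitscore)` tuples, it ranks hits by bitscore alone, and it resolves ties by keeping the first hit seen. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query id, subject id, bitscore)


def best_hits(hits: Iterable[Hit]) -> Dict[str, Tuple[str, float]]:
    """Map each query to its highest-scoring subject (first seen wins ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return best


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for a, (b, _score) in best_ab.items():
        if b in best_ba and best_ba[b][0] == a:
            pairs.append((a, b))
    return sorted(pairs)


if __name__ == "__main__":
    # Made-up identifiers and scores, for illustration only.
    a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 50.0), ("a2", "b1", 80.0)]
    b_vs_a = [("b1", "a1", 190.0), ("b2", "a2", 40.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy input, `a2` has `b1` as its best hit, but `b1` prefers `a1`, so only the pair `(a1, b1)` is reciprocal.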
Galaxy datatypes used or defined
| Galaxy datatype | Type | Description |
|---|---|---|
| tabular | Built-in | Tab-separated plain text table, used as default BLAST+ output |
| text | Built-in | Plain text, used for human-readable BLAST+ output |
| html | Built-in | Webpage, used for human-readable BLAST+ output with hyperlinks |
| blastxml | Add-on | BLAST XML output |
| blastdbn | Add-on | BLAST database of nucleotide sequences, e.g., for BLASTN |
| blastdbp | Add-on | BLAST database of protein sequences, e.g., for BLASTP |
| blastdbd | Add-on | BLAST database of protein domain PSSMs, e.g., for RPS-BLAST |
| maskinfo-asn1 | Add-on | BLAST masking information files as text ASN.1 |
| maskinfo-asn1-binary | Add-on | BLAST masking information files as binary ASN.1 |
Each row lists a separate Galaxy datatype, either available from the Galaxy Tool Shed [9] or already built into Galaxy.
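As an illustration of the tabular datatype, the sketch below reads BLAST+ tab-separated output, assuming the standard 12-column layout (`qseqid` through `bitscore`) that BLAST+ writes for `-outfmt 6`. The function name `parse_tabular` and the example line are assumptions for illustration, and any extra columns beyond the first 12 are ignored.

```python
from typing import Dict, Iterable, Iterator, Union

# Standard 12 columns of BLAST+ tabular output, in order.
COLUMNS = ("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore")
FLOAT_COLUMNS = {"pident", "evalue", "bitscore"}
STRING_COLUMNS = {"qseqid", "sseqid"}

Value = Union[str, int, float]


def parse_tabular(lines: Iterable[str]) -> Iterator[Dict[str, Value]]:
    """Yield one dict per hit, converting numeric columns to int or float."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < len(COLUMNS):
            raise ValueError(f"expected at least 12 columns, got {len(fields)}")
        row: Dict[str, Value] = {}
        for name, value in zip(COLUMNS, fields):
            if name in STRING_COLUMNS:
                row[name] = value
            elif name in FLOAT_COLUMNS:
                row[name] = float(value)
            else:
                row[name] = int(value)
        yield row


if __name__ == "__main__":
    # A made-up hit line, for illustration only.
    example = ["q1\tcontig7\t98.5\t200\t3\t0\t1\t200\t1001\t1200\t1e-50\t350\n"]
    for hit in parse_tabular(example):
        print(hit["qseqid"], hit["sseqid"], hit["evalue"], hit["bitscore"])
```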
Fig. 1. Galaxy workflow for finding gene clusters. Screenshot from the Galaxy Workflow Editor, showing a published example workflow [27] discussed in the Analyses section. Given two protein sequences, regions of a genome of interest are identified that contain tblastn matches to both sequences, which pinpoints candidate gene clusters for further study.
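The core idea of the workflow in the figure, finding genome regions with tblastn matches to both proteins, can be sketched outside Galaxy as follows. This is not the published workflow, which is built from Galaxy tools. The function names, the `max_gap` threshold of 10,000 bp, and the example coordinates are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Region = Tuple[str, int, int]  # (contig id, start, end) with start <= end


def normalise(contig: str, sstart: int, send: int) -> Region:
    """Order subject coordinates; reverse-strand hits have sstart > send."""
    return (contig, min(sstart, send), max(sstart, send))


def candidate_clusters(hits_a: Iterable[Region], hits_b: Iterable[Region],
                       max_gap: int = 10000) -> List[Region]:
    """Return regions spanning one hit for each protein on the same contig.

    Two hits are paired when the gap between them is at most max_gap bases;
    overlapping hits have a negative gap and are always paired.
    """
    hits_b = list(hits_b)  # allow repeated iteration
    clusters = set()
    for contig_a, start_a, end_a in hits_a:
        for contig_b, start_b, end_b in hits_b:
            if contig_a != contig_b:
                continue
            gap = max(start_a, start_b) - min(end_a, end_b)
            if gap <= max_gap:
                clusters.add((contig_a, min(start_a, start_b), max(end_a, end_b)))
    return sorted(clusters)


if __name__ == "__main__":
    # Made-up tblastn subject coordinates for two query proteins.
    hits_a = [normalise("chr1", 1000, 1300), normalise("chr2", 500, 800)]
    hits_b = [normalise("chr1", 5200, 4800), normalise("chr2", 90000, 90500)]
    print(candidate_clusters(hits_a, hits_b))
```

With these toy coordinates, the two `chr1` hits lie 3,500 bp apart and are reported as one candidate region, while the `chr2` hits are too far apart to be paired.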