
OrFin: A web tool for detection of putative orthologs.

Mohit Midha, Raja Polavarapu, Potshangbam Angamba Meetei, Hari Krishnan, Krishnaveni Mohareer, Vaibhav Vindal.

Abstract

Identifying orthologs is an important step in understanding a novel genome, because it allows functional annotations to be transferred from one organism to another. The Reciprocal Best BLAST Hit (RBBH) method is known to be an efficient approach for identifying putative orthologs. OrFin uses this approach to identify pairs of orthologous proteins for given sets of sequences from two species. It is a user-friendly web tool that searches for RBBHs using user-defined parameters. Results are produced in both HTML and text formats. AVAILABILITY: This web tool is freely available at http://bifl.uohyd.ac.in/orfin.

Keywords:  Bioinformatics Tool; Ortholog prediction

Year:  2012        PMID: 23055622      PMCID: PMC3449379          DOI: 10.6026/97320630008738

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

In the post-genomic era, genome-sequencing projects are progressing at a fast pace and have generated vast amounts of sequence data, available through public-domain databases. Complete genome sequences pose both challenges and opportunities for computational biologists seeking to understand genome function and complexity [1-3]. Establishing relationships between the genomes of different species is an important first step in exploring this genomic information. Ortholog identification is used to compare genomes and to infer the function of un-annotated genes. Reciprocal Best BLAST Hits (RBBHs) is known to be an efficient approach for identifying orthologs [4-6]. Our web tool, OrFin, uses this approach to identify pairs of orthologous proteins for a given pair of proteomes. It allows the user to alter the criteria used to retrieve RBBHs. It is user friendly, with a web interface, and is intended to assist analyses involving orthologous proteins.

Algorithm

The web server takes input sequences for two organisms and returns orthologous pairs of proteins. First, it filters identical proteins, if any; it then applies the Reciprocal Best BLAST Hits method to retrieve the orthologs. When a proteome contains multiple identical proteins, the ortholog search is carried out with one of the protein sequences as a representative. Later, in the results, the representative protein is replaced with the actual proteins/ORFs along with their identified orthologous proteins. User-defined parameters are incorporated into the web tool. A flowchart of the methodology is shown in Figure 1.
Figure 1

Flowchart illustrating the methodology
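
The steps described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OrFin's actual code (the backend is Perl-CGI): the function names, the toy sequences, and the precomputed best-hit dictionaries are all hypothetical, and the BLAST searches themselves are not run here. Reporting every member of an identical-protein group with the ortholog found for its representative is one plausible reading of the replacement step.

```python
from collections import defaultdict


def collapse_identical(proteome):
    """Map each representative ID to all IDs sharing its exact sequence."""
    by_sequence = defaultdict(list)
    for protein_id, sequence in proteome.items():
        by_sequence[sequence].append(protein_id)
    # The first ID seen for a sequence acts as the representative.
    return {ids[0]: ids for ids in by_sequence.values()}


def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    return [(a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a]


def expand_pairs(pairs, groups_a, groups_b):
    """Replace each representative with every protein it stands for."""
    return [
        (a, b)
        for rep_a, rep_b in pairs
        for a in groups_a[rep_a]
        for b in groups_b[rep_b]
    ]


# Toy data (hypothetical): A1 and A2 have identical sequences.
proteome_a = {"A1": "MKTAYI", "A2": "MKTAYI", "A3": "MLVQRS"}
proteome_b = {"B1": "MKTAYL", "B2": "MLIQRT"}

groups_a = collapse_identical(proteome_a)
groups_b = collapse_identical(proteome_b)

# Hypothetical best hits between representatives, as two BLAST runs might give.
best_a_to_b = {"A1": "B1", "A3": "B1"}
best_b_to_a = {"B1": "A1", "B2": "A3"}

rbh = reciprocal_best_hits(best_a_to_b, best_b_to_a)
orthologs = expand_pairs(rbh, groups_a, groups_b)
print(orthologs)
```

In this toy case only A1 and B1 name each other as best hits, so A3 and B2 are dropped, and the identical protein A2 inherits the ortholog found for its representative A1.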

Web interface

The web interface is implemented in PHP. In the client tier, the user can either select from the listed organisms or provide details of their own organisms. In the latter case, sequence files (.faa) must be uploaded. After successful submission of the data, the application retrieves the orthologous pairs of proteins. The user can also choose the E-value for BLAST and the alignment length criterion used to retrieve the RBBHs. Additionally, the user can upload genome coordinate files (.ptt) for each submitted genome to obtain well-formatted output, in which proteins are reported as protein IDs, ORF IDs, or a combination of these with gene names. Apache HTTP Server manages communication between the different tiers of the application. The backend programs of OrFin are written as Perl-CGI scripts. A MySQL database server stores all relevant data and user outputs. An example of running a job on the OrFin server is shown in Figure 2.
Figure 2

Screenshot of OrFin with input parameters and results as an example
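
To make the role of the E-value and alignment length settings concrete, the following Python sketch filters BLAST hits before choosing a best hit per query. It assumes BLAST's standard 12-column tabular output (query, subject, percent identity, alignment length, ..., E-value, bit score) and treats the alignment length criterion as a minimum number of aligned residues; both are assumptions, as are the function name, default thresholds, and sample rows, since the source does not specify OrFin's internal format or defaults.

```python
def best_hits(tabular_text, max_evalue=1e-5, min_align_len=50):
    """Return {query: best subject} among hits passing both thresholds.

    Assumes 12 tab-separated columns per line: alignment length in
    column 4, E-value in column 11, bit score in column 12.
    """
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        length = int(fields[3])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if evalue > max_evalue or length < min_align_len:
            continue  # hit fails the user-defined criteria
        # Keep the highest-scoring surviving hit for each query.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical hits, invented for illustration only.
rows = [
    ("A1", "B1", "90.0", "120", "12", "0", "1", "120", "1", "120", "1e-50", "220"),
    ("A1", "B2", "40.0", "30", "18", "0", "1", "30", "5", "34", "1e-3", "35"),
    ("A2", "B2", "35.0", "60", "39", "0", "1", "60", "1", "60", "0.5", "28"),
]
sample = "\n".join("\t".join(row) for row in rows)

strict = best_hits(sample)
relaxed = best_hits(sample, max_evalue=1.0, min_align_len=20)
print(strict)
print(relaxed)
```

With the stricter defaults only the A1 to B1 hit survives; relaxing both thresholds also admits a best hit for A2, which shows how the chosen criteria change the set of candidate RBBHs.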

Applications

OrFin is built to identify putative orthologous proteins between two proteomes. It has potential applications in comparative genomics, assisting not only in functional annotation but also in phylogenetic footprinting.
  6 in total

1.  A post-genomic perspective.

Authors:  D B Young
Journal:  Nat Med       Date:  2001-01       Impact factor: 53.440

2.  Using orthologous and paralogous proteins to identify specificity determining residues.

Authors:  Leonid A Mirny; Mikhail S Gelfand
Journal:  Genome Biol       Date:  2002-02-19       Impact factor: 13.583

3.  Recent developments and future directions in computational genomics.

Authors:  S Tsoka; C A Ouzounis
Journal:  FEBS Lett       Date:  2000-08-25       Impact factor: 4.124

4.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

5.  Choosing BLAST options for better detection of orthologs as reciprocal best hits.

Authors:  Gabriel Moreno-Hagelsieb; Kristen Latimer
Journal:  Bioinformatics       Date:  2007-11-26       Impact factor: 6.937

6.  MycoRRdb: a database of computationally identified regulatory regions within intergenic sequences in mycobacterial genomes.

Authors:  Mohit Midha; Nirmal K Prasad; Vaibhav Vindal
Journal:  PLoS One       Date:  2012-04-26       Impact factor: 3.240
