A database of six eukaryotic hypothetical genes and proteins.

Katika Prabhakara Surya Adinarayana, Tanuka Sai Sravani, Chamarthi Hareesh.   

Abstract

Assigning functions to proteins of unknown function is of considerable interest to proteomics researchers, because the genes encoding them are conserved across various species. Here, we describe HypoDB, a database of hypothetical genes and proteins in six eukaryotes. The data were collected and organized by the number of entries on each chromosome and carry limited annotation. HypoDB contains gene and protein sequences, chromosome number and location, and secondary and tertiary structure data. AVAILABILITY: The database is available for free at http://www.trimslabs.com/database/hypodb/index.html.

Year:  2011        PMID: 21584190      PMCID: PMC3089888          DOI: 10.6026/97320630006128

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Data on hypothetical proteins expressed in many eukaryotes help researchers search for proteins of interest whose functions are unknown [1]. Many genes encoding such hypothetical proteins are conserved across species, as comparative genome analysis reveals [2-4]. To predict a function for each protein-coding region, its sequence can be compared against all functionally characterized sequences in protein sequence databases, which supports sequence retrieval, functional prediction and identification of homologous sequences [5, 6]. Multiple sequence alignments can then provide insight into the cellular process or biological function involved [9, 10]. A hypothetical protein with one or more significant structural homologs is predicted to have similar molecular properties [7, 8]. Conserved hypothetical proteins are found in both prokaryotes and eukaryotes, and their function can be predicted by domain homology searches, secondary and tertiary structure predictions, and gene annotations. Hence, data on hypothetical proteins from the NCBI database were collected and organized into a database using HTML and JavaScript. The database contains gene and protein sequences, chromosome number and location, secondary and tertiary structure information, ProFunc server data, results from primary analysis tools (molecular weight, ionization constant, etc.), expression levels of the sequences and related data.
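
To make the "primary analysis" item concrete, the following is a minimal Python sketch of one such calculation, the average molecular weight of a protein computed from its sequence. It is an illustration only and not the tool used by HypoDB: the function name `molecular_weight` is hypothetical, the residue masses are standard average masses in daltons, and the sequence is assumed to contain only the 20 standard amino acids.

```python
# Average residue masses (Da) of the 20 standard amino acids, i.e. each
# amino acid minus the water lost when a peptide bond is formed.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight(protein_seq: str) -> float:
    """Return the average molecular weight (Da) of a one-letter protein sequence."""
    seq = protein_seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        # Assumption: ambiguous or non-standard residues are rejected, not guessed.
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


if __name__ == "__main__":
    # Arbitrary short peptide used only to exercise the function.
    print(round(molecular_weight("MAGK"), 2))
```

Summing residue masses and adding one water is the usual way to estimate the mass of an unmodified linear chain; it ignores post-translational modifications and disulfide bonds.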

Methodology

Construction of database

HypoDB is built with HTML and JavaScript and can be accessed at http://www.trimslabs.com/database/hypodb/index.html. Data were collected from the NCBI GenBank and SWISS-PROT databases. HypoDB includes hypothetical proteins of six organisms; the complete list, with scientific and common names, is given in Table 1. The entries are provided as records, organized to simplify the task of finding relevant data for the proteins of a given organism. The records are categorized by the number of hypothetical genes and proteins on each chromosome of the six eukaryotes. Each record, when accessed, returns the nucleotide and protein sequences along with annotation such as accession numbers, source organism and chromosome number. An example entry from human chromosome 1, LOC100131311, is given in Table 2.
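
The record layout and the per-chromosome organization described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not HypoDB's implementation (which uses HTML and JavaScript): the class `HypoRecord`, its field names, the function `count_by_chromosome` and the placeholder accessions and sequences are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class HypoRecord:
    """One hypothetical gene/protein entry (field names are illustrative)."""
    accession: str
    organism: str
    chromosome: str
    nucleotide_seq: str
    protein_seq: str


def count_by_chromosome(records):
    """Return {organism: {chromosome: number of entries}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for rec in records:
        counts[rec.organism][rec.chromosome] += 1
    # Convert to plain dicts so the result is easy to print and compare.
    return {org: dict(chroms) for org, chroms in counts.items()}


# Placeholder records; accessions and sequences are made up for the example.
records = [
    HypoRecord("EXAMPLE_0001", "Homo sapiens", "1", "ATGGCC", "MA"),
    HypoRecord("EXAMPLE_0002", "Homo sapiens", "1", "ATGAAA", "MK"),
    HypoRecord("EXAMPLE_0003", "Homo sapiens", "2", "ATGGGC", "MG"),
    HypoRecord("EXAMPLE_0004", "Mus musculus", "1", "ATGTGG", "MW"),
]

if __name__ == "__main__":
    for organism, chromosomes in count_by_chromosome(records).items():
        print(organism, chromosomes)
```

Keying the counts first by organism and then by chromosome mirrors the way the entries are browsed: pick an organism, then a chromosome, then an individual record.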

Utility

The database is useful to researchers working in functional proteomics and genomics. Data on hypothetical genes and proteins represent a prominent research area for annotating genes of interest and predicting functional regions. Given the technological advances in bioinformatics, predicting function and assigning functionally important sites within a protein sequence helps to identify mutations that might have resulted in the unknown function of a particular gene. This database of hypothetical genes and proteins should therefore be a useful source for studying or predicting the functional regions of a protein. The data are segregated by the number of entries on each chromosome of the six eukaryotes and are easy to access.

References

1.  DBcat: a catalog of 500 biological databases.

Authors:  C Discala; X Benigni; E Barillot; G Vaysseix
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores.

Authors:  C A Wilson; J Kreychman; M Gerstein
Journal:  J Mol Biol       Date:  2000-03-17       Impact factor: 5.469

3.  Biological function made crystal clear - annotation of hypothetical proteins via structural genomics. (Review)

Authors:  E Eisenstein; G L Gilliland; O Herzberg; J Moult; J Orban; R J Poljak; L Banerjei; D Richardson; A J Howard
Journal:  Curr Opin Biotechnol       Date:  2000-02       Impact factor: 9.740

4.  MBGD: microbial genome database for comparative analysis.

Authors:  Ikuo Uchiyama
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

5.  Identification of protein biochemical functions by similarity search using the molecular surface database eF-site.

Authors:  Kengo Kinoshita; Haruki Nakamura
Journal:  Protein Sci       Date:  2003-08       Impact factor: 6.725

6.  Crystal structure of hypothetical protein TTHB192 from Thermus thermophilus HB8 reveals a new protein family with an RNA recognition motif-like domain.

Authors:  Akio Ebihara; Min Yao; Ryoji Masui; Isao Tanaka; Shigeyuki Yokoyama; Seiki Kuramitsu
Journal:  Protein Sci       Date:  2006-05-02       Impact factor: 6.725

7.  You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes. (Review)

Authors:  W F Doolittle
Journal:  Trends Genet       Date:  1998-08       Impact factor: 11.639

8.  RNA splicing of bacterial genes in eukaryotes.

Authors:  E Lorbach; Z Wang; P Dröge
Journal:  Biol Chem       Date:  1998-11       Impact factor: 3.915

9.  Structure-based assignment of the biochemical function of a hypothetical protein: a test case of structural genomics.

Authors:  T I Zarembinski; L W Hung; H J Mueller-Dieckmann; K K Kim; H Yokota; R Kim; S H Kim
Journal:  Proc Natl Acad Sci U S A       Date:  1998-12-22       Impact factor: 11.205

10.  Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples.

Authors:  Zhandong Liu; Santosh S Venkatesh; Carlo C Maley
Journal:  BMC Genomics       Date:  2008-10-30       Impact factor: 3.969

