Computational Biology and Bioinformatics: a tinge of Indian spice.

Year: 2006 PMID: 17611616 PMCID: PMC1904514 DOI: 10.6026/97320630001105

Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063

Editorial Message

Since the birth of Bioinformation, the proportion of papers published by Indian authors has been substantial. This coincidence made me think about the past and present of Indian computational biology and bioinformatics. If you have travelled in recent years along the main roads of major Indian cities such as Hyderabad, Bangalore and Chennai (Madras), you might have noticed banners and posters advertising bioinformatics or, more broadly, biotechnology organizations, much like the advertisement boards for hotels, cellular phones and computers. Biotechnology and bioinformatics are now popular terms in India, and Nature recently released a special outlook section on activities in the biological sciences in India. [1] The generic term “Biotechnology”, which seems to encompass traditional and modern biology, is very popular, especially among Indian students aspiring to a career in biology. This bioinformatics boom, which now appears to have subsided somewhat, was compounded by the launch of bioinformatics companies and of bioinformatics wings of popular software and pharmaceutical companies: a few flourishing, some lingering and many dead. If you are an enthusiast of protein structures, the most obvious work that comes to mind is the famous Ramachandran map [2] and, perhaps, Venkatachalam's β turn [3], both of which are now textbook material. Clearly, the Ramachandran era marks a glorious time for Indian computational biology.

Old wine in the old bottle

Ironically, the Ramachandran map grew out of provocative criticism of the early model of collagen proposed by Ramachandran and Kartha [4], although the main point of that finding, that collagen is a triple helical structure, is fundamentally correct. The criticism rested on the contention that their initial model of collagen contained nonbonded atom pairs that were placed too close together. The result of this provocation was the development of the contact criteria and the famous Ramachandran map [2], worked out in association with C. Ramakrishnan and V. Sasisekharan. Apparently, Pauling and Corey understood the principles of the Ramachandran map so well that they felt no need to explain them with a (Φ,ψ) diagram. [5] However, the simplicity of the two dimensional representation is striking and highly revealing, with an information content much greater than that of the spectacular and monumental α helix and β sheet structures. The basic principles and data behind the proposal of the α helix and β sheet structures were the stability contributed by hydrogen bonding and the fibre diffraction patterns. The Ramachandran map, in contrast, emerged from considering which conformations are feasible for the minimal system of two linked peptide units, given the limit on how close two nonbonded atoms can approach. Considerations of stability and of feasibility are so fundamentally different that they resulted in two independent masterpieces for structural biologists. Over the past four decades, the Ramachandran map has been put to extensive and powerful use in understanding the principles of protein structure, stability and folding beyond the ideal α helix and ideal β sheet [e.g., 6–12], aside from its foolproof use in validating protein structures. [13]
Seminal early work from Ramachandran and coworkers on peptide and protein conformation was followed by a number of important contributions to the development of new methods for biomolecular structural research, although the visibility of these publications in the international arena was only modest. A robust study of hydrogen bonding defined the lengths and angles for considering an interaction a “hydrogen bond”. [14] Comprehension of complex protein structures has been a subject of intense study, with some simplified viewpoints proposed from Madras and Bangalore. [15–16] Such different viewpoints enabled the recognition of unusual structural patterns, such as a single helix of the collagen type observed for the first time in globular proteins [17], which is now recognized as a motif for interaction between protein modules. Two groups, fittingly both from India, provided an appraisal of the occurrence of conformations in the disallowed regions of the Ramachandran map. [18–19] Sidechain rotamer preferences have been studied extensively over many years by several groups, and one of the earliest and best noted works emerged from India. [20] Disulfide engineering has been a popular approach to enhancing protein stability. The MODIP procedure, which involves stereochemical modeling of disulfide bridges [21], is widely used in laboratories around the world. In retrospect, it appears that while much of the Indian work derived inspiration from the intellect of Ramachandran, it also followed the path of structural data analysis laid out particularly by the groups of Thornton [22], Chothia [23] and Richardson. [24] The analysis and modeling work was done not only for proteins and peptides but also for nucleic acids [e.g., 25] and carbohydrates. [e.g., 26] Aside from structural data analysis, modeling of molecular recognition [e.g., 27] and molecular dynamics have also been significant components of computational biology. [e.g., 28]
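
As an illustration of the quantities behind the Ramachandran map, the sketch below computes a backbone dihedral angle from four atomic positions and applies a simple contact test between two nonbonded atoms. It is a minimal sketch and not the original authors' procedure: the function names `dihedral` and `too_close`, the toy coordinates and the 3.0 Å limit are assumptions made only for illustration. For φ the four atoms would be C(i-1), N(i), Cα(i), C(i); for ψ they would be N(i), Cα(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components of b0 and b2 that lie along the central bond.
    v = tuple(b0[i] - _dot(b0, b1) * b1[i] for i in range(3))
    w = tuple(b2[i] - _dot(b2, b1) * b1[i] for i in range(3))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def too_close(a, b, limit):
    """Contact test: True if two nonbonded atoms are nearer than `limit`."""
    d = _sub(a, b)
    return math.sqrt(_dot(d, d)) < limit


# Toy coordinates (hypothetical, not taken from any real structure).
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
clash = too_close((0, 0, 0), (0, 0, 2.5), limit=3.0)  # 3.0 is illustrative
print(angle, clash)
```

Scanning φ and ψ over a grid, building the two linked peptide units for each pair, and marking the pairs for which no nonbonded contact is too short would give a map of feasible conformations in the spirit described above; that construction step is omitted here.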

New spices in the traditional curry

In recent times, India's strength in computing, information technology and software engineering, in both academia and industry, has given fresh impetus to computational biology and bioinformatics in the country. The current generation of Indian computational biologists has exploited these strengths effectively and moved steadily towards global trends such as integration of biological data, development of useful suites of software and databases in biology, generation of new hypotheses on the form, function and models of evolution of biomolecules, especially using genome wide analyses [29–40], neuronal simulations and systems biology. In a remarkable combination of bioinformatics and biological work, it has been shown that some of the annotated fadD genes, located adjacent to the polyketide synthase genes in the Mycobacterium tuberculosis genome, constitute a new class of long chain fatty acyl AMP ligases. These proteins activate long chain fatty acids as acyl adenylates, which are then transferred to the multifunctional polyketide synthases for further chain extension. [41] In further work, the same combination of bioinformaticians and experimental biologists, building on precise identification of the biological functions of proteins from the Pps cluster, rationally produced a nonmethylated variant of mycocerosate esters. [42] Implications of conserved regions for protein folding and stability [43] and their use in function prediction [44] have recently been proposed. The area of protein protein interactions has been a subject of recent focus. [45–47] It has been shown for the first time that protein protein interfaces are, in general, not topologically equivalent if the proteins are distantly related. It has also been shown that variation in protein protein interactions among members of a superfamily could serve as diverging points in otherwise parallel metabolic or signaling pathways. [48] Molecular dynamics simulations coupled with analysis of known 3D structures [49–51] have provided new insights.

The main course: Post genome sequencing era

The contribution of genomic data from sequencing projects undertaken in India has been limited. The Indian Initiative for Rice Genome Sequencing, with the University of Delhi South Campus and the Indian Agricultural Research Institute as member institutions, participated in the international project on sequencing the rice genome. [52] Another notable recent contribution is from Jawaharlal Nehru University, Delhi, which participated in the genome sequencing consortium for Entamoeba histolytica. [53] One of the earliest analyses of the human genome data from India, alongside the work of Tony Hunter and coworkers [54], was the investigation of the complete repertoire of human kinases, which led to the discovery of previously unknown kinases and the annotation of their functions. [55] Some of the predictions are consistent with the results of subsequent experimental studies. Analyses of kinases from other organisms [56–57] and of phosphatases [58–59], and prediction of the subcellular localization of proteins [60], have generated many new and experimentally testable hypotheses. Extensive functional annotation of the human X chromosome has been carried out by a team involving a large number of Indian researchers based in Bangalore. [61] A Bangalore based company, Jubilant Biosys, has developed several useful resources, a number of which are licensed to companies around the world. These resources include PathArt, which builds molecular interaction networks from curated databases; Kinase ChemBioBase, a comprehensive database of small molecules focused on kinase targets; and the GPCR Ligand Database, a small molecule ligand database of GPCR agonists and antagonists. Another Bangalore based company, Strand Life Sciences, has been focusing on new suites of software. One of its products, Admetis, is a platform for modeling and predicting drug relevant properties of molecules in silico. The company also designs and synthesizes compound libraries focused on specific targets. It employs novel machine learning based methods for designing these libraries and partners with leading chemistry companies for synthesis. Strand Life Sciences uses its proprietary prediction tools to annotate these libraries. Acuris is a tool for the annotation and management of gene related data; it can automatically gather and present gene related public information and literature curation.

New flavors

Much of the Indian work described so far is confined to computational studies at the molecular level. This is perhaps consistent with the global bias of computational biology towards the molecular level. Systems biology offers an attractive direction in which to move higher in the hierarchy, and signs of systems biology work from India are beginning to appear. A recent study is an excellent example. [62] Flux balance analysis was performed on the mycolic acid pathway of Mycobacterium tuberculosis, and this analysis provided insights into the metabolic capabilities of the pathway. In silico systematic gene deletions and inhibition studies provide clues about the proteins essential for the pathway. A nice recent example of computational neuroscience from India addresses the question of how cells maintain changes in the efficacy of synaptic connections between nerve cells during memory formation, despite molecular turnover, traffic and biochemical noise. [63] Using computer simulations, the authors show that there is a self sustaining switch involving the movement of AMPA receptors to and from the synaptic membrane, and that more conductance states may arise through interactions with a biochemical switch involving a synaptic protein kinase.

It is very effective to combine computational analysis and modeling with experimental results. One way is to generate reasonable and testable hypotheses from computation and subject them to experimental scrutiny. Irrespective of whether a proposition from computational work turns out to be ‘right’ or ‘wrong’, one learns something new and useful about theoretical possibilities. Alternatively, computational biologists could use modeling and analysis to provide a new viewpoint on an interesting experimental observation. Combining experimental and computational results, and being able to synergize them, should be rewarding. In fact, the first paper entirely from India in the prestigious journal Cell marks an effective combination of experimental and theoretical analysis. [64] Using both kinds of techniques, the authors investigate the size of the lipid dependent organization of glycosyl phosphatidylinositol anchored proteins (GPI APs). It has been shown that cell surface GPI APs are present as monomers and, in a smaller fraction, as nanoscale cholesterol sensitive clusters. Although there are a few other interesting combinations of experimental and computational work [e.g., 41, 42, 65], this is something that computational biologists in India would like to improve upon. Achieving an effective collaboration between expert experimental and computational biologists requires enthusiasm from both sides to interact with each other and to look at the results with an excellent understanding of the strengths and weaknesses of the techniques involved.

Strand Life Sciences has made many new developments to facilitate modern experimental biological research. Sarani automates large scale design of optimal oligonucleotide probes for microarray experiments, and Chitraka is an image analysis and management tool for semi automatic recognition and quantification of expressed gene spots from microarray experiments. Sphatika is a crystal image classification tool for high throughput X ray crystallography; it classifies protein crystals into two broad categories, one comprising crystal hits and harvestable crystals and the other comprising empty wells, clear drops and precipitates. Some of the well established Indian software and technology giants, such as Tata Consultancy Services and Infosys, have started computational life sciences components. For example, Infosys has a major initiative in drug discovery informatics. Given the high standing and reputation of such companies for their contribution to the growth of the economy and technology in the country, their venturing into the life sciences is extremely encouraging for the worldwide visibility of Indian bioinformatics.
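
As an aside to the computational neuroscience example in this section, the idea of a self sustaining switch can be illustrated with a generic positive feedback model. The sketch below is not the model of the cited study: the variable `x` (a hypothetical measure of receptors held at the synaptic membrane), the rate law, and the parameter values `v`, `K` and `d` are all assumptions chosen only so that two stable states exist. It integrates a single ordinary differential equation with the forward Euler method from two different starting points.

```python
def rate(x, v=1.0, K=0.4, d=1.0):
    """Cooperative (self reinforcing) insertion minus first order removal."""
    return v * x * x / (K * K + x * x) - d * x


def simulate(x0, t_end=50.0, dt=0.01):
    """Forward Euler integration of dx/dt = rate(x), starting from x0."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * rate(x)
    return x


# With these illustrative parameters the steady states are x = 0 and
# x = 0.8 (both stable) and x = 0.2 (unstable threshold between them).
low = simulate(0.1)   # starts below the threshold
high = simulate(0.3)  # starts above the threshold
print(low, high)
```

Starting just below or just above the threshold leads to different long term states, which is the sense in which such a system can retain a change after the stimulus that caused it has gone. Molecular turnover, traffic and noise, which the cited work addresses, are not modeled here.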

Nutrition & Dessert

The Department of Biotechnology (DBT), an organization closely connected to the central ministry, generously supports bioinformatics centers established in many academic organizations in India. These centers were initially geared towards bioinformatics services and teaching/training. Some of these institutions run a one year postgraduate diploma program and even a two year Masters program in bioinformatics. The high quality of these programs is reflected in the good achievements of the students in their subsequent academic and industry engagements. In recent times these centers have been encouraged to develop a strong research component too. Many small private organizations offer bioinformatics training, which is popular among undergraduate and masters students, but the quality of the training given by such organizations is open to question. The fundamental problem in such organizations is the lack of teachers who are well trained and experienced in bioinformatics. There are clear exceptions, such as the Institute of Bioinformatics and Applied Biotechnology in Bangalore, which is supported by the local Government and provides high quality training in bioinformatics. In association with computational biologists in academic institutions in India, and with the support of the Council of Scientific and Industrial Research, Government of India, Tata Consultancy Services, a frontline and long standing software company based in India, has produced a suite of programs called Bio Suite. The Bio Suite package covers all the major functional areas of bioinformatics. It can be used to analyze, formulate, predict and provide solutions in areas such as genomics, protein modeling and structural analysis, simulation and drug design. The Department of Biotechnology offers competitive research grants to projects in bioinformatics through its taskforce dedicated to bioinformatics. The National Bioscience Award instituted by the DBT aims to recognize outstanding research work done in India in the broad area of biological sciences. The recent recipients of this award include, for the first time, computational biologists. Present day Indian computational biologists are also beginning to be visible internationally, with the inclusion of an Indian on the editorial board of the journal Bioinformatics. The British biomedical research funding agency, the Wellcome Trust, supports Indian researchers through its Senior Research Fellowship. The Indian recipients during the last 6 years include at least two computational biologists. Such trends will hopefully continue to provide good encouragement to the computational biology community in the country. It is hoped that the resurrection of vibrant computational biology [66] in India, after a period of modest international visibility in the post Ramachandran period, will flourish and grow to new heights.

64 in total

1. A manually curated functional annotation of the human X chromosome.

Authors: H C Harsha; Shubha Suresh; Ramars Amanchy; Nandan Deshpande; K Shanker; A J Yatish; Babylakshmi Muthusamy; B M Vrushabendra; B P Rashmi; K N Chandrika; N Padma; Salil Sharma; Jose L Badano; M A Ramya; H N Shivashankar; Suraj Peri; Dipanwita Roy Choudhury; M P Kavitha; R Saravana; Vidya Niranjan; T K B Gandhi; Neelanjana Ghosh; Sreenath Chandran; Minal Menezes; Mary Joy; S Sujatha Mohan; Nicholas Katsanis; Krishna S Deshpande; Chaerkady Raghothama; C K Prasad; Akhilesh Pandey
Journal: Nat Genet Date: 2005-04 Impact factor: 38.330

2. Use of multiple profiles corresponding to a sequence alignment enables effective detection of remote homologues.

Authors: B Anand; V S Gowri; N Srinivasan
Journal: Bioinformatics Date: 2005-04-07 Impact factor: 6.937

3. India.

Authors: Apoorva Mandavilli
Journal: Nature Date: 2005-07-28 Impact factor: 49.962

4. Genome-wide survey of prokaryotic O-protein phosphatases.

Authors: Anirban Bhaduri; R Sowdhamini
Journal: J Mol Biol Date: 2005-09-23 Impact factor: 5.469

5. Support vector machine-based method for subcellular localization of human proteins using amino acid compositions, their order, and similarity search.

Authors: Aarti Garg; Manoj Bhasin; Gajendra P S Raghava
Journal: J Biol Chem Date: 2005-01-12 Impact factor: 5.157

6. The genome of the protist parasite Entamoeba histolytica.

Authors: Brendan Loftus; Iain Anderson; Rob Davies; U Cecilia M Alsmark; John Samuelson; Paolo Amedeo; Paola Roncaglia; Matt Berriman; Robert P Hirt; Barbara J Mann; Tomo Nozaki; Bernard Suh; Mihai Pop; Michael Duchene; John Ackers; Egbert Tannich; Matthias Leippe; Margit Hofer; Iris Bruchhaus; Ute Willhoeft; Alok Bhattacharya; Tracey Chillingworth; Carol Churcher; Zahra Hance; Barbara Harris; David Harris; Kay Jagels; Sharon Moule; Karen Mungall; Doug Ormond; Rob Squares; Sally Whitehead; Michael A Quail; Ester Rabbinowitsch; Halina Norbertczak; Claire Price; Zheng Wang; Nancy Guillén; Carol Gilchrist; Suzanne E Stroup; Sudha Bhattacharya; Anuradha Lohia; Peter G Foster; Thomas Sicheritz-Ponten; Christian Weber; Upinder Singh; Chandrama Mukherjee; Najib M El-Sayed; William A Petri; C Graham Clark; T Martin Embley; Bart Barrell; Claire M Fraser; Neil Hall
Journal: Nature Date: 2005-02-24 Impact factor: 49.962

7. Use of Ramachandran plot for increasing thermal stability of bacterial formate dehydrogenase.

8. Dissecting the mechanism and assembly of a complex virulence mycobacterial lipid.

9. The map-based sequence of the rice genome.

10. Genome wide survey of G protein-coupled receptors in Tetraodon nigroviridis.
