Since the birth of Bioinformation, a substantial proportion of its papers has been published by
Indian authors. This observation prompted me to reflect on the past and present of Indian
computational biology and bioinformatics.

If you have travelled in recent years along the main roads of major Indian cities such as
Hyderabad, Bangalore and Chennai (Madras), you might have noticed banners and posters advertising
bioinformatics or, more broadly, biotechnology organizations, much like the advertisement boards
for hotels, cellular phones and computers. Biotechnology and bioinformatics are now popular terms
in India, and Nature recently released a special outlook section on the biological sciences in
India. [1] The generic term “biotechnology”, which seems to encompass traditional and modern
biology, is especially popular among Indian students aspiring to a career in biology. This
bioinformatics boom, which seems to have subsided somewhat, was compounded by the launch of
bioinformatics companies and of bioinformatics wings of popular software and pharmaceutical
companies: a few flourishing, some lingering and many dead.

If you are an enthusiast of protein structures, the most obvious work that comes to mind is the
famous Ramachandran map [2] and, perhaps, Venkatachalam's β-turn [3], both of which are now
textbook material. Clearly, the Ramachandran era marks a glorious time of Indian computational
biology.
Old wine in the old bottle
Ironically, the Ramachandran map arose from provocative criticism of the early model of collagen
proposed by Ramachandran and Kartha [4], although the main finding, that collagen is a
triple-helical structure, is fundamentally correct. The criticism rested on the contention that
the initial model of collagen contained nonbonded atom pairs that were placed too close together.
The result of this provocation was the development of the contact criteria and the famous
Ramachandran map [2], work done in association with C. Ramakrishnan and V. Sasisekharan.

Apparently, Pauling and Corey understood the principles of the Ramachandran map so well that they
felt no need to explain them by a (φ,ψ) diagram. [5] However, the two-dimensional representation
is strikingly simple and highly revealing, with an information content far greater than that of
the spectacular and monumental α-helix and β-sheet structures. The
basic principle and data behind the proposal of the α-helix and β-sheet structures were the
stability contributed by hydrogen bonding and the fibre diffraction patterns. The Ramachandran
map, in contrast, emerged from considering which conformations are feasible for the minimal system
of two linked peptide units, given the limit on how closely two nonbonded atoms can approach each
other. Considerations of stability and feasibility are so fundamentally different that they
resulted in two independent masterpieces for structural biologists. Over the past four decades,
the Ramachandran map has been put to extensive and powerful use in understanding the principles of
protein structure, stability and folding beyond the ideal α-helix and ideal β-sheet [e.g., 6–12],
aside from its foolproof use in validating protein structures. [13]

Seminal early work from Ramachandran and coworkers on peptide and protein conformation was
followed by a number of important contributions to the development of new methods for biomolecular
structural research, although the visibility of these publications in the international arena was
only modest. A robust study of hydrogen bonding established definitive lengths and angles for
considering an interaction a “hydrogen bond”. [14] Comprehension of complex protein structures has
been a subject of intense study, with some simplified viewpoints proposed from Madras and
Bangalore. [15–16]
Such different viewpoints enabled the recognition of unusual structural patterns, such as the
single helix of the collagen type observed for the first time in globular proteins [17], which is
now recognized as a motif for interaction between protein modules. Two groups, fittingly both from
India, provided an appraisal of the occurrence of conformations in the disallowed regions of the
Ramachandran map. [18–19] Side-chain rotamer preferences have been studied extensively over many
years by several groups, and one of the earliest and best-known studies emerged from India. [20]
Disulfide engineering has been a popular approach to enhancing protein stability. The MODIP
procedure, which involves stereochemical modeling of disulfide bridges [21], is widely used in
laboratories around the world.

In retrospect, it appears that while much of the Indian work derived inspiration from the
intellect of Ramachandran, it also followed the path of structural data analysis laid out
particularly by the groups of Thornton [22], Chothia [23] and Richardson. [24] The analysis and
modeling work was done not only for proteins and peptides but also for nucleic acids [e.g., 25]
and carbohydrates. [e.g., 26] Aside from structural data analysis, modeling of molecular
recognition [e.g., 27] and molecular dynamics have also been significant components of
computational biology. [e.g., 28]
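
To make the (φ,ψ) representation and the contact criteria discussed earlier in this section more
concrete, the following is a minimal Python sketch under stated assumptions, not the procedure of
the original authors. It computes a signed dihedral angle from four atomic positions, which is how
φ (atoms C, N, Cα, C) and ψ (atoms N, Cα, C, N) are obtained, and it flags pairs of atoms that lie
closer than a chosen limit. The function names, the example coordinates and the `min_dist`
parameter are illustrative assumptions; the contact limits used in the original work are not
reproduced here.

```python
import math
from itertools import combinations


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond (0 = cis, 180 = trans)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Keep only the components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(c_prev, n, ca, c, n_next):
    """Backbone angles of one residue from five consecutive backbone atoms."""
    phi = dihedral(c_prev, n, ca, c)
    psi = dihedral(n, ca, c, n_next)
    return phi, psi


def short_contacts(atoms, min_dist):
    """Index pairs of atoms closer than min_dist.

    `atoms` should hold only positions of atoms that are not bonded to each
    other; `min_dist` is a hypothetical limit supplied by the caller.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        if math.dist(a, b) < min_dist:
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Four made-up points with a quarter-turn twist about the z axis.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

In this picture, a (φ,ψ) pair would be marked as disallowed when the conformation it generates
yields at least one short contact; scanning both angles over their full range and recording the
outcome at each grid point gives a two-dimensional map of the kind described above.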
New spices in the traditional curry
In recent times, India's strength in computing, information technology and software engineering,
in both academia and industry, has provided a fresh impetus to computational biology and
bioinformatics in the country. The current generation of Indian computational biologists has
effectively exploited these strengths and moved steadily towards global trends such as the
integration of biological data, the development of useful software suites and databases in
biology, the generation of new hypotheses on the form, function and models of evolution of
biomolecules, especially using genome-wide analyses [29–40], neuronal
simulations and systems biology.

In a remarkable combination of bioinformatics and biological work, it has been shown that some of
the annotated fadD genes, located adjacent to the polyketide synthase genes in the Mycobacterium
tuberculosis genome, constitute a new class of long-chain fatty acyl-AMP ligases. These proteins
activate long-chain fatty acids as acyl adenylates, which are then transferred to the
multifunctional polyketide synthases for further chain extension. [41] In further work, the same
team of bioinformaticians and experimental biologists precisely identified the biological
functions of proteins from the Pps cluster and used this knowledge to rationally produce a
nonmethylated variant of mycocerosate esters. [42]

Implications of conserved regions for protein folding and stability [43] and their use in function
prediction [44] have recently been proposed. The area of protein-protein interactions has been a
subject of recent focus. [45–47] It has been shown for the first time that protein-protein
interfaces are, in general, not topologically equivalent if the proteins are distantly related. It
has also been shown that variation in protein-protein interactions among members of a superfamily
could serve as diverging points in otherwise parallel metabolic or signaling pathways. [48]
Molecular dynamics simulations combined with analysis of known 3-D structures [49–51] have
provided new insights.
The main course: Post-genome sequencing era
The contribution of genomic data from sequencing projects undertaken in India has been limited.
The Indian Initiative for Rice Genome Sequencing, with the University of Delhi South Campus and
the Indian Agricultural Research Institute as member institutions, participated in the
international project on the genome sequencing of rice. [52] Another notable recent contribution
is from Jawaharlal Nehru University, Delhi, which participated in the genome sequencing consortium
of Entamoeba histolytica. [53]

One of the earliest analyses of human genome data from India, alongside the work of Tony Hunter
and coworkers [54], was the investigation of the complete repertoire of human kinases, which led
to the discovery of previously unknown kinases and the annotation of their functions. [55] Some of
the predictions are consistent with the results of subsequent experimental studies. Analyses of
kinases from other organisms [56–57] and of phosphatases [58–59], and the prediction of
subcellular localization of proteins [60], have generated many new and experimentally testable
hypotheses. Extensive functional annotation of the human X chromosome has been carried out by a
team involving a large number of Indian researchers based in Bangalore. [61]

A Bangalore-based company, Jubilant Biosys,
has developed several useful resources, a number of which are licensed to companies around the
world. The resources include PathArt, which builds molecular interaction networks from curated
databases; Kinase ChemBioBase, a comprehensive database of small molecules that focuses on kinase
targets; and GPCR Ligand Database, a small-molecule ligand database of GPCR agonists and
antagonists.

Another Bangalore-based company, Strand Life Sciences, has been focusing on new suites of
software. One of its products, Admetis, is a platform for modeling and predicting drug-relevant
properties of molecules in silico. The company designs and synthesizes compound libraries focused
on specific targets; it employs novel machine-learning-based methods for designing these libraries
and partners with leading chemistry companies for synthesis. Strand Life Sciences uses its
proprietary prediction tools for annotating these libraries. Acuris is a tool for the annotation
and management of gene-related data; it can automatically gather and present gene-related public
information and literature curation.
New flavors
Much of the Indian work described so far is confined to computational studies at the molecular
level. This is perhaps consistent with the global bias of computational biology towards the
molecular level. Systems biology provides an attractive direction for moving higher in the
hierarchy. Signs of systems biology work from India are beginning to appear, and a recent study is
an excellent example. [62] Flux balance analysis has been performed on the mycolic acid pathway of
Mycobacterium tuberculosis, and this analysis has provided insights into the metabolic
capabilities of the pathway. In silico systematic gene deletions and inhibition studies provide
clues about the proteins essential for the pathway.

A nice recent example of work on computational neuroscience from India addresses the question of
how cells maintain changes in the efficacy of synaptic connections between nerve cells during
memory formation, despite molecular turnover, traffic and biochemical noise. [63] Using computer
simulations, the authors show that there is a self-sustaining switch involving the movement of
AMPA receptors to and from the synaptic membrane, and that more conductance states may arise
through interactions with a biochemical switch involving a synaptic protein kinase.

It is very effective to combine computational analysis and modeling with experimental results.
One way is to generate reasonable and testable hypotheses from computation and subject them to
experimental scrutiny. Irrespective of whether the proposition from computational work turns out
to be ‘right’ or ‘wrong’, one learns something new and useful about theoretical possibilities.
Alternatively, computational biologists could use modeling and analysis to provide a new viewpoint
on an interesting experimental observation. Combining experimental and computational results, and
being able to synergize them, should be rewarding. In fact, the first paper entirely from India in
the prestigious journal Cell marks an effective combination of experimental and theoretical
analysis. [64] Using both kinds of techniques, the authors investigate the size of the
lipid-dependent organization of glycosylphosphatidylinositol-anchored proteins (GPI-APs). It has
been shown that cell-surface GPI-APs are present as monomers and, in a smaller fraction, as
nanoscale cholesterol-sensitive clusters. Although there are a few other interesting combinations
of experimental and computational work [e.g., 41–42, 65], this is something that computational
biologists in India would like to improve upon. Achieving an effective collaboration between
expert experimental and computational biologists requires enthusiasm from both sides to interact
with each other and to look at the results with an excellent understanding of the strengths and
weaknesses of the techniques involved.

Strand Life Sciences has made many new developments to facilitate modern biological experimental
research. Sarani automates the large-scale design of optimal oligonucleotide probes for microarray
experiments, and Chitraka is an image analysis and management tool for semi-automatic recognition
and quantification of expressed gene spots from microarray experiments. Sphatika is a crystal
image classification tool for high-throughput X-ray crystallography; it classifies crystallization
images into two broad categories, one comprising crystal hits and harvestable crystals and the
other comprising empty wells, clear drops and precipitates. Some of the well-established Indian
software and other technology giants, such as Tata Consultancy Services and Infosys, have started
computational life sciences units. For example, Infosys has a major initiative in drug discovery
informatics. Given the high standing and reputation of such companies for their contribution to
the economy and technological growth of the country, their venturing into the life sciences is
extremely encouraging for the worldwide visibility of Indian bioinformatics.
Nutrition & Dessert
The Department of Biotechnology (DBT), an organization closely connected to the central ministry,
generously supports bioinformatics centers instituted in many academic organizations in India.
These centers were initially geared towards bioinformatics services and teaching/training. Some of
these institutions run a one-year postgraduate diploma program and even a two-year Masters program
in bioinformatics. The high quality of these programs is reflected in the good achievements of the
students in their subsequent academic and industry engagements. In recent times, these centers
have been encouraged to have a strong research component too. Many small private organizations
offer bioinformatics training, which is popular among undergraduate and masters students, but the
quality of the training given by such organizations is questionable. The fundamental problem in
such organizations is the lack of teachers who are well trained and experienced in bioinformatics.
There are clear exceptions, such as the Institute of Bioinformatics and Applied Biotechnology in
Bangalore, which is supported by the local Government and provides high-quality training in
bioinformatics.

In association with computational biologists in the academic institutions in India and with
the support of the Council of Scientific and Industrial Research, Government of India, Tata
Consultancy Services, a frontline and long-standing software company based in India, has produced
a suite of programs called Bio-Suite. The Bio-Suite package covers all the major functional areas
of bioinformatics. It can be used to analyze, formulate, predict and provide solutions in areas
such as genomics, protein modeling and structural analysis, simulation and drug design.

The Department of Biotechnology offers competitive research grants to projects in bioinformatics
through its task force dedicated to bioinformatics. The National Bioscience Award, instituted by
the DBT, aims to recognize outstanding research work done in India in the broad area of biological
sciences. The recent recipients of this award include, for the first time, computational
biologists. Present-day Indian computational biologists are beginning to be visible
internationally too, with the inclusion of an Indian on the editorial board of the journal
Bioinformatics. The British biomedical research funding agency, the Wellcome Trust, supports
Indian researchers in the form of Senior Research Fellowships. The Indian recipients during the
last 6 years include at least two computational biologists. Such trends will hopefully continue,
providing good encouragement to the computational biology community in the country. It is hoped
that the resurgence of vibrant computational biology [66] in India, after a period of modest
international visibility in the post-Ramachandran period, will flourish and grow to newer heights.