The sequencing of the first human genome in 2003 marked the beginning of a new era in drug discovery.[12] Genomics studies have paved the way for identifying potential biomarkers and drug targets for diseases. In the postgenomic era, high-throughput technologies make it possible to profile the putative biomolecules involved in a disease state in a single run. The "omics" suffix is now applied to other data sets as well, for instance transcriptomics, proteomics, lipidomics, metabolomics, and cytomics.[3] Omics technologies give researchers the unparalleled ability to screen biological samples for potential drug targets at the gene, transcript, protein, and metabolite levels, and to evaluate their interactions at the network level.[45]
Genomics
The study of the entire genome of an organism is referred to as genomics. It entails the analysis of genetic variants linked to disease and drug response to derive direct inferences. Genome-wide association studies identify genetic variants associated with different diseases across thousands of individuals; the inference is made by comparing the genetic variants in the disease group with those in its respective comparator group.[6] Genomics can be further classified into structural genomics and functional genomics. Structural genomics analyzes the three-dimensional structure of proteins using modeling and experimental methods, whereas functional genomics studies the interactions of genes and proteins.[7]
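As an illustration of the underlying comparison, the sketch below tests whether a single variant's risk-allele frequency differs between a disease group and its comparator; the allele counts are hypothetical, and a real genome-wide association study repeats such a test across millions of variants with stringent multiple-testing correction.

```python
# Minimal sketch of the core GWAS comparison for one variant:
# do risk-allele counts differ between cases and controls?
# The counts below are hypothetical, purely for illustration.
from scipy.stats import fisher_exact

cases = {"risk_allele": 620, "other_allele": 380}      # hypothetical case counts
controls = {"risk_allele": 480, "other_allele": 520}   # hypothetical control counts

# 2x2 contingency table: rows = allele, columns = cases / controls
table = [[cases["risk_allele"], controls["risk_allele"]],
         [cases["other_allele"], controls["other_allele"]]]

odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.2e}")
# A real GWAS repeats this test for millions of variants, so significance
# thresholds are corrected for multiple testing (commonly p < 5e-8).
```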
Pharmacogenomics
Pharmacogenomics evaluates an individual's genetic makeup in relation to drug response.[8] The data generated help predict whether a drug will be effective and/or associated with adverse effects in a specific individual.[7] Microarray and next-generation sequencing (NGS) are the principal high-throughput techniques for pharmacogenomics. Microarray is a technique that provides a picture of gene expression across the whole genome in a single experiment. Gene expression studies can be extrapolated to understand the molecular signature of a disease, therapeutic targets, and drug discovery. Microarrays replaced the traditional northern blot and real-time polymerase chain reaction methods and allow the expression of hundreds of thousands of genes of interest to be evaluated. For gene expression analysis, complementary deoxyribonucleic acid (cDNA) microarray and oligonucleotide-based gene chip (OGC) methods can be used. In a cDNA microarray, a battery of genes is spotted on a nylon membrane or glass slide, while in the OGC method, oligonucleotides complementary to known genes are synthesized in situ. Pharmacologically and toxicologically relevant gene expression on exposure to chemicals or drugs can thereby be identified.[9,10,11] Earlier, Sanger sequencing was used to determine nucleotide sequence variants, but this was a tedious task.[12] NGS is a newer technology that enables rapid sequencing of an entire genome. Short, sheared DNA segments are read by synthesizing complementary nucleotide sequences recorded as fluorescent images, and the actual sequence is finally assembled computationally against a human reference genome.[13] Numerous public databases, such as the 1000 Genomes Project (1000 Genomes Project Consortium 2012) and the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project, contain DNA sequence variation from large numbers of populations.[14] NGS is also applied to finding rare and common variants in genes associated with drug pharmacokinetics (PK) and pharmacodynamics (PD).[15] NGS evaluation of genetic variants in Phase I and Phase II metabolic enzymes and drug receptors showed that the vast majority of coding-region variants are rare or very rare and lead to significant functional variability.[16,17] An example of PK variation is the case of a prodrug that is pharmacologically inactive until biologically activated by drug-metabolizing enzymes. Clopidogrel is an antiplatelet drug biologically activated by the cytochrome P450 2C19 enzyme (CYP2C19), and loss-of-function variants of this enzyme lead to undesirable therapeutic outcomes. In heterozygotes for CYP2C19*2, where appreciable activity is still present, increasing the drug dose restores an antiplatelet effect, whereas in homozygous variant carriers, increasing the dose does not show any significant effect. Gain-of-function variants (CYP2C19*17), in contrast, are associated with bleeding.[18,19] In some cases, a single drug-metabolizing enzyme variant can exert a large impact when a drug with a narrow therapeutic index is administered. For instance, loss of function of thiopurine S-methyltransferase, an inactivator of 6-mercaptopurine, leads to accumulation of the drug and results in life-threatening cytotoxicity.[20] Variation in ryanodine receptor 1 (a calcium channel receptor) predisposes patients to malignant hyperthermia on exposure to inhaled anesthetics.[21] During the Second World War, variation in glucose-6-phosphate dehydrogenase resulted in hemolytic anemia in patients exposed to antimalarial drugs.[22]
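To make the genotype-to-response logic concrete, the following simplified sketch translates a CYP2C19 diplotype into a predicted metabolizer phenotype relevant to clopidogrel activation. The allele-function mapping and phenotype labels are a hypothetical, illustrative stand-in, not a clinical rule set.

```python
# Illustrative sketch: map a CYP2C19 diplotype to a predicted metabolizer
# phenotype. The mapping is simplified for illustration only.
ALLELE_FUNCTION = {
    "*1": "normal",        # reference allele
    "*2": "no_function",   # loss-of-function allele
    "*17": "increased",    # gain-of-function allele
}

def predict_phenotype(allele_a: str, allele_b: str) -> str:
    functions = {ALLELE_FUNCTION.get(allele_a), ALLELE_FUNCTION.get(allele_b)}
    if functions == {"no_function"}:
        # homozygous loss-of-function: dose escalation gives little benefit
        return "poor metabolizer"
    if "no_function" in functions:
        # heterozygous: residual activity, dose increase may restore effect
        return "intermediate metabolizer"
    if "increased" in functions:
        # gain-of-function: associated with bleeding risk
        return "rapid/ultrarapid metabolizer"
    return "normal metabolizer"

print(predict_phenotype("*1", "*2"))   # heterozygous CYP2C19*2 carrier
print(predict_phenotype("*2", "*2"))   # homozygous CYP2C19*2 carrier
print(predict_phenotype("*1", "*17"))  # CYP2C19*17 carrier
```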
Metagenomics
Metagenomics refers to the direct genetic analysis of microbial genomes obtained from different environments.[23] It is a culture-independent approach that exploits the biosynthetic capacity of bacterial species to synthesize drugs. Metagenomics captures environmental DNA (eDNA) to identify, isolate, and induce biosynthetic gene expression in heterologous hosts in order to synthesize small drug molecules (e.g., antimicrobials). It employs two methodologies: sequence-based and function-based.[24] The sequence-based approach generally uses shotgun sequencing and other sequence-tag tools to profile biosynthetic activity, identify targets, and recover biosynthetic pathways from eDNA cosmid libraries.[24] The function-based approach evaluates clone-induced phenotypes in heterologous hosts to identify biosynthetically active clones through library creation and enrichment.[25] Biotin (vitamin H), for example, is currently produced through a chemical route in the pharmaceutical industry, with diverse environmental impacts, making microbial biosynthetic routes identified through metagenomics an attractive alternative. Lakhdari et al.[26] developed a reporter system to detect alterations in the immune response elicited by metagenomic clones. A metagenomic library was constructed from the fecal microbiota of patients with Crohn's disease (CD) and screened for modulation of nuclear factor-kappa B activity using an intestinal epithelial cell line carrying a reporter gene. One clone showing stimulatory activity was identified, and sequence homology traced its source to Bacteroides vulgatus. It was later found that B. vulgatus was more abundant in patients with CD than in the normal population.
Epigenomics
Epigenomics refers to the study of heritable changes in gene function that do not involve alterations in the DNA sequence. Chromatin folding and attachment to the nuclear matrix, covalent modifications of histone proteins, and DNA methylation together constitute the epigenome.[27] Epigenomics helps in understanding how factors beyond the DNA sequence affect the function of a particular gene. Epigenetic changes are regulated by chemical modifications of the DNA itself or of the chromatin proteins directly associated with the DNA. The epigenome varies from cell to cell and from individual to individual, thereby modulating gene expression. NGS emerged around the 2000s, enabling researchers to sequence vast quantities of DNA in one run,[28] and epigenomics was among the first fields to use this technology to identify epigenetically modified regions. Other technologies used to analyze epigenetic modifications include fluorescence in situ hybridization and chromosome conformation capture; these techniques have provided strong evidence for the existence of chromosomal interactions.[29] Chromosome conformation capture (3C) maps chromosomal interactions using formaldehyde cross-linking.[30] A major application of epigenomics is the early-stage detection of disease and its therapeutic treatment; this approach has been used in coronary artery disease (CAD),[31] where several regulatory loci for CAD have been investigated using integrated omics analysis.[32]
Transcriptomics
Transcriptomics covers the primary transcripts, including ribosomal RNA, messenger RNA, transfer RNA, and noncoding RNA. The primary transcripts vary under the influence of the external environment. RNA synthesis sits at the center of the central dogma, between DNA and protein, and is therefore the primary step in gene expression.[7] All cells have essentially the same genome, but different cell types express different genes. Thousands of noncoding RNAs are implicated in many disease conditions, for instance cancer,[33] diabetes,[34,35] and myocardial infarction.[36] RNA sequencing[37] and probe-based assays[38] have paved the way to identify and quantify microRNAs,[39] small nuclear RNAs,[40] and circular RNAs,[41] which play vital roles in disease states. Drug-induced gene expression under in vitro and in vivo conditions elucidates the therapeutic efficacy of a drug target.[42,43] Transcriptomics data are also helpful in toxicogenomic studies.[5] Gene expression data following drug therapy are vital for understanding drug efficacy and are publicly available in resources such as DrugMatrix,[44] Open TG-GATEs,[45] and the Gene Expression Omnibus.[46]
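As a simple illustration of how drug-induced expression data are examined, the sketch below computes log2 fold changes for a tiny, hypothetical treated-versus-control expression matrix of the kind that can be downloaded from repositories such as the Gene Expression Omnibus; the gene and sample names are placeholders.

```python
# Sketch of a first-pass drug-induced gene expression comparison:
# log2 fold change of drug-treated vs. control samples.
# Gene names, sample names, and values are hypothetical placeholders.
import numpy as np
import pandas as pd

expr = pd.DataFrame(
    {"ctrl_1": [100, 50, 800], "ctrl_2": [120, 40, 760],
     "drug_1": [400, 45, 200], "drug_2": [380, 55, 180]},
    index=["GENE_A", "GENE_B", "GENE_C"],
)

ctrl_mean = expr[["ctrl_1", "ctrl_2"]].mean(axis=1)
drug_mean = expr[["drug_1", "drug_2"]].mean(axis=1)
log2_fc = np.log2((drug_mean + 1) / (ctrl_mean + 1))  # +1 avoids log of zero
print(log2_fc.sort_values(ascending=False))
# Real analyses add normalization and statistical testing (e.g., DESeq2 or
# limma workflows) before calling a gene differentially expressed.
```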
Proteomics
Proteomics encompasses the identification and quantification of the set of proteins required to understand the function of cells. Because of the varied physicochemical characteristics of amino acids, posttranslational modifications (PTMs), the interconnectedness of proteins, and widely differing signaling networks, the analysis is challenging.[47] Predicting protein expression at the genome level is problematic because genomic analysis ignores post-translational events such as protein modification and degradation. Dynamic proteomic analysis is therefore vital, alongside static genomic analysis, for accurate identification and quantification in biomarker and drug target studies.[48] Liquid chromatography-mass spectrometry (LC-MS) is an important tool for proteomics: digested peptides are separated by LC, and MS evaluates them on the basis of their mass-to-charge ratio. The data generated help identify proteins/peptides from the mass spectra and quantify proteins using mass chromatograms.[49] On the basis of methodology, proteomics is classified into quantitative and targeted proteomics. Quantitative proteomics is also known as nonbiased proteomics since it measures the expression of almost 5,000 proteins in cells and tissues and about 500 proteins in plasma. In addition, stable isotope labeling by amino acids in cell culture (SILAC) and isobaric labeling methods (tandem mass tags, isobaric tags) are employed to generate quantitative data.[50,51] This approach is commonly used to find biomarkers. Targeted proteomics, generally termed a biased method, quantifies specific targeted proteins/peptides[52,53] and is suitable for validating biomarkers of disease conditions; it evaluates the expression of 10-100 proteins in a single run. The biggest advantage of proteomics over genomics is that it considers PTMs. The regulation of protein function depends considerably on PTMs, which also affect the final molecular mass of the protein; PTMs are identified and quantified using quantitative and targeted proteomics.[54] A proteomic investigation identified high-grade glioma-related proteins in tissues from four cancer patients. The study used metabolic labeling with Ac4ManNAz to tag surface sialylated glycoproteins; the tagged glycans were then conjugated to a biotin-linked phosphine (bioorthogonal chemical reporter strategy) and enriched. A label-free MS technique was used to identify prospective therapeutic targets. Fifty-two glioblastoma-related proteins were discovered, the most important being receptor-type tyrosine-protein phosphatase zeta (PTPRZ1), angiopoietin-related protein-2 (ANGPTL2), galectin-3-binding protein (LGALS3BP), intercellular adhesion molecule-1 (ICAM1), integrin subunit beta-8 (ITGB8), and interleukin-13 receptor subunit alpha-2 (IL13RA2).[55]
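The arithmetic behind identifying peptides by their mass-to-charge ratio can be sketched as follows; the peptide sequence is hypothetical, the residue masses are the standard monoisotopic values, and real search engines compare such predicted values against observed spectra.

```python
# Sketch of the basic LC-MS peptide arithmetic: a peptide's monoisotopic mass
# is the sum of its residue masses plus water, and the observed m/z depends on
# how many protons the ion carries.
MONO = {  # monoisotopic residue masses (Da); only the residues used below
    "S": 87.03203, "A": 71.03711, "M": 131.04049, "P": 97.05276,
    "L": 113.08406, "E": 129.04259, "R": 156.10111,
}
WATER, PROTON = 18.010565, 1.007276

def peptide_mass(seq: str) -> float:
    """Monoisotopic mass of the neutral peptide."""
    return sum(MONO[aa] for aa in seq) + WATER

def mz(seq: str, charge: int) -> float:
    """Predicted m/z for a given protonation state."""
    return (peptide_mass(seq) + charge * PROTON) / charge

peptide = "SAMPLER"  # hypothetical tryptic peptide
for z in (1, 2, 3):
    print(f"{peptide} [M+{z}H]{z}+ : m/z = {mz(peptide, z):.4f}")
```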
Metabolomics
Metabolomics enables the thorough analysis of metabolites in a cell, tissue, or organism over time.[56] It provides a direct signature of biochemical processes because, unlike genes and proteins, which are affected by epigenetic changes and PTMs, metabolites are not subject to such additional regulatory layers.[57] Metabolomics can therefore be correlated directly with phenotype. Through metabolomics, one can analyze different subsets of metabolites based on compound class, functional group, or structural similarity. This kind of analysis relies on complex instrumentation such as MS and nuclear magnetic resonance (NMR) spectroscopy. Metabolomics helps in globally assessing the consequences of gene expression, genetic variation, and changes in the kinetic activity and regulation of enzymes.[58] One of the most recent applications of metabolomics is in the field of oncology. Cancer cells have high transcription and translation rates along with a high rate of proliferation in comparison with normal cells.[59] Metabolomics can be used to characterize the metabolic phenotype of cancer cells, and pre-dose analysis of biofluid profiles helps predict the efficacy of drug treatment for each subject.[60]
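As an illustration of a typical first-pass metabolomics analysis, the sketch below applies principal component analysis to a hypothetical (samples × metabolites) intensity matrix to see whether treated and control samples separate by their overall metabolic profile; the data are random placeholders, not real measurements.

```python
# Sketch: unsupervised overview of a metabolomics intensity matrix via PCA.
# Rows are samples, columns are metabolite intensities (hypothetical values).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
control = rng.normal(loc=10.0, scale=1.0, size=(6, 50))  # 6 samples x 50 metabolites
treated = rng.normal(loc=10.5, scale=1.0, size=(6, 50))  # small global shift
X = np.vstack([control, treated])

scores = PCA(n_components=2).fit_transform(X)
for label, row in zip(["control"] * 6 + ["treated"] * 6, scores):
    print(f"{label:8s} PC1={row[0]:6.2f} PC2={row[1]:6.2f}")
# In practice, intensities are normalized and log-transformed first, and the
# score plot is inspected for group separation before targeted follow-up.
```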
Lipidomics
Lipidomics is a form of lipid-targeted metabolomics that studies the global lipid profile, including the structure and function of lipids and their interactions with other biomolecules.[61,62] Lipidomics studies have revealed the vital role of lipids in numerous neurodegenerative and metabolic diseases.[62,63,64] The European Union supports the "Lipidomics Expertise Platform," which collects lipidomics data from institutions and houses databases for further investigation.[63] Gas chromatography with polar-group derivatization or LC, coupled with MS (GC-MS or LC-MS), is the most regularly utilized technique for lipidomic analysis. Several lipid indicators for early disease detection have been identified using MS-based lipidomic analysis. Lipidomic research therefore focuses mainly on detecting lipid changes at the system level that are indicative of illness, environmental disturbances, diet, medications, toxins, and genetics.[65] It opens up a whole new way of looking at how lipids function in biological systems.[66] Lipidomics can be utilized to develop biomarkers for the early detection and prognosis of various diseases.[67]
Cytomics
Cytomics is the study of cytome heterogeneity, or more specifically, the study of molecular single-cell phenotypes arising from genotype and exposure, combined with extensive bioinformatic knowledge extraction. According to the cytomics concept, cells and their interconnections, rather than genes or macromolecules, are the fundamental functional components of organisms. Cytomics relies on sensitive, minimally invasive, fluorescence-based multiparametric techniques such as flow cytometry (FCM), microscopy, and bioimaging to unravel the activity of cells and tissues associated with disease.[68,69,70] To view and quantify the interaction of treatments with cell populations, high-content screening bioimaging uses automated microscopy, fluorescence detection, and multiparameter algorithms.[71] Microscopic technologies, on the other hand, are superior to FCM for high-content analysis: with the right technique, all molecules of individual cells, including their intracellular location, can be quantified. These instruments allow simultaneous screening of many drug targets and actions in the same specimen.[72,73] A robotic photonic microscope can tag and scan hundreds of distinct molecular components, such as proteins, in morphologically intact, preserved cells and tissue.[74] Different omics approaches, along with their tools and applications, are summarized in Figure 1.
Figure 1
Different omics approaches with their tools and applications applied to drug discovery
Current Challenges and Future Insights
Data processing
The data generated through omics approaches suffer from data diversification. Further, there is database redundancy because of the huge amount of data available from different omics sources, and the lack of uniform data-description standards adds to the complexity.[7] Representing the data using network biology offers a solution to this data-processing problem. Network biology represents biological systems in a simplified manner: nodes represent intracellular entities (enzymes, proteins, genes, metabolites) and edges represent the relationships between them.[75]
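A minimal sketch of this node-and-edge representation, using hypothetical entity names and the networkx library, is shown below; the relationship labels are illustrative placeholders rather than a standardized ontology.

```python
# Sketch of a network-biology representation: nodes are biological entities
# (genes, proteins, metabolites) and edges encode their relationships.
# Entity and relation names are hypothetical placeholders.
import networkx as nx

g = nx.Graph()
g.add_node("GENE_X", kind="gene")
g.add_node("PROTEIN_X", kind="protein")
g.add_node("METABOLITE_M", kind="metabolite")
g.add_edge("GENE_X", "PROTEIN_X", relation="encodes")
g.add_edge("PROTEIN_X", "METABOLITE_M", relation="produces")

# Simple queries over the network, e.g., everything connected to PROTEIN_X
print(list(g.neighbors("PROTEIN_X")))
print(nx.degree_centrality(g))  # highly connected nodes are often prioritized as targets
```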
Nature of data
The data obtained from different data sets (genomic, transcriptomic, proteomic, etc.) are dynamic rather than static. This causes bias and low repeatability, and the results fail to translate to the clinic. It is therefore important to consider the time factor, and sampling should be performed at different time points.[7]
Heterogeneity of the sample preparation
Omics studies compare data from healthy and diseased groups and attempt to generalize the findings to the whole disease population, but in reality both populations are heterogeneous. The approach should be to use comparable populations; however, the challenge is that the confounding factors are often unknown. An integrative omics approach that captures a large number of variables should therefore be employed.[6]
Single omic dataset
In general, a disease is multifactorial, resulting from variation in genes, proteins, and metabolites. A single omic dataset is not sufficient because it captures reactive rather than causative processes;[6] an integrative approach is therefore required to disclose the underlying mechanisms that regulate complex diseases.
Implementation of Multi-omics approach at bench side
Different omics approaches require different types of sample processing; if only a limited patient sample is available, it may not be feasible to carry out all the analyses. Implementing a multi-omics approach in the clinic requires standardization of analytical methods, streamlined sample collection and storage processes, and consideration of the cost of analysis at each omic level.[76,77]