Large-scale cross-species chemogenomic platform proposes a new drug discovery strategy of veterinary drug from herbal medicines.

Chao Huang1, Yang Yang2, Xuetong Chen1, Chao Wang3, Yan Li4, Chunli Zheng5, Yonghua Wang1.   

Abstract

Veterinary Herbal Medicine (VHM) is a comprehensive, current, and informative discipline on the use of herbs in veterinary practice. Driven by chemistry but increasingly guided by pharmacology and the clinical sciences, drug research has contributed much to meeting the need for innovative veterinary medicines for animal diseases. However, research into veterinary medicines of plant origin has declined in the pharmaceutical industry, owing to problems such as the poor compatibility of traditional natural-product extract libraries with high-throughput screening. Here, we present a cross-species chemogenomic screening platform to dissect the genetic basis of multifactorial diseases and to determine the most suitable points of attack for future veterinary medicines, thereby increasing the number of treatment options. First, based on critically examined pharmacology and text mining, we build a cross-species drug-likeness evaluation approach to screen for lead compounds in veterinary medicines. Second, a specific cross-species target prediction model is developed to infer drug-target connections, with the purpose of understanding how drugs act on specific targets. Third, we explore the multi-target interference effects of veterinary medicines by heterogeneous network convergence and modularization analysis. Finally, we manually integrate a disease pathway to test whether the cross-species chemogenomic platform can uncover the mechanism of action of a veterinary medicine, as exemplified by a specific network module. We believe the proposed cross-species chemogenomic platform allows for the systematization of current and traditional knowledge of veterinary medicine and, importantly, for the application of this emerging body of knowledge to the development of new drugs for animal diseases.

Year:  2017        PMID: 28915268      PMCID: PMC5600375          DOI: 10.1371/journal.pone.0184880

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Drug discovery aims at finding molecules that will target a specific pathway or pathogen with minimal side effects [1]. However, productivity, in terms of new drug approvals, has presumably been falling for almost a decade, and the safety of a considerable number of highly effective drugs has recently been called into doubt [2]. For example, about 2.3 million adverse event reports were collected against ∼6000 marketed drugs between 1969 and 2002 [3]. The pharmaceutical industry is therefore under close scrutiny from the financial sector, managers and the wider population [2]. To help the industry recover, shifting the focus of drug discovery from chemical synthesis to cross-species sources, typically natural products from medicinal plants, is essential for discovering effective therapeutic agents that could transform the treatment of serious animal diseases. Medicinal plants are a vital source of phytochemicals that underpin the traditional medicinal treatment of various diseases [4]. Interest in the use of medicinal plants in animal therapy, known as VHM, has increased significantly [5]. As described by Viegi et al. [6], cattle, horses, sheep, goats and pigs account for about 70% of the animals treated with herbal remedies, followed by poultry (9.1%), dogs (5.3%) and rabbits (4.3%). This reflects not only a general trend towards the use of natural products for treating diseases but also the availability of extensive evidence regarding the efficacy of herbal remedies [7]. A case in point is ‘Zoopharmacognosy’, which refers to animals self-medicating by seeking out the herbs best able to treat their disease [8, 9]. Although the clinical efficacy and safety of herbs for animal disease are unquestioned, the identification of new structural leads remains a matter of dispute. 
This raises the question of whether this most successful source of drugs (natural products) has any place in modern drug discovery [10, 11]. Against this background, it is worth considering how new drugs have been discovered. In general, three different types of approach have been, and continue to be, used: traditional, empirical and experimental. The traditional approach takes advantage of material that has been discovered through years of trial and error in different medical systems. Typical examples include drugs such as morphine, quinine and ephedrine, which have been in wide and long-term use, and more recently adopted compounds such as the antimalarial artemisinin. The empirical approach builds on an understanding of a relevant physiological process and often develops a therapeutic agent from a naturally occurring lead molecule. Representative drugs include the muscle relaxant tubocurarine, the β-adrenoceptor antagonist propranolol, and the histamine H2 receptor antagonist cimetidine [10]. The drawback of this approach is that it lacks the scientific and standardized evaluation system of modern medicine. The experimental approach is based on the development of molecular biological techniques and advances in genomics. Most drug discovery currently follows the experimental approach, which is unfortunately time-consuming and laborious [12]. Thus, a new approach, such as computational strategies, is needed to remedy this situation. More recently, the advent of -omics technologies, which rapidly measure the entire complement of, for example, genes (genomics) or metabolites (metabonomics) in various organisms and integrate these diverse data into a complete picture, has given rise to a new way of looking at herbal remedies in the form of chemogenomic profiles [13]. Chemogenomics is an emerging discipline that integrates the latest tools of genomics and chemistry and applies them to target and drug discovery. 
Its strength lies in eliminating the bottleneck that presently arises in target identification, either by measuring the broad, conditional effects of chemical libraries on entire biological systems or by filtering huge chemical libraries rapidly and effectively against given targets. The hope is that chemogenomics will concurrently identify and validate therapeutic targets and detect drug candidates, so as to generate new drugs for many diseases quickly and efficiently [14]. In this study, we construct a cross-species chemogenomic screening platform to decode the drug discovery procedure and apply it to VHM, as exemplified by identifying the lead compounds of erchen decoction that have a curative effect on bovine pneumonia (Fig 1). This herbal remedy is a proven Chinese prescription for the treatment of pneumonia, composed of Pinellia ternata (Thunb.) Breit (Pinellia ternate), Tangerine Peel, Poria cocos (Schw.) Wolf (Tuckahoe) and Glycyrrhiza uralensis Fisch (Licorice). First, based on critically examined pharmacology knowledge, we propose a large-scale statistical analysis to evaluate the efficiency of the ingredients in the herbal remedy, which consists of drug-likeness (DL) assessment and comparison of chemical properties. Second, a specific informatics method based on structure and omics analysis is developed to infer drug-target connections, with the purpose of understanding how drugs act on specific targets. Third, we explore the interactions among active ingredients, targets and disease by carrying out network-based systematic investigations, such as network convergence and modularization analysis. Finally, we choose a typical convergent module and associate it with a pathway to reveal the molecular basis of the therapeutic potential. We believe the large-scale cross-species chemogenomic platform promises to improve decision making in pharmaceutical development and to reveal mechanisms of action.
Fig 1

Flowchart of the cross-species chemogenomic platform.

Materials and methods

Data sets

All the compounds in erchen decoction are collected from the TCMSP database [15].

DL assessment

DL is calculated as the Tanimoto similarity [16] between the descriptors of a herbal compound and the average molecular properties of all veterinary drugs listed by the FDA. The molecular properties are the 1,664 descriptors calculated by Dragon professional version 5.4, which fall into 20 different types, such as constitutional, topological, 2D-autocorrelation and geometrical descriptors. After removing the descriptors that are not available for all drugs, 1,533 descriptors are finally used (S1 Table):

$$DL(A, B) = \frac{A \cdot B}{\lVert A \rVert^{2} + \lVert B \rVert^{2} - A \cdot B}$$

where A is the vector of molecular descriptors of a herbal compound and B represents the average molecular properties of all veterinary drugs listed by the FDA. In this work, ingredients with DL ≥ 0.15 are regarded as candidate bioactive molecules, because the mean DL value of all veterinary drugs listed by the FDA is 0.15 (S1 Fig).
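
The DL calculation described above can be sketched as follows. This is a minimal illustration, assuming the usual Tanimoto coefficient for real-valued vectors, A·B / (|A|² + |B|² − A·B); the function names and the toy descriptor values are hypothetical, and real Dragon descriptors would likely need scaling before comparison.

```python
from math import fsum

def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length real-valued vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    dot = fsum(x * y for x, y in zip(a, b))
    denom = fsum(x * x for x in a) + fsum(y * y for y in b) - dot
    if denom == 0:
        return 0.0  # both vectors are all zeros; a convention chosen for this sketch
    return dot / denom

def mean_profile(vectors):
    """Column-wise mean of a list of descriptor vectors (the reference profile B)."""
    n = len(vectors)
    return [fsum(column) / n for column in zip(*vectors)]

def passes_dl(compound, reference_drugs, threshold=0.15):
    """True if the compound's DL against the averaged reference reaches the cutoff."""
    return tanimoto(compound, mean_profile(reference_drugs)) >= threshold

# Hypothetical two-descriptor toy data, not values from the paper.
reference = [[1.0, 2.0], [3.0, 4.0]]
print(tanimoto([2.0, 3.0], mean_profile(reference)))  # identical vectors
print(passes_dl([2.0, 3.0], reference))
```

Averaging the reference drugs first and comparing once per compound mirrors the wording of the method; comparing against each drug and averaging the scores would be a different calculation.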

Physicochemical features calculation

Five physicochemical parameters are calculated with the Dragon software [17] in this work: molecular weight (MW), number of hydrogen bond acceptors (nHAcc), number of hydrogen bond donors (nHDon), octanol-water partition coefficient (MlogP) and number of rotatable bonds (RBN). Following Lipinski’s rule of five, their thresholds are set to 500, 10, 5, 5 and 10, respectively.
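
As an illustration of how these thresholds might be applied, the sketch below lists which properties of a compound exceed their limit. It assumes that a violation means strictly exceeding the threshold, which the text does not state; the function name is hypothetical, and the example values are those read from the Cerevisterol row of Table 1.

```python
# Thresholds in the order given in the text: MW, nHAcc, nHDon, MlogP, RBN.
THRESHOLDS = {"MW": 500, "nHAcc": 10, "nHDon": 5, "MlogP": 5, "RBN": 10}

def violations(props):
    """Return the names of the properties that exceed their threshold."""
    return [name for name, limit in THRESHOLDS.items() if props[name] > limit]

cerevisterol = {"MW": 430.74, "nHAcc": 3, "nHDon": 3, "MlogP": 5.15, "RBN": 4}
print(violations(cerevisterol))
```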

Target identification

To predict therapeutic targets in animals, we construct a novel cross-species target prediction model (CSDT) using Random Forest [18]. The model expands the predicted protein scope to all Swiss-Prot entries in the UniProt database [19], comprising 549,649 sequences from 13,241 species, including eukaryotes, prokaryotes and viruses. Model building involves the following four steps (S2 Fig).
(1) Benchmark dataset. Drug-target interactions are retrieved from the DrugBank database (http://www.drugbank.ca/, accessed on October 1, 2015). To reduce noise in this data set, we further match them against the STITCH [20], SuperTarget [21] and KEGG [22] databases. In total, 12,907 drug-target interactions involving 5,689 drugs and 3,650 targets are used as the benchmark dataset (S2 Table).
(2) Descriptor calculation. To characterize the drugs and targets with known pharmacological interactions, drug structures and protein sequences are converted into numerical descriptors with the DRAGON program (http://www.talete.mi.it/index.htm) and ProteinEncoding (http://jing.cz3.nus.edu.sg/cgi-bin/prof/prof.cgi/), respectively. As a result, each drug is represented by 900 physicochemical descriptors, and each protein is characterized by 1,545 structural and physicochemical features (S3 Table).
(3) Construction of training and test sets. The positive set consists of the known drug-target interactions extracted from the DrugBank database [23]. The negative set is assembled by randomly generating the same number of drug-target pairs that do not overlap with the positive interactions; this is repeated 1,000 times to overcome selection bias in the negative set. Each time, the dataset is randomly split into two subsets: a training set (19,360 = 9,680 positive interactions + 9,680 negative interactions, 3/4 of the total) used to construct the model, and an independent test set (6,454 = 3,227 positive interactions + 3,227 negative interactions, 1/4 of the total) used to validate the accuracy of the model. Finally, these data are used for random forest (RF) modeling (http://www.stat.berkeley.edu/users/breiman/). Default parameter settings are used: 500 trees, and the square root of the total number of variables as the number of randomly selected variables.
(4) Model performance. To derive a reliable in silico model, both internal and external validation are applied. For internal validation, the target prediction model is evaluated with 5-fold cross-validation. The training set is first randomly separated into five approximately equal-sized subsets; four subsets are used to build a model and the remaining subset serves as the test set. This process is repeated five times so that every subset is used for validation once. The derived model performs well in predicting drug targets, with an accuracy of 77.04±0.80%, a sensitivity of 75.3±1.10%, a specificity of 77.48±0.98%, and an area under the receiver operating characteristic curve (AUC) of 0.86±0.01 (S3 Fig). For external validation, the model shows an accuracy of 75.81±1.31%, a sensitivity of 74.27±1.67%, a specificity of 76.30±1.48%, and an AUC of 0.85±0.12 (S3 Fig).
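
The negative-set construction and the 3:1 split described above could look roughly like the sketch below. It is an illustration only: the helper names, the toy drug and target identifiers, and the seed are hypothetical, it assumes every positive pair is drawn from the given drug and target lists, and the random forest fitting itself (500 trees, square root of the number of variables per split) is left out.

```python
import random

def sample_negatives(positives, drugs, targets, rng):
    """Draw as many random (drug, target) pairs as there are positives,
    skipping any pair that is a known interaction."""
    positive_set = set(positives)
    capacity = len(drugs) * len(targets) - len(positive_set)
    if capacity < len(positive_set):
        raise ValueError("not enough unlabeled pairs to balance the positives")
    negatives = set()
    while len(negatives) < len(positive_set):
        pair = (rng.choice(drugs), rng.choice(targets))
        if pair not in positive_set:
            negatives.add(pair)
    return sorted(negatives)

def split_three_to_one(items, rng):
    """Shuffle a copy of the items and split it 3/4 training, 1/4 test."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]

# Hypothetical toy data: 10 drugs, 8 targets, 20 known interactions.
rng = random.Random(0)
drugs = [f"D{i}" for i in range(10)]
targets = [f"T{j}" for j in range(8)]
positives = [(d, t) for d in drugs for t in targets][::4]
negatives = sample_negatives(positives, drugs, targets, rng)
pos_train, pos_test = split_three_to_one(positives, rng)
neg_train, neg_test = split_three_to_one(negatives, rng)
print(len(pos_train) + len(neg_train), len(pos_test) + len(neg_test))
```

Applied to 12,907 positives, the same integer split gives 9,680 training and 3,227 test interactions, matching the counts reported in the text. Repeating the sampling with different seeds would correspond to the 1,000 repetitions.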

Drug direct targeting

We also apply the weighted ensemble similarity (WES) algorithm [24] to identify direct targets of the ingredients in erchen decoction. WES quantitatively evaluates whether a molecule will directly bind to a target, based on the weighted structural and physicochemical features it shares with known ligands of the target. The WES model performs well in predicting both binding (sensitivity, SEN, 85%) and nonbinding (specificity, SPE, 71%) patterns, with an accuracy of 78%, a precision (PRE) of 74% and an area under the receiver operating characteristic curve (AUC) of 0.85.
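
The performance measures quoted above are related through the confusion matrix, as the sketch below illustrates. The counts are hypothetical: they assume a balanced set of 100 binders and 100 non-binders chosen so that sensitivity and specificity match the reported 85% and 71%.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),           # binders correctly recovered
        "specificity": tn / (tn + fp),           # non-binders correctly rejected
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": tp / (tp + fp),             # predicted binders that are real
    }

# Hypothetical balanced counts: 100 binders, 100 non-binders.
metrics = classification_metrics(tp=85, fn=15, tn=71, fp=29)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this balanced assumption the accuracy is 0.78 and the precision is 85/114, roughly 0.746, which is in line with the reported 78% and 74%.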

GO and KEGG pathway enrichment

We use DAVID [25] to annotate the biological functions of the predicted targets of erchen decoction.

Network construction

Target-target (T-T) interactions are built by searching the STRING database. In STRING, each target-target interaction carries a confidence score, with thresholds for high (0.7), medium (0.4) and low (0.15) confidence. To ensure the accuracy of the obtained target-target interactions, we retain only those with a confidence score greater than or equal to 0.7. The compound-target network and the target-target network are displayed with Cytoscape 3.3 [26], a popular bioinformatics package for biological network visualization and data integration.
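
The confidence filter described above amounts to a simple threshold on the edge list, as in the sketch below. The edges and their scores are hypothetical and serve only as an illustration (the protein names are taken from elsewhere in this paper); note that STRING download files usually report scores on a 0 to 1000 scale, in which case the cutoff would be 700.

```python
HIGH_CONFIDENCE = 0.7

def high_confidence_edges(edges, cutoff=HIGH_CONFIDENCE):
    """Keep only interactions whose confidence score reaches the cutoff."""
    return [(a, b, score) for a, b, score in edges if score >= cutoff]

def degree(edges):
    """Count how many retained interactions each protein takes part in."""
    counts = {}
    for a, b, _ in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts

# Hypothetical scores for illustration only.
edges = [
    ("VDR", "USP10", 0.92),
    ("VDR", "SAMHD1", 0.70),
    ("USP10", "TBC1D1", 0.45),
    ("SAMHD1", "TBC1D1", 0.15),
]
kept = high_confidence_edges(edges)
print(kept)
print(degree(kept))
```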

Results and discussion

Identification of lead compounds through cross-species drug-likeness evaluation

Multicomponent quantitative analysis is one of the mainstream quality control methods for herbal medicines, since the ingredients of herbal materials are heterogeneous [27]. In this work, 493 chemical components of erchen decoction are extracted from our database TCMSP (http://lsp.nwu.edu.cn/tcmsp.php). TCMSP is a systems pharmacology platform for Chinese herbal medicines that captures the relationships between drugs, targets and diseases [15]. To efficiently remove compounds that are chemically unsuitable for veterinary drug discovery, we construct a cross-species drug-likeness evaluation method based on the Tanimoto coefficient [16] (see materials and methods). Here, DL reflects a complicated balance of diverse molecular properties and structural features, and indicates whether a particular molecule in erchen decoction is analogous to the known veterinary drugs listed by the FDA (http://www.fda.gov/). The filtering criterion is defined as DL ≥ 0.15, because the average DL value of all 333 veterinary drugs listed by the FDA is 0.15. In total, 126 of the 493 compounds have a favorable DL value; they are singled out and displayed in Table 1. Note that 48% (61/126) of these active agents have been reported in the literature (S4 Table). For example, baicalin in Pinellia ternate protects mice from Staphylococcus aureus pneumonia via inhibition of the cytolytic activity of α-hemolysin [28]. Cavidine possesses anti-inflammatory activity and has been used to treat various inflammatory diseases [29]. These results indicate that the DL prediction approach not only recovers known active ingredients but can also predict potential ones.
Table 1

Candidate active compounds.

No. | Herb | Compound Name | DL | MW | MLOGP | nHDon | nHAcc | RBN
M017 | Tuckahoe | Cerevisterol | 0.18 | 430.74 | 5.15 | 3 | 3 | 4
M020 | Tangerine Peel | beta-Citraurin | 0.16 | 432.7 | 7.1 | 1 | 2 | 9
M024 | Tuckahoe | 3-oxo-6, 16α-dihydroxylanosta-7, 9, 26-trien-23-oic acid | 0.27 | 502.81 | 5.01 | 3 | 5 | 6
M030 | Licorice | licorice-saponin B2 | 0.33 | 809.06 | 3.35 | 8 | 15 | 7
M033 | Pinellia ternate | soya-cerebroside ii_qt | 0.34 | 552 | 10.32 | 4 | 5 | 29
M034 | Pinellia ternate | soya-cerebroside I_qt | 0.34 | 552 | 10.32 | 4 | 5 | 29
M043 | Tuckahoe | poricoic acid DM | 0.25 | 528.8 | 4.98 | 3 | 6 | 11
M058 | Licorice | Licorice glycoside A | 0.78 | 726.74 | 1.81 | 8 | 16 | 14
M068 | Licorice | Gancaonin H | 0.16 | 420.49 | 4.71 | 3 | 6 | 3
M075 | Licorice | Isoglycyrol | 0.17 | 366.39 | 4.36 | 1 | 6 | 1
M079 | Licorice | Hispaglabridin B | 0.18 | 390.51 | 5 | 1 | 4 | 1
M085 | Licorice | kanzonols L | 0.23 | 488.62 | 6.57 | 3 | 6 | 5
M090 | Pinellia ternate | campesterol | 0.15 | 400.76 | 7.97 | 1 | 1 | 5
M114 | Tangerine Peel | naringin | 0.39 | 580.59 | -0.47 | 8 | 14 | 6
M118 | Tuckahoe | poricoic acid D | 0.23 | 514.77 | 4.73 | 4 | 6 | 10
M123 | Tuckahoe | Ergosterol | 0.15 | 396.72 | 6.93 | 1 | 1 | 4
M136 | Licorice | Araboglycyrrhizin | 0.39 | 779.03 | 2.72 | 7 | 14 | 6
M140 | Tuckahoe | Dehydroeburiconic acid | 0.23 | 466.77 | 6.85 | 1 | 3 | 6
M149 | Licorice | 18α-hydroxyglycyrrhetic acid | 0.33 | 486.76 | 4.55 | 3 | 5 | 1
M154 | Licorice | Xambioona | 0.17 | 388.49 | 4.68 | 0 | 4 | 1
M155 | Licorice | Vicenin-2 | 0.35 | 594.57 | -2.45 | 11 | 15 | 5
M175 | Licorice | Liquiritin apioside | 0.36 | 550.56 | -0.75 | 7 | 13 | 7
M187 | Tuckahoe | polysaccharides | 0.17 | 504.5 | -6.01 | 11 | 16 | 7
M188 | Tuckahoe | Beta-Glucan | 0.17 | 504.5 | -6.01 | 11 | 16 | 7
M207 | Tuckahoe | poricoic acid H | 0.21 | 500.79 | 6.39 | 3 | 5 | 10
M214 | Licorice | licorice glycoside E | 0.59 | 693.71 | 1.59 | 7 | 14 | 10
M222 | Licorice | licorice-saponin G2_qt | 0.34 | 486.76 | 4.4 | 3 | 5 | 2
M223 | Licorice | 24-Hydroxyglycyrrhetic acid | 0.34 | 486.76 | 4.4 | 3 | 5 | 2
M225 | Licorice | Docosyl caffeate | 0.29 | 488.83 | 11.16 | 2 | 4 | 24
M238 | Pinellia ternate | Stigmasterol | 0.17 | 412.77 | 7.64 | 1 | 1 | 5
M243 | Licorice | Artonin E | 0.18 | 436.49 | 4.67 | 4 | 7 | 3
M249 | Licorice | licorice-saponin F3_qt | 0.34 | 454.76 | 5.93 | 1 | 3 | 0
M267 | Pinellia ternate | soya-cerebroside I | 0.62 | 714.16 | 8.57 | 7 | 10 | 32
M268 | Pinellia ternate | soya-cerebroside ii | 0.61 | 714.16 | 8.57 | 7 | 10 | 32
M289 | Licorice | rutin | 0.46 | 610.57 | -1.45 | 10 | 16 | 6
M290 | Pinellia ternate | Baicalin | 0.16 | 446.39 | 0.64 | 6 | 11 | 4
M297 | Licorice | Kanzonol Z | 0.15 | 406.51 | 4.93 | 2 | 5 | 3
M299 | Licorice | licorice-saponin K2 | 0.33 | 823.04 | 2.01 | 9 | 16 | 8
M310 | Licorice | 2’,7-Dihydroxy-4’-methoxyisoflavan-7-O-β-d-glucopyranoside | 0.15 | 434.48 | 0.59 | 5 | 9 | 5
M317 | Licorice | glycyrrhetol | 0.28 | 456.78 | 5.28 | 2 | 3 | 1
M328 | Licorice | Astragalin | 0.16 | 448.41 | -0.32 | 7 | 11 | 4
M331 | Licorice | Kanzonol H | 0.17 | 424.58 | 6.1 | 2 | 5 | 4
M334 | Pinellia ternate | Cavidine | 0.16 | 353.45 | 3.72 | 0 | 5 | 2
M340 | Licorice | 11-deoxyglycyrrhetic acid | 0.29 | 456.78 | 6.42 | 2 | 3 | 1
M343 | Tangerine Peel | Limonin | 0.36 | 470.56 | 1.42 | 0 | 8 | 1
M352 | Licorice | uralsaponin B | 0.33 | 823.04 | 2.42 | 8 | 16 | 7
M355 | Licorice | 24-Hydroxy-11-deoxyglycyrrhetic acid | 0.29 | 458.75 | 5.13 | 3 | 4 | 1
M366 | Tuckahoe | Polyporenic acid C | 0.25 | 482.77 | 5.68 | 2 | 4 | 6
M367 | Tuckahoe | (16α)-16-Hydroxy-24-methylene-3-oxolanosta-7,9(11)-dien-21-oic acid | 0.25 | 482.77 | 5.68 | 2 | 4 | 6
M369 | Tangerine Peel | Tangeraxanthin | 0.26 | 484.78 | 7.65 | 1 | 2 | 10
M376 | Licorice | 4H-1-Benzopyran-4-one, 2-(4-(beta-D-glucopyranosyloxy)phenyl)-2,3-dihydro-5,7-dihydroxy-, (2S)- | 0.17 | 434.43 | 0.39 | 6 | 10 | 4
M381 | Tuckahoe | Poricoic acid A | 0.21 | 498.77 | 5.94 | 3 | 5 | 10
M392 | Pinellia ternate | beta-Sitosterol | 0.17 | 414.79 | 8.08 | 1 | 1 | 6
M394 | Tangerine Peel | beta-Sitosterol | 0.17 | 414.79 | 8.08 | 1 | 1 | 6
M395 | Tuckahoe | Daucosterol_qt | 0.17 | 414.79 | 8.08 | 1 | 1 | 6
M399 | Tuckahoe | 3-epidehydrotumulosic acid | 0.25 | 484.79 | 5.72 | 3 | 4 | 6
M400 | Tuckahoe | dehydrotumulosic acid | 0.25 | 484.79 | 5.72 | 3 | 4 | 6
M424 | Licorice | glycyrrhizin | 0.33 | 823.04 | 2.42 | 8 | 16 | 7
M425 | Licorice | glycyrrhizic acid | 0.33 | 823.04 | 2.42 | 8 | 16 | 7
M426 | Licorice | licorice-saponin H2 | 0.33 | 823.04 | 2.42 | 8 | 16 | 7
M451 | Licorice | Ononin | 0.16 | 430.44 | 0.68 | 4 | 9 | 5
M454 | Tuckahoe | Oleanolic acid | 0.28 | 456.78 | 6.42 | 2 | 3 | 1
M455 | Licorice | 3,22-Dihydroxy-11-oxo-delta(12)-oleanene-27-alpha-methoxycarbonyl-29-oic acid | 0.42 | 512.75 | 4.37 | 1 | 6 | 2
M462 | Licorice | licorice-saponin H2_qt | 0.31 | 470.76 | 5.49 | 2 | 4 | 1
M463 | Licorice | 18β-glycyrrhetic acid | 0.31 | 470.76 | 5.49 | 2 | 4 | 1
M464 | Licorice | apioglycyrrhizin_qt | 0.31 | 470.76 | 5.49 | 2 | 4 | 1
M465 | Licorice | Araboglycyrrhizin_qt | 0.31 | 470.76 | 5.49 | 2 | 4 | 1
M466 | Licorice | glycyrrhetinic acid | 0.31 | 470.76 | 5.49 | 2 | 4 | 1
M471 | Licorice | violanthin | 0.32 | 578.57 | -1.56 | 10 | 14 | 4
M479 | Tuckahoe | Trametenolic acid | 0.21 | 456.78 | 7.03 | 2 | 3 | 5
M480 | Licorice | isoglabrolide | 0.35 | 468.74 | 5.15 | 1 | 4 | 0
M484 | Tuckahoe | 25-hydroxy-3-epidehydrotumulosic acid | 0.28 | 514.82 | 4.78 | 4 | 5 | 6
M491 | Licorice | liquoric acid | 0.37 | 484.74 | 4.05 | 2 | 5 | 1
M493 | Licorice | schaftoside | 0.38 | 596.54 | -1.5 | 10 | 16 | 6
M496 | Tuckahoe | pachyman | 0.16 | 500.56 | -4.27 | 9 | 14 | 7
M497 | Licorice | licuraside | 0.31 | 550.56 | -0.41 | 8 | 13 | 9
M509 | Tuckahoe | Daucosterol | 0.49 | 576.95 | 6.34 | 4 | 6 | 9
M511 | Pinellia ternate | Daucosterol | 0.49 | 576.95 | 6.34 | 4 | 6 | 9
M523 | Tuckahoe | Poricoic acid B | 0.19 | 484.74 | 5.64 | 3 | 5 | 9
M543 | Licorice | glyasperin E | 0.16 | 444.51 | 6.41 | 2 | 6 | 6
M546 | Tuckahoe | Dehydroeburicoic acid | 0.23 | 468.79 | 6.89 | 2 | 3 | 6
M547 | Pinellia ternate | Cycloartenol | 0.21 | 426.8 | 7.55 | 1 | 1 | 4
M561 | Licorice | 3β-formylglabrolide | 0.41 | 496.75 | 5.33 | 0 | 5 | 2
M563 | Licorice | Isoschaftoside | 0.3 | 564.54 | -1.94 | 10 | 14 | 4
M565 | Licorice | Hirsutrin | 0.18 | 464.41 | -0.59 | 8 | 12 | 4
M597 | Licorice | (-)-Medicocarpin | 0.23 | 432.46 | 0.75 | 4 | 9 | 4
M598 | Pinellia ternate | TRIPALMITIN | 0.84 | 807.49 | 19.52 | 0 | 6 | 50
M609 | Tuckahoe | (3β)-3-Hydroxylanosta-7,9(11),24-trien-21-oic acid | 0.21 | 454.76 | 6.58 | 2 | 3 | 5
M611 | Tuckahoe | poricoic acid C | 0.19 | 482.77 | 7.11 | 2 | 4 | 10
M612 | Licorice | Mairin | 0.27 | 456.78 | 6.52 | 2 | 3 | 2
M625 | Tuckahoe | Ergosta-7,22-dien-3-ol | 0.15 | 398.74 | 7.18 | 1 | 1 | 4
M633 | Tangerine Peel | hesperidin | 0.48 | 610.62 | -0.48 | 8 | 15 | 7
M646 | Licorice | apioglycyrrhizin | 0.39 | 779.03 | 2.54 | 7 | 14 | 7
M663 | Licorice | Nicotiflorin | 0.43 | 594.57 | -1.18 | 9 | 15 | 6
M668 | Pinellia ternate | Stigmast-4-en-3-one | 0.17 | 412.77 | 8.18 | 0 | 1 | 6
M669 | Tuckahoe | dehydropachymic acid | 0.32 | 526.83 | 6.1 | 2 | 5 | 8
M685 | Licorice | licorice-saponin J2 | 0.33 | 825.06 | 2.26 | 9 | 16 | 8
M690 | Licorice | glabrolide | 0.36 | 468.74 | 5 | 1 | 4 | 0
M693 | Pinellia ternate | 1,2,3,4,6-Pentagalloyl glucose | 0.57 | 1092.83 | 4.59 | 17 | 30 | 19
M696 | Licorice | Kanzonol F | 0.25 | 420.54 | 5.3 | 1 | 5 | 3
M700 | Licorice | licorice-saponin C2_qt | 0.29 | 454.76 | 6.17 | 2 | 3 | 1
M704 | Licorice | glycyroside | 0.39 | 562.57 | -0.73 | 6 | 13 | 8
M708 | Tangerine Peel | Neohesperidin | 0.45 | 610.62 | -0.48 | 8 | 15 | 7
M709 | Licorice | Isoviolanthin | 0.32 | 578.57 | -1.56 | 10 | 14 | 4
M712 | Licorice | 6″-O-acetylliquiritin | 0.18 | 444.47 | 2.33 | 3 | 9 | 5
M728 | Tuckahoe | Eburicoic acid | 0.23 | 470.81 | 7.33 | 2 | 3 | 6
M732 | Licorice | Narcissoside | 0.5 | 624.6 | -1.2 | 9 | 16 | 7
M738 | Tuckahoe | beta-Amyrin acetate | 0.31 | 468.84 | 7.68 | 0 | 2 | 2
M756 | Tuckahoe | pachymic acid | 0.32 | 528.85 | 6.54 | 2 | 5 | 8
M767 | Tuckahoe | poricoic acid G | 0.19 | 486.76 | 6.08 | 3 | 5 | 9
M768 | Pinellia ternate | (+)-Isolariciresinol monoglucoside | 0.21 | 522.6 | 0.35 | 7 | 11 | 7
M770 | Licorice | 22β-acetylglabric acid | 0.42 | 528.8 | 4.77 | 2 | 6 | 3
M772 | Licorice | licorice-saponin J2_qt | 0.31 | 472.78 | 5.33 | 3 | 4 | 2
M776 | Pinellia ternate | Ergosterol peroxide | 0.2 | 428.72 | 6.73 | 1 | 3 | 4
M777 | Tuckahoe | Ergosterol peroxide | 0.2 | 428.72 | 6.73 | 1 | 3 | 4
M781 | Licorice | licorice-saponin G2 | 0.32 | 839.04 | 1.33 | 9 | 17 | 8
M783 | Licorice | Ursolic acid | 0.28 | 456.78 | 6.47 | 2 | 3 | 1
M794 | Licorice | 1-Methoxyficifolinol | 0.19 | 422.56 | 6.1 | 2 | 5 | 5
M803 | Tuckahoe | 29-hydroxypolyporenic acid C | 0.27 | 498.77 | 4.59 | 3 | 5 | 7
M807 | Licorice | licorice-saponin C2 | 0.33 | 807.04 | 3.1 | 8 | 15 | 7
M828 | Tuckahoe | Tumulosic acid | 0.25 | 486.81 | 6.16 | 3 | 4 | 6
M836 | Licorice | Morusin | 0.16 | 420.49 | 4.94 | 3 | 6 | 3
M844 | Licorice | licorice-saponin K2_qt | 0.31 | 470.76 | 5.08 | 3 | 4 | 2
M860 | Tuckahoe | (3β,16α,17α)-3,16-Dihydroxylanosta-7,9(11),24-trien-21-oic acid | 0.23 | 470.76 | 5.41 | 3 | 4 | 5
M864 | Licorice | Isoononin | 0.17 | 430.44 | 0.68 | 4 | 9 | 5
M895 | Pinellia ternate | valeraldoxime | 0.31 | 514.72 | 2.29 | 2 | 7 | 6
MW, nHAcc, nHDon, MlogP and RBN are the main pharmacophoric features that influence the behavior of a molecule in a living organism, including bioavailability, transport properties, affinity to proteins, reactivity, toxicity, metabolic stability and many others [30]. Therefore, to further test the validity and precision of the cross-species DL evaluation method, we compare these chemical properties of the potential active ingredients in erchen decoction with those of 126 randomly selected molecules in the TCMSP database. The distributions of the five features differ between the two types of ligands (Fig 2). Specifically, a majority of the potential active compounds in erchen decoction have very low molecular weights in comparison to the ligands in TCMSP, presumably because very small solvent molecules are often bound in proteins. Meanwhile, considerably more erchen decoction ingredients (40% more) than TCMSP ligands fulfil the Lipinski "Rule of five" with respect to molecular weight. The same applies to RBN: 23% more active compounds in erchen decoction fulfil the Lipinski "Rule of five". A large percentage of active compounds in erchen decoction (90%) have fewer than 10 nHAcc, which is similar to the TCMSP ligands. Meanwhile, a small fraction (18%) of the erchen decoction ligands have ten to twenty of them, whereas hardly any TCMSP ligands meet this condition. Interestingly, this distribution also applies to nHDon. Most potential active compounds in erchen decoction (70%) have an MlogP value around 5, whereas the MlogP values of the TCMSP ligands accumulate around 10. Approximately 30% of the TCMSP ligands are "drug-like" according to the Lipinski "Rule of five", that is, have an MlogP value less than 5. These results indicate that the cross-species DL evaluation method can reliably screen potential active ingredients.
Fig 2

Statistics: Comparison of erchen decoction potential active compounds with an equal number of compounds in the TCMSP database.

Chemical properties of these two types of molecules are compared: distributions of molecular weight (MW), octanol-water partition coefficient (MlogP), numbers of hydrogen bond donors and acceptors (nHDon and nHAcc), and number of rotatable bonds (RBN) are shown.

Prediction of target proteins through the cross-species drug-target (CSDT) interaction assessment model

In elucidating the pharmacological activities of the filtered active ingredients in erchen decoction, knowledge of potential targets is of the highest importance and remains an ongoing focus of drug discovery efforts [31]. In silico prediction of such interactions helps to improve the efficiency of the laborious and costly experimental determination of drug-target interactions [32, 33]. However, limited by the scope of their training datasets, in both chemical and biological space, current drug-target interaction prediction models, especially ligand-based methods, are all geared towards predicting new targets of human drugs. Thus, there is still no target prediction model available for veterinary drugs. To obtain the target proteins of the filtered active ingredients, we build a random forest [18] target prediction model, which expands the predicted protein scope to all Swiss-Prot entries in the UniProt database [19], comprising 549,649 sequences from 13,241 species, including eukaryotes, prokaryotes and viruses (see materials and methods). The algorithm is based on the extraction of conserved patterns from subdivided drug-target interaction vectors. The advantage of this model is that it takes proteins of different species into account and can thus predict targets across a broad spectrum of species on a large scale. Indeed, a similar model from our previous work has been successfully applied to human target protein prediction [34]. To evaluate the reliability of CSDT, we further compare its AUC with those of the BATMAN-TCM [35] and HGBI [36] methods. Although the other two models outperform CSDT, CSDT covers a wider range of species, which aids target prediction for VHM. Thus, we conclude that the target prediction model in this work is reliable for predicting targets related to bovine pneumonia. 
In addition, to make the target set of the active ingredients in erchen decoction as comprehensive as possible, we further introduce the WES algorithm [24]. WES quantitatively evaluates whether a molecule will directly bind to a target, based on the weighted structural and physicochemical features it shares with known ligands of the target. In total, we obtain 5,219 targets for the 126 active ingredients. Considering that the focus of our work is on therapeutic targets for bovine pneumonia, we further restrict the species to Bovin (Bos taurus), STRP1 (Streptococcus pyogenes) and STRPN (Streptococcus pneumoniae), which results in 448 targets (S5 Table). To verify whether the 448 screened targets are closely related to bovine pneumonia, we perform GO biological process enrichment separately for the three groups of targets using DAVID [25] and visualize the results with Enrichment Map [37] at a threshold of P-value ≤ 0.01. The Enrichment Map Cytoscape plugin allows us to visualize the results of target-set enrichment as a network. GO analysis of the Bovin targets reveals that the GO terms ‘inflammatory response’ and ‘immune system process’ are significantly enriched (Fig 3 and S6 Table). Interestingly, the inflammatory response is responsible for the majority of the pulmonary damage [38]. The importance of ‘immune system process’ in curing bacterial pneumonia is clearly demonstrated in experimental models of bovine pneumonia [39]. Erchen decoction not only targets the proteins of Bovin but also acts on the proteins of bacteria (STRP1 and STRPN). For the targets of STRP1 and STRPN, the main biological processes are ‘translation’, ‘tRNA metabolic process’, ‘nucleotide-excision repair’, ‘amino acid activation’, ‘tRNA aminoacylation’ and ‘ncRNA metabolic process’ (S7 Table and S8 Table). These are cellular and metabolic processes, mainly involved in cell cycle regulation. These results suggest that erchen decoction has antibacterial activity. 
Taken together, the results suggest that erchen decoction acts both directly, by inhibiting the proliferation of pathogenic bacteria through proteins essential to the bacterial life cycle, and indirectly, by suppressing bacterial infection through strengthening the bovine immune system.
Fig 3

The GO biological process enrichment analysis of Bovine targets.

Recognition of multi-target interference effects by heterogeneous network convergence and modularization analysis

To identify the interrelated target set of each active ingredient in erchen decoction, we perform heterogeneous network convergence and modularization analysis. Network convergence is the efficient coexistence of heterogeneous data communication within a single network. Modularization analysis helps to find functionally closely related information in a biological network. First, to discover the most promising lead compounds and decipher the mechanism of action of erchen decoction, we generate two levels of networks: a Compound-Target network (C-T network) and a Target-Target network (T-T network). S5 Table shows a detailed view of the C-T interactions, which comprise 126 active compounds and 448 candidate targets from Bovin, STRP1 and STRPN connected through 1,773 interactions. Among them, proteins such as VDR and USP10 connect with more than 13 compounds and can be labeled as hub targets. These results indicate that the distribution of the compounds is extremely inhomogeneous. Thus, interventions on multiple targets are beneficial to recovery from bovine pneumonia. T-T interactions are built by searching the STRING database [40] with a required confidence score of at least the high-confidence threshold of 0.7. The STRING database contains protein interactions from numerous sources, including experimental data, computational prediction methods and public text collections, and can be regarded as a collection of functional protein association networks. S9 Table provides a comprehensive view of the cross-species target space, which consists of 448 nodes and 696 edges. About two-thirds of the targets are regulated by at least 10 proteins, indicating the close relationship among them. 
Then, we converge and modularize the aforementioned heterogeneous C-T and T-T networks using the Markov Cluster Algorithm (MCL) [41], as implemented in clusterMaker2, in order to uncover the pharmacological correlation among the target proteins of a given compound. ClusterMaker is a Cytoscape plugin that unifies different clustering techniques and displays in a single interface. MCL is a fast and scalable unsupervised cluster algorithm for graphs (also known as networks) based on simulation of (stochastic) flow in graphs. As a result, these interactions are mainly assigned to 11 modules, each containing at least 12 targets. Further, we analyze the chemical characteristics of the molecules and proteins within the same modules to help us understand the multi-target interference effects of erchen decoction. By applying the Tanimoto similarity with CDK fingerprints, we compare the similarity of molecules within the same module with that of molecules in different modules. The result shows that the mean similarity of molecules in the same module (0.57) is higher than that between modules (0.35) (one-tailed Student's t-test P-value = 2.3E-213, S4A Fig). As an example, the molecules in module 1 that target the TBC1D1 protein align well with the pharmacophore model and with one another (S10 Table and S5 Fig). Similarly, to search for the common features of the proteins from the same module, we compare the similarity of protein sequences within the same module and between modules using the Smith–Waterman sequence alignment method. The similarity score is normalized by dividing it by the geometric mean of the scores obtained from aligning each protein against itself. We observe that the mean sequence similarity of proteins in the same module (0.031) is higher than that between modules (0.021) (one-tailed Student's t-test P-value = 1.45E-114, S4B Fig). These findings suggest that molecules with similar structures tend to target similar proteins. 
Finally, we annotate these modules by enriching the GO biological processes and KEGG pathways of each module with DAVID. We find that these modules are associated with inflammation, immunity, and apoptosis (Fig 4B, S11 Table). We present in detail two of the converged modules (Module 1 and Module 7) (Fig 4A), selected to show the method's ability to reproduce diverse features of these compound-target interactions.
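The idea behind term enrichment can be illustrated with a plain hypergeometric upper-tail test. This is a simplified stand-in and not DAVID's exact scoring procedure; the function name and the argument names are assumptions made for the example.

```python
from math import comb


def enrichment_p_value(background, annotated, module_size, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    background   total number of genes considered
    annotated    genes in the background that carry the term
    module_size  genes in the module being tested
    hits         genes in the module that carry the term
    """
    total = comb(background, module_size)
    upper = min(annotated, module_size)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, module_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total
```

A small p-value indicates that the term occurs in the module more often than expected from a random draw of the same size. In practice the p-values of all tested terms would also need a multiple-testing correction.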
Fig 4

Illustration of heterogeneous network convergence and modularization analysis.

(A) Global view of the modularized set of relationships among potential active compounds and their predicted protein targets. Module 1 and Module 7 are enlarged sub-modules of the global network. A compound node and a target protein node are linked if the protein is targeted by the corresponding compound. Likewise, links are placed between targets if they are functionally associated. Yellow nodes represent active compounds in erchen decoction. Green, blue and gray nodes indicate targets of Streptococcus pyogenes, Streptococcus pneumoniae and Bovine, respectively.

Module 1

Module 1 reflects the 187 interactions between 18 molecules and 87 targets. The 87 targets come not only from Bovine but also from the two species STRP1 and STRPN. Among them, the overwhelming majority of targets are from Bovine (95.4%, 83/87). Thus we annotate the biological processes and KEGG pathways of Module 1 (S11 Table) using these Bovine targets. For biological processes, the top 5 are proteolysis involved in cellular protein catabolic process, cellular protein catabolic process, protein catabolic process, regulation of small GTPase mediated signal transduction, and cellular macromolecule catabolic process. These biological processes are largely associated with protein catabolism. Interestingly, it has been reported that in lung disorders the inflammatory mediators produce direct lung damage and cause catabolism or protein degradation [42]. Therefore, the molecules in Module 1 may be therapeutic for Bovine pneumonia by acting on these functionally related target proteins. For example, Vicenin-2 (M155), a flavonoid glycoside, is a potential anti-inflammatory constituent of Licorice [43]. Inflammatory stimuli increase SAMHD1 [44], which is a target protein of Vicenin-2. In addition, through a literature search, we also observe Bovine pneumonia-associated biological functions for the targets in Module 1 that belong to other species. For example, Cas9, a target from STRP1, can mediate bacterial immunity. The KEGG pathway enrichment shows that the MAPK signaling pathway, Inositol phosphate metabolism, Ubiquitin mediated proteolysis, Arrhythmogenic right ventricular cardiomyopathy (ARVC), and Cardiac muscle contraction pathways play important roles in Module 1. For example, the MAPK signaling pathway (Fig 5) is a chain of proteins that plays a key role in anti-inflammatory therapy [45]. Members of the Inositol phosphate metabolism pathway are a group of mono- to polyphosphorylated inositols [46]. They play crucial roles in diverse cellular functions, such as cell growth, apoptosis, cell migration, endocytosis, and cell differentiation [47]. Ubiquitin mediated proteolysis is involved in the degradation of native cellular proteins [48].
Fig 5

Mechanism of action of erchen decoction against Bovine pneumonia, illustrated by Module 1.

Module 7

Module 7 is an example of a converged module that consists primarily of Bovine proteins (17 of 28) and STRP1 targets (9 of 28), together with two STRPN targets. Given the number of targets from each species, we separately enrich the biological processes and KEGG pathways of the Bovine (S11 Table) and STRP1 (S12 Table) targets in Module 7. The results show that these two categories of targets participate in diverse metabolic pathways and cellular roles. The Bovine targets in Module 7 are mainly involved in programmed cell death, cell death, death, apoptosis, and positive regulation of cellular component organization. Thus, molecules in Module 7 are closely associated with the regulation of apoptosis. This result is supported by a recent study showing that Chlamydia pneumoniae induces T cell apoptosis through glutathione redox imbalance and secretion of TNF-alpha [49]. In particular, Beta-Glucan (M188), which is derived from Tuckahoe, was predicted to target ACTA1. Beta-Glucan has been reported to inhibit the growth of bacteria, viruses, and fungi [50], to stimulate macrophages as an immune enhancer [51] and to enhance apoptosis in human colon cancer SNU-C4 cells [52]. Interestingly, mutations in the ACTA1 gene have been linked to cell death [53]. We also evaluate the 9 STRP1 targets to test whether the proteins encoded by genes in the same module have related functions. These targets are involved in many biological processes, such as translation, amino acid activation, tRNA aminoacylation for protein translation, tRNA aminoacylation, and tRNA metabolic process. In brief, these biological processes are all relevant to cell metabolism. Changes in metabolic processes have been shown to play a critical role in the survival or death of cells subjected to various stresses [54]. Thus, although the targets in the same module belong to different species, they still share related functions.
The enriched KEGG pathways of the STRP1 targets are Ribosome, Aminoacyl-tRNA biosynthesis, and Valine, leucine and isoleucine biosynthesis. Interfering with these pathways affects protein metabolism, and deregulation of proteostasis results in protein stress and damage that may cause cell death [55]. Hence, KEGG pathway enrichment also reflects the function of the module. Together, these results indicate that the strategy in this study can capture the cellular response to multi-target interference.

Conclusions

VHM is a holistic approach that is suited to evaluating the well-being of the whole animal, and its treatments are commonly non-invasive with few side effects. Although relatively new to the Western world, it is a health care system that has been used in China to treat animals for thousands of years. It is an adaptation and extension of Traditional Chinese Medicine used to heal humans. However, VHM lacks the tools necessary to identify the lead compounds that are effective in treating animal illness. As a group with strengths in computational technology, we first gravitate toward methods such as systems pharmacology [56] [24] that mine databases or construct models for clues. The application of bioinformatics approaches enables us to elucidate the therapeutic effects of drugs at multiple scales of biological organization (the organ and organismal levels) through network analyses. There have also been a few examples of successful integration of different procedures to help determine the action mechanism of a small molecule [57] [58]. In this study, to clarify the procedure of veterinary drug discovery from herbal medicines, a cross-species chemogenomic platform was proposed. First, we build a cross-species drug-likeness evaluation approach to screen the lead compounds in veterinary medicines, based on critically examined pharmacology and text mining. We observe that erchen decoction can treat animal pneumonitis through multicomponent therapeutics. Furthermore, we compare the chemical properties of these molecules with those of an equal number of randomly selected molecules. The results demonstrate that the constructed cross-species DL evaluation method is reliable for screening potentially active molecules. Second, to understand how drugs work on the specific targets, a specific cross-species target prediction model (CSDT) is developed to infer drug-target connections.
In addition, by enriching the GO biological processes of these targets, we find that all the biological processes of the targets are physiologically relevant. Thus, we can speculate that the active compounds in erchen decoction exert their therapeutic effect by interfering with a network of functionally associated targets. To determine whether the therapeutic activity could be attributed to selective functional modules in the target network, we subsequently perform heterogeneous network convergence and modularization analysis. Interestingly, the results of this analysis support our hypotheses. Finally, we manually characterize an integrated pathway to test whether the cross-species chemogenomic platform could uncover the active mechanism of veterinary medicine, which is exemplified by a network module. The cross-species chemogenomic platform shows how powerful the ability to effectively and systematically integrate large sets of disparate data will be in discovering new drugs and understanding the molecular mechanisms of a small molecule in biological systems. When done in a disciplined and thoughtful manner, such data integration characterizes a modern instantiation of the scientific approach, depending on high-throughput biotechnology, data consolidation and multidisciplinary tactics to offer hints and avenues to new targets and mechanisms of small-molecule action.

DL distribution of FDA-approved veterinary drugs. (TIF)

The flowchart of the CSDT model. (TIF)

ROC (Receiver Operating Characteristic) plot of the CSDT model. (TIF)

(A) The frequency histogram of molecule similarity between modules (blue) and within modules (brown). First, the similarity among molecules in the same module is calculated by applying the Tanimoto similarity to their CDK fingerprints. Then, using the same method, we evaluate the molecular similarity between modules by comparing the molecules in different modules. The result shows that the mean similarity of molecules in the same module (0.57) is higher than that between modules (0.35) (one-tailed Student's t-test, P-value = 2.3E-213). (B) The frequency histogram of target sequence similarity between modules (blue) and within modules (brown). The sequence similarity between two targets is calculated based on the Smith–Waterman sequence alignment score. The similarity score is normalized by dividing it by the geometric mean of the scores obtained by aligning each protein against itself. The result shows that the mean sequence similarity of proteins in the same module (0.031) is higher than that between modules (0.021) (one-tailed Student's t-test, P-value = 1.45E-114). (TIF)

The alignment of compounds on the best pharmacophore model for the protein TBC1D1 in module 1. (TIF)

Descriptors used to calculate DL. (XLS)

The drug-target interactions in DrugBank used to build the CSDT model. (XLS)

Descriptors used to construct the CSDT model. (XLS)

Reference validation of candidate active compounds. (XLS)

Compound-Target interactions. (XLS)

GOBP and KEGG pathway enrichment results of Bos taurus targets. (XLS)

GOBP and KEGG pathway enrichment results of Streptococcus pyogenes targets. (XLS)

GOBP and KEGG pathway enrichment results of Streptococcus pneumoniae targets. (XLS)

Target-target interactions. (XLS)

The values of SPECIFICITY, N_HITS, FEATS and PARETO for the 20 pharmacophore models of protein TBC1D1. (DOCX)

GOBP and KEGG pathway enrichment results of modules 1–11. (XLS)

GOBP and KEGG pathway enrichment results of STRP1 targets in module 7. (XLS)
References

1.  M Kanehisa; S Goto. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000-01-01.

2.  S L Schreiber. Target-oriented and diversity-oriented organic synthesis in drug discovery. Science. 2000-03-17.

3.  A Ciechanover; A Orian; A L Schwartz. Ubiquitin-mediated proteolysis: biological regulation via destruction. Bioessays. 2000-05.

4.  A L Harvey. Medicines from nature: are natural products still relevant to drug discovery? Trends Pharmacol Sci. 1999-05.

5.  Mahboob H Qureshi; Allen G Harmsen; Beth A Garvy. IL-10 modulates host responses and lung damage induced by Pneumocystis carinii infection. J Immunol. 2003-01-15.

6.  Luo Fang; Guonong Yang; Yu Song; Fanzhu Li; Nengming Lin. Application of isoabsorption plots generated by high-performance liquid chromatography with diode array detection to the development of multicomponent quantitative analysis of traditional herbal medicines. J Sep Sci. 2014-09-25.

7.  Zai-Bao Zhang; Guang Yang; Fernando Arana; Zhen Chen; Yan Li; Hui-Jun Xia. Arabidopsis inositol polyphosphate 6-/3-kinase (AtIpk2beta) is involved in axillary shoot branching via auxin signaling. Plant Physiol. 2007-04-13.

8.  Rocky Graziose; Mary Ann Lila; Ilya Raskin. Merging traditional Chinese medicine with modern drug discovery technologies to find novel drugs and functional foods. Curr Drug Discov Technol. 2010-03.

9.  Xiaoli Wang; Colins O Eno; Brian J Altman; Yanglong Zhu; Guoping Zhao; Kristen E Olberding; Jeffrey C Rathmell; Chi Li. ER stress modulates cellular metabolism. Biochem J. 2011-04-01.

10.  Laura Kapitzky; Pedro Beltrao; Theresa J Berens; Nadine Gassner; Chunshui Zhou; Arthur Wüster; Julie Wu; M Madan Babu; Stephen J Elledge; David Toczyski; R Scott Lokey; Nevan J Krogan. Cross-species chemogenomic profiling reveals evolutionarily conserved drug mode of action. Mol Syst Biol. 2010-12-21.