
HIT: linking herbal active ingredients to targets.

Hao Ye, Li Ye, Hong Kang, Duanfeng Zhang, Lin Tao, Kailin Tang, Xueping Liu, Ruixin Zhu, Qi Liu, Y Z Chen, Yixue Li, Zhiwei Cao.

Abstract

Information on protein targets and small molecules is highly valued in biomedical and pharmaceutical research. Several protein target databases are available online for FDA-approved drugs and promising precursors, and they have greatly facilitated mechanistic studies and subsequent drug discovery research. However, comparable resources for herbal active ingredients are rarely found, even though these ingredients are regarded as a precious resource for new drug development. In this article, a comprehensive and fully curated database for Herb Ingredients' Targets (HIT, http://lifecenter.sgst.cn/hit/) has been constructed to complement the above resources. Herbal ingredients with protein target information were carefully curated. The molecular target information covers proteins that are directly or indirectly activated or inhibited, protein binders, and enzymes whose substrates or products are those compounds. Genes that are up- or down-regulated under treatment with individual ingredients are also included. In addition, the experimental conditions, observed bioactivity and supporting references are provided. Derived from more than 3250 publications, HIT currently contains 5208 entries about 1301 known protein targets (221 of them described as direct targets) affected by 586 herbal compounds from more than 1300 reputable Chinese herbs. These targets overlap with 280 therapeutic targets from the Therapeutic Targets Database (TTD) and with 445 protein targets from DrugBank corresponding to 1488 drug agents. The database can be queried via keyword search or similarity search. Crosslinks have been made to TTD, DrugBank, KEGG, PDB, Uniprot, Pfam, NCBI, TCM-ID and other databases.

Year:  2010        PMID: 21097881      PMCID: PMC3013727          DOI: 10.1093/nar/gkq1165

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Interactions between small molecules and proteins play a critical role in modulating intrinsic biological processes. One particular application is the discovery of druggable molecules based on their interactions with target proteins. Target proteins are often those that play important roles in the development of specific diseases within the organism. Perturbing their functions with druggable molecules can help to cure the disease or relieve its symptoms. Therefore, information on protein targets and small molecules has always been highly valued in the biomedical and pharmaceutical sciences. During the last decade, several drug–target interaction databases have been made available online, and they have greatly facilitated mechanistic studies and subsequent drug discovery research. For instance, the Therapeutic Targets Database (TTD) (1) is the first therapeutic target database; it catalogs known and explored therapeutic protein and nucleic acid targets, together with related information on the corresponding drugs directed at each of these targets. Another important resource is DrugBank (2), a unique database that links detailed drug data to comprehensive drug target information. Such information has led to the integration of further resources and computational methods, such as PDTD (3), TarFisDock (4), STITCH (5) and others (6–9), which have served as valuable platforms for target identification, validation and the study of drug actions. Herbal ingredients have long been viewed as precious sources by the bio-pharmaceutical sciences because of not only their broad structural diversity but also their wide range of pharmacological activities and comparatively low side effects. It is estimated that approximately one-third (10) of the top-selling drugs in the world are derived from medicinal herbs. A well-known example is artemisinin from Artemisia annua, used to treat malaria.
In contrast to the well-organized compound–target information for western drugs, similar information for herbal ingredients is rarely found, perhaps partly because of the complicated nature of herbal medicine. To the authors' knowledge, only one database (11) reports protein targets for natural compounds (78 protein targets for 2597 compounds), and it clearly needs updating. Meanwhile, substantial resources have been invested in investigating what the potential targets are for promising herbal ingredients with particular pharmaceutical effects, or whether a synthesized compound has a target profile similar to that of any active compound from herbal plants. As pharmacological activity can be inferred from related herbs, linking herbal ingredients to their protein targets may help to bridge information between natural products and western drugs via protein targets. Therefore, we introduce here a fully curated database for Herb Ingredients’ Targets (HIT), which focuses on the available links from single herbal ingredients to the protein targets they affect, as derived from experimental results. Text mining was first applied to PubMed abstracts to collect the related literature. Careful curation was then carried out to retrieve the desired information, such as protein target name, action mode, experimental condition and other useful details. Because target information on direct physical interactions for single herbal ingredients is still too limited to provide clues to potential mechanisms, indirect targets are collected as well as a valuable complement.

THE DATABASE

HIT is currently hosted at http://lifecenter.sgst.cn/hit/. It contains three groups of data fields (Table 1), namely compound information, herb information and protein target information. The compound information was generated from Chemical Abstracts Service, PubChem and the ‘Dictionary of Natural Products’ (12). TCM-ID (13), a well-established integrated TCM resource, and the book ‘Traditional Chinese Medicines: Molecular Structures, Natural Sources & Applications’ (14) were used to derive herb information. Considering that more rigorous methods have been applied in recent years to detect target–compound interactions, protein targets were curated from PubMed abstracts published within the last 10 years (2000–2010). The biological annotation for a direct protein target covers the detailed action mode of the herbal ingredient, such as activator, inhibitor, binder, agonist, antagonist, substrate or product, and simple target. Kinetic data such as IC50 and Kd/Ki were collected where available. In addition, the biological effect on indirect targets is indicated as ‘increase/decrease the level of expression/activity’ after treatment with a single herbal compound. The pathways related to the target proteins can be retrieved by following the links to KEGG (15). The links to TTD and DrugBank can also bridge western drugs and herbal molecules at the level of protein targets.
Table 1.

Data fields covered in the entry of HIT

| Compound information | Herb information | Target information |
| --- | --- | --- |
| Generic name | Latin name | Target type (direct/indirect) |
| Structure | Chinese pin yin | Target name |
| IUPAC name | Chinese character | Biological effect |
| Alias | Herb function | IC50 |
| Chemical formula | | Kd/Ki |
| CAS register number | | Experimental environment |
| PubChem CID links | | Key description |
| Code of compound class | | Molecular function |
| Hazard and toxicity | | Crosslinks |
| RTECS accession number | | Literature support |
| Other comments | | |
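
The three groups of fields in Table 1 can be pictured as a small record structure. The following is only an illustrative sketch: the class names, attribute names and the example values are hypothetical placeholders chosen here, not the actual HIT schema, and only a subset of the fields is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Compound:
    """Subset of the 'Compound information' fields (names are hypothetical)."""
    generic_name: str
    cas_number: Optional[str] = None
    pubchem_cid: Optional[int] = None
    aliases: List[str] = field(default_factory=list)


@dataclass
class Herb:
    """Subset of the 'Herb information' fields."""
    latin_name: str
    pinyin: Optional[str] = None


@dataclass
class TargetRecord:
    """Subset of the 'Target information' fields."""
    target_name: str
    target_type: str            # "direct" or "indirect"
    biological_effect: str
    ic50: Optional[str] = None  # kept as free text, since units vary by study
    pubmed_ids: List[int] = field(default_factory=list)


@dataclass
class HitEntry:
    """One entry linking a compound to its source herbs and protein targets."""
    entry_id: str
    compound: Compound
    herbs: List[Herb]
    targets: List[TargetRecord]

    def direct_targets(self) -> List[TargetRecord]:
        # Keep only the targets annotated as direct.
        return [t for t in self.targets if t.target_type == "direct"]


# Placeholder values only; these are not real HIT records.
entry = HitEntry(
    entry_id="EXAMPLE-0001",
    compound=Compound(generic_name="compound X"),
    herbs=[Herb(latin_name="Herba exemplaris")],
    targets=[
        TargetRecord("protein P", "direct", "inhibitor"),
        TargetRecord("protein Q", "indirect", "decrease the level of expression"),
    ],
)
print([t.target_name for t in entry.direct_targets()])
```

Separating direct from indirect targets at the record level mirrors the distinction the database draws in its ‘Target type’ field.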
The search interface and results pages are illustrated in Figure 1. HIT can be queried via keyword search or similarity search.
Figure 1.

Primary pages in HIT. (A) Screenshot of the interface for keyword search. HIT offers three search options, namely by compound, by herb or by protein target. Text search by keyword across the whole database is also provided. (B) Interface of compound similarity search as an example. (C) Interface of target similarity search via protein sequence. (D) Result page of ‘Keyword Search’ with ‘Compound: EGCG’. (E) Result page of ‘Compound Similarity Search’ with the structure of the compound: EGCG. (F) Result page of ‘Target Similarity Search’ with the sequence of the protein: Fyn kinase (P06241). (G) The further linkage page of the first entry ‘HIT000001’ in D. A variety of chemical information, herbal information and a brief protein description is available on this page, with crosslinks to NCBI PubChem and PubMed. (H) Screenshot showing the detailed information of the protein target: ‘Fyn Kinase’.

Keyword search can be made via herbal compound information [different names, CAS number, CID number, chemical formula, code of compound class, RTECS Accession Number (http://www.cdc.gov/niosh/rtecs/)], herb information (Latin name, Chinese name in pinyin or Chinese characters) or protein target information (various protein/gene names or IDs). Full-text search by keyword is provided as well. Similarity search is also available via compound structure or protein sequence. The compound structure can be uploaded as a MOL/SDF file or drawn manually with the built-in software MarvinSketch (http://www.chemaxon.com/marvin/). Target similarity search is enabled by the BLAST program via protein sequence.

METHODS

Herb ingredient names

Herb ingredient names are derived from a well established TCM knowledge database TCM-ID which covers 1102 reputable herbs and 9862 herb ingredients. These compound names were used to screen PubMed abstracts and only those abstracts containing the compound names were recorded.

Keywords library

Establishing a keyword library is critically important for retrieving the related literature. We randomly chose individual compounds and checked full-text review papers to establish this library. The 59 keywords listed in Table 2 are frequently used to describe the interaction between compounds and proteins. The keywords are divided into two types. Type A consists of nouns describing the interaction, while Type B consists of phrases describing a specific effect, such as ‘inhibit the activity of’ some protein.
Table 2.

Keyword library to describe the interaction between herbal ingredients and proteins

| Type | Interaction: Positive | Interaction: Negative | Interaction: General | Effect |
| --- | --- | --- | --- | --- |
| Type A | Agonist; activator | Antagonist; inhibitor | Bind; target; bound | |
| Type B | Activate; Augment; Ameliorate; Derepress; Elevate; Enhance; Hasten; Increase; Induce; Incitate; Initiate; Potentiate; Promote; Raise; Stimulate; Up-regulate | Abrogate; Abolish; Against; Attenuate; Antagonize; Block; Blunt; Down regulate; Decrease; Degrade; Diminish; Impair; Inhibit; Reduce; Repress; Suppress | Affect; Interact; Disturb; Regulate; Impact; Influence; Interfere; Modify; Modulate | Activity; Activation; Expression; Level; Pathway; Cleavage; Methylation; Phosphorylation; Severance; Glycosylation; Acetylation |

Text mining and curation

The recorded abstracts were then rescanned by text mining according to the rules below.
Rule 1: ‘Compound name’ AND ‘any word in type A’. For instance, the sentence ‘(-)-Epigallocatechin-3-gallate is a novel Hsp90 inhibitor’ matches the rule well.
Rule 2: ‘Compound name’ AND ‘any word in type B interaction’ AND ‘any word in type B effect’. For example, the sentence ‘procyanidin B2 directly inhibited membrane type-1 (MT1)-MMP activity’ is a perfect match.
All text-mined abstracts were then manually checked to retrieve useful information into HIT.
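
As a rough illustration of how Rule 1 and Rule 2 might be applied, the sketch below checks a single sentence against abbreviated keyword sets. It is not the authors' pipeline: the function names, the sentence-level granularity, the reduced keyword lists and the crude prefix matching (so that ‘inhibited’ counts as ‘inhibit’) are all assumptions made here for the example.

```python
import re

# Abbreviated keyword sets drawn from the keyword library; the full library has 59 entries.
TYPE_A = {"agonist", "activator", "antagonist", "inhibitor", "bind", "target", "bound"}
TYPE_B_INTERACTION = {"activate", "enhance", "increase", "induce", "inhibit",
                      "reduce", "suppress", "regulate", "modulate"}
TYPE_B_EFFECT = {"activity", "activation", "expression", "level", "pathway",
                 "phosphorylation"}


def _has_keyword(sentence, keywords):
    """True if any word in the sentence starts with one of the keywords.

    Prefix matching is a crude stand-in for stemming (an assumption of this sketch).
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    return any(w.startswith(k) for w in words for k in keywords)


def match_rules(sentence, compound):
    """Return the names of the rules ('rule1', 'rule2') that the sentence satisfies."""
    # Both rules require the compound name to be present.
    if compound.lower() not in sentence.lower():
        return []
    matched = []
    # Rule 1: compound name AND any Type A word.
    if _has_keyword(sentence, TYPE_A):
        matched.append("rule1")
    # Rule 2: compound name AND a Type B interaction word AND a Type B effect word.
    if _has_keyword(sentence, TYPE_B_INTERACTION) and _has_keyword(sentence, TYPE_B_EFFECT):
        matched.append("rule2")
    return matched


s1 = "(-)-Epigallocatechin-3-gallate is a novel Hsp90 inhibitor"
s2 = "procyanidin B2 directly inhibited membrane type-1 (MT1)-MMP activity"
print(match_rules(s1, "(-)-Epigallocatechin-3-gallate"))
print(match_rules(s2, "procyanidin B2"))
```

A filter of this kind only proposes candidate sentences; the manual check described in the text is still what decides whether an interaction is recorded.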

Compound similarity search

The similarity between two compounds is calculated from structural fingerprints generated by the Chemistry Development Kit (CDK, http://almost.cubic.uni-koeln.de/cdk/), using the Tanimoto coefficient (16). Given a compound A and a database compound B, the Tanimoto coefficient for binary vectors is defined as

$$T(A, B) = \frac{c}{a + b - c}$$

where a and b are the numbers of bits set on (‘1’ bits) in molecular fingerprints A and B, respectively, and c is the number of ‘1’ bits shared by A and B. This function only accounts for the ‘1’ bits; that is, bits that are set off are not taken into account in the similarity calculation. The Tanimoto coefficient is typically above 0.8 for similar compounds (17,18).
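
A minimal sketch of the Tanimoto calculation follows, with a and b the on-bit counts of the two fingerprints and c the number of shared on bits, so that the coefficient is c / (a + b - c). The fingerprints here are toy sets of bit positions invented for the example rather than CDK output, and the function name and the handling of two empty fingerprints are choices made for this sketch.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for two fingerprints given as sets of 'on' bit positions."""
    a = len(fp_a)         # number of '1' bits in fingerprint A
    b = len(fp_b)         # number of '1' bits in fingerprint B
    c = len(fp_a & fp_b)  # number of '1' bits shared by A and B
    if a + b - c == 0:
        return 0.0        # convention chosen here for two empty fingerprints
    return c / (a + b - c)


# Toy fingerprints (illustrative bit positions only).
fp_query = {1, 4, 7, 9, 12}
fp_db = {1, 4, 7, 9, 15}

score = tanimoto(fp_query, fp_db)  # c = 4, a = b = 5, so 4 / 6
print(round(score, 3))
```

Because only on bits enter the formula, two fingerprints that agree on many off bits but share few on bits still score low, which matches the remark that off bits are ignored.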

DISCUSSION AND FUTURE DEVELOPMENT

In summary, HIT is intended to be a primary resource that complements other drug–target databases by integrating information on medicinal herbs, herbal active compounds and their protein targets under different experimental conditions. Herbal ingredients are an important source for drug discovery: some are under intensive pharmacological research, while many others remain to be explored, and elucidating their molecular mechanisms is a major challenge. HIT may provide valuable support for the mechanistic study of herbal medicine, the discovery of new druggable molecules and the identification of potential therapeutic targets. However, the action mechanism of herbal medicine is typically characterized as ‘multiple ingredients and multiple targets’, which may differ from that of western drugs to a large extent. The actual biological effects can be much more complicated when different compounds are combined in one herb. Users should be aware that the biological function of a compound A does not always imply the same function for a herb X that contains A, because herb X often contains many other compounds. The global and collective effects of many compounds may differ from those of each single compound. Thus, it is advised that multiple-factor analysis and statistical methods be applied, coupled with corresponding experimental and clinical results, when efforts are made toward drug discovery. HIT is planned for further enlargement. We will continue collecting target information for more herbal active compounds. Disease conditions and a batch query function will be considered as well. In addition, HIT is free for academic use. The data can be downloaded upon individual request.

FUNDING

Ministry of Science and Technology, China (2008BAI64B02, 2009ZX10004-601 and 2010CB833601, partial); National Natural Science Foundation of China (30900832 and 30976611); Ministry of Education (NCET-08-0399); Shanghai Municipal Education Commission (08ZZ18); ‘Shu Guang’ project supported by Shanghai Municipal Education Commission and Shanghai Education Development Foundation (07SG22); Shanghai Baiyulan Funding (2010B127).

Conflict of interest statement. None declared.

REFERENCES

1.  Traditional Chinese medicine information database.

Authors:  J F Wang; H Zhou; L Y Han; X Chen; Y Z Chen; Z W Cao
Journal:  Clin Pharmacol Ther       Date:  2005-07       Impact factor: 6.875

2.  Phytochemical databases of Chinese herbal constituents and bioactive plant compounds with known target specificities.

Authors:  Thomas M Ehrman; David J Barlow; Peter J Hylands
Journal:  J Chem Inf Model       Date:  2007 Mar-Apr       Impact factor: 4.956

3.  Update of TTD: Therapeutic Target Database.

Authors:  Feng Zhu; BuCong Han; Pankaj Kumar; XiangHui Liu; XiaoHua Ma; Xiaona Wei; Lu Huang; YangFan Guo; LianYi Han; ChanJuan Zheng; YuZong Chen
Journal:  Nucleic Acids Res       Date:  2009-11-20       Impact factor: 16.971

4.  Genomes2Drugs: identifies target proteins and lead drugs from proteome data.

Authors:  David Toomey; Heinrich C Hoppe; Marian P Brennan; Kevin B Nolan; Anthony J Chubb
Journal:  PLoS One       Date:  2009-07-10       Impact factor: 3.240

5.  TarFisDock: a web server for identifying drug targets with docking approach.

Authors:  Honglin Li; Zhenting Gao; Ling Kang; Hailei Zhang; Kun Yang; Kunqian Yu; Xiaomin Luo; Weiliang Zhu; Kaixian Chen; Jianhua Shen; Xicheng Wang; Hualiang Jiang
Journal:  Nucleic Acids Res       Date:  2006-07-01       Impact factor: 16.971

6.  wwLigCSRre: a 3D ligand-based server for hit identification and optimization.

Authors:  O Sperandio; M Petitjean; P Tuffery
Journal:  Nucleic Acids Res       Date:  2009-05-08       Impact factor: 16.971

7.  T3DB: a comprehensively annotated database of common toxins and their targets.

Authors:  Emilia Lim; Allison Pon; Yannick Djoumbou; Craig Knox; Savita Shrivastava; An Chi Guo; Vanessa Neveu; David S Wishart
Journal:  Nucleic Acids Res       Date:  2009-11-06       Impact factor: 16.971

8.  STITCH 2: an interaction network database for small molecules and proteins.

Authors:  Michael Kuhn; Damian Szklarczyk; Andrea Franceschini; Monica Campillos; Christian von Mering; Lars Juhl Jensen; Andreas Beyer; Peer Bork
Journal:  Nucleic Acids Res       Date:  2009-11-06       Impact factor: 16.971

9.  DrugBank: a knowledgebase for drugs, drug actions and drug targets.

Authors:  David S Wishart; Craig Knox; An Chi Guo; Dean Cheng; Savita Shrivastava; Dan Tzur; Bijaya Gautam; Murtaza Hassanali
Journal:  Nucleic Acids Res       Date:  2007-11-29       Impact factor: 16.971

10.  KEGG for linking genomes to life and the environment.

Authors:  Minoru Kanehisa; Michihiro Araki; Susumu Goto; Masahiro Hattori; Mika Hirakawa; Masumi Itoh; Toshiaki Katayama; Shuichi Kawashima; Shujiro Okuda; Toshiaki Tokimatsu; Yoshihiro Yamanishi
Journal:  Nucleic Acids Res       Date:  2007-12-12       Impact factor: 16.971
