Ziqi Chen, Bo Peng, Vassilis N Ioannidis, Mufei Li, George Karypis, Xia Ning.
Abstract
Effective and successful clinical trials are essential for developing new drugs and advancing new treatments. However, clinical trials are very expensive and prone to failure. The high cost and low success rate of clinical trials motivate research on inferring knowledge from existing clinical trials in innovative ways to inform the design of future clinical trials. In this manuscript, we present our efforts to construct the first publicly available Clinical Trials Knowledge Graph, denoted as CTKG. CTKG includes nodes representing medical entities in clinical trials (e.g., studies, drugs and conditions), and edges representing the relations among these entities (e.g., drugs used in studies). Our embedding analysis demonstrates the potential utility of CTKG in various applications such as drug repurposing and similarity search, among others.
Year: 2022 PMID: 35304504 PMCID: PMC8933553 DOI: 10.1038/s41598-022-08454-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Schema of CTKG.
Statistics of node types in CTKG.
| Node type | Is study specific? | #Nodes |
|---|---|---|
| Study | Yes | 8210 |
| Condition | No | 1394 |
| Drug term | No | 2548 |
| Event group | Yes | 22,725 |
| Adverse event | No | 18,546 |
| Organ | No | 27 |
| Baseline group | Yes | 27,068 |
| Baseline record | Yes | 315,533 |
| Drop group | Yes | 22,272 |
| Period | Yes | 34,330 |
| Drop record | Yes | 123,627 |
| Outcome group | Yes | 32,499 |
| Method | No | 907 |
| Outcome measurement | Yes | 690,626 |
| Outcome analysis | Yes | 107,294 |
| Outcome | Yes | 88,386 |
| Standard outcome | No | 492 |
| Cluster outcome | No | 200 |
Statistics of relation types in CTKG.
| Relation type | Node type 1 | #Node 1 | Node type 2 | #Node 2 | #Relations |
|---|---|---|---|---|---|
| Study-Condition | Study | 8210 | Condition | 1394 | 17,259 |
| Study-EventGroup | Study | 8172 | Event group | 22,725 | 22,725 |
| Study-BaselineGroup | Study | 8209 | Baseline group | 27,068 | 27,068 |
| Study-DropGroup | Study | 8210 | Drop group | 22,272 | 22,272 |
| Study-OutcomeGroup | Study | 8210 | Outcome group | 32,499 | 32,499 |
| Study-Outcome | Study | 8210 | Outcome | 88,386 | 88,386 |
| Study-StudiedDrug | Study | 8169 | Drug term | 2373 | 20,982 |
| Study-UsedDrug | Study | 2234 | Drug term | 920 | 3992 |
| Drug-EventGroup | Drug term | 2201 | Event group | 21,790 | 31,528 |
| EventGroup-AdverseEvent | Event group | 20,571 | Adverse event | 18,546 | 966,450 |
| AdverseEvent-Organ | Adverse event | 18,546 | Organ | 27 | 18,546 |
| BaselineGroup-BaselineRecord | Baseline group | 27,068 | Baseline record | 315,533 | 315,533 |
| DropGroup-Period | Drop group | 22,272 | Period | 34,330 | 34,330 |
| Period-DropRecord | Period | 25,956 | Drop record | 123,627 | 123,627 |
| OutcomeGroup-OutcomeMeasurement | Outcome group | 32,240 | Outcome measurement | 690,541 | 690,541 |
| OutcomeGroup-OutcomeAnalysis | Outcome group | 23,923 | Outcome analysis | 107,294 | 209,314 |
| OutcomeAnalysis-Method | Outcome analysis | 91,463 | Method | 907 | 91,463 |
| Outcome-OutcomeAnalysis | Outcome | 45,689 | Outcome analysis | 107,294 | 107,294 |
| Outcome-OutcomeMeasurement | Outcome | 85,905 | Outcome measurement | 690,626 | 690,626 |
| Outcome-ClusterOutcome | Outcome | 88,244 | Cluster outcome | 200 | 88,244 |
| Outcome-StandardOutcome | Outcome | 50,342 | Standard outcome | 492 | 58,819 |
Columns represent: “Relation type”: the type of relation; “Node type 1”: the type of head nodes in the relations; “#Node 1”: the number of unique head nodes with the relations; “Node type 2”: the type of tail nodes in the relations; “#Node 2”: the number of unique tail nodes with the relations; “#Relations”: the number of relations of a relation type.
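The column definitions above can be illustrated with a minimal sketch that derives the three counts from a list of (head, relation, tail) triples. The function name `relation_statistics`, the triple layout, and the toy identifiers are assumptions made for illustration; they are not the format in which the graph is distributed.

```python
from collections import defaultdict


def relation_statistics(triples):
    """Summarize (head, relation, tail) triples per relation type.

    Returns {relation: (unique_heads, unique_tails, n_relations)},
    mirroring the "#Node 1", "#Node 2" and "#Relations" columns.
    """
    heads = defaultdict(set)
    tails = defaultdict(set)
    counts = defaultdict(int)
    for head, relation, tail in triples:
        heads[relation].add(head)   # unique head nodes taking part in the relation
        tails[relation].add(tail)   # unique tail nodes taking part in the relation
        counts[relation] += 1       # every triple counts as one relation
    return {r: (len(heads[r]), len(tails[r]), counts[r]) for r in counts}


# Toy triples; the identifiers are hypothetical.
toy_triples = [
    ("study:A", "Study-Condition", "condition:X"),
    ("study:A", "Study-Condition", "condition:Y"),
    ("study:B", "Study-Condition", "condition:X"),
    ("study:A", "Study-UsedDrug", "drug:D1"),
]
stats = relation_statistics(toy_triples)
```

Because a node is counted only if it appears in at least one relation of the given type, "#Node 1" can be smaller than the total number of nodes of that type, as with the 8172 studies in Study-EventGroup versus 8210 studies overall.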
Similar condition nodes and drug-term nodes for drug repurposing.
| Similarity | Condition node | Drug-term node | Possible evidence |
|---|---|---|---|
| 0.597 | Diabetes Mellitus, Type 2 | Benzoates | Alogliptin benzoate, an agent of Benzoates, is now available for the treatment of type 2 diabetes |
| 0.587 | Diabetes Mellitus, Type 2 | Pulmonary Surfactants | Pulmonary surfactant is involved in the delay of fetal lung biochemical maturation caused by maternal diabetes |
| 0.576 | Diabetes Mellitus | Pulmonary Surfactants | Pulmonary surfactant is involved in the delay of fetal lung biochemical maturation caused by maternal diabetes |
| 0.574 | Lung Neoplasms | Triterpenes | Representatives of triterpenes show anti-cancer properties against multiple types of cancer, including lung cancer |
| 0.562 | Lung Neoplasms | Pregnenediones | Pregnenediones show promising activity against lung cancer cell lines |
The average cosine similarity between condition nodes and drug-term nodes is -0.032.
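As a rough illustration of how such a ranking and its baseline average could be computed, the sketch below scores every condition/drug-term pair with cosine similarity, cos(u, v) = u·v / (‖u‖‖v‖), sorts the pairs, and averages the scores. The embeddings are toy two-dimensional vectors, and the names `cosine_similarity` and `rank_cross_type_pairs` are hypothetical; the sketch assumes non-zero vectors and makes no claim about how the reported embeddings were trained.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_cross_type_pairs(conditions, drugs):
    """Score every (condition, drug) pair; return ranked pairs and the mean score."""
    pairs = [
        (cosine_similarity(c_vec, d_vec), c_name, d_name)
        for c_name, c_vec in conditions.items()
        for d_name, d_vec in drugs.items()
    ]
    pairs.sort(key=lambda p: p[0], reverse=True)  # most similar pair first
    average = sum(score for score, _, _ in pairs) / len(pairs)
    return pairs, average


# Toy embeddings; names and values are made up for illustration.
conditions = {"cond_a": [1.0, 0.0], "cond_b": [0.0, 1.0]}
drugs = {"drug_x": [1.0, 0.0], "drug_y": [-1.0, 0.0]}
ranked, average = rank_cross_type_pairs(conditions, drugs)
```

Reading the top pairs against the average is what makes a score such as 0.597 notable: it sits far above a cross-type baseline that is close to zero.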
Similar study nodes.
| Study | Similarity | Similar study | Possible evidence |
|---|---|---|---|
| NCT00795769 | 0.840 | NCT01789255 | NCT00918333 and NCT00720109 investigate therapies for conditions that could be treated by stem cell transplant (e.g., Lymphoma). All the other studies are on preventing side effects following stem cell transplant |
| | 0.741 | NCT00918333 | |
| | 0.713 | NCT00105001 | |
| | 0.672 | NCT00293384 | |
| | 0.629 | NCT00720109 | |
| NCT01431274 | 0.826 | NCT01431287 | All the studies investigate therapies for Chronic Obstructive Pulmonary Disease (COPD) |
| | 0.737 | NCT01559116 | |
| | 0.721 | NCT02796651 | |
| | 0.716 | NCT00782509 | |
| | 0.709 | NCT00931385 | |
| NCT00137111 | 0.825 | NCT00866307 | All the studies investigate therapies for different sub-types of Leukemia (e.g., Acute Lymphoblastic Leukemia, Acute Myeloid Leukemia) |
| | 0.748 | NCT00720109 | |
| | 0.747 | NCT00136084 | |
| | 0.744 | NCT00808639 | |
| | 0.724 | NCT00119262 | |
| NCT00782509 | 0.784 | NCT00796653 | All the studies investigate the safety and efficacy of BI 1744 CL in patients with COPD |
| | 0.749 | NCT00793624 | |
| | 0.735 | NCT01040793 | |
| | 0.724 | NCT01040130 | |
| | 0.700 | NCT00782210 | |
| NCT02105688 | 0.801 | NCT02252016 | All the studies investigate therapies for Chronic Hepatitis C Virus (HCV) infection |
| | 0.688 | NCT02105467 | |
| | 0.662 | NCT02358044 | |
| | 0.655 | NCT01544920 | |
| | 0.652 | NCT02216422 | |
The average cosine similarity among Study nodes is 0.301.
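A per-query list like the one in the table can be produced by a simple nearest-neighbour lookup over node embeddings of a single type. The following is a minimal sketch under stated assumptions: the helper names `normalize` and `top_k_similar` are hypothetical, the vectors are toy values, and a brute-force scan stands in for whatever search procedure was actually used.

```python
import heapq
import math


def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k_similar(query, embeddings, k=5):
    """Return the k nodes most similar to `query` as (similarity, name) tuples."""
    unit = {name: normalize(vec) for name, vec in embeddings.items()}
    q = unit[query]
    # For unit vectors the dot product equals the cosine similarity.
    scored = (
        (sum(a * b for a, b in zip(q, vec)), name)
        for name, vec in unit.items()
        if name != query  # a node is trivially most similar to itself
    )
    return heapq.nlargest(k, scored)


# Toy study embeddings; identifiers and values are made up for illustration.
study_embeddings = {
    "s1": [1.0, 0.0],
    "s2": [2.0, 0.2],
    "s3": [0.0, 1.0],
    "s4": [-1.0, 0.0],
}
neighbours = top_k_similar("s1", study_embeddings, k=2)
```

Normalizing once up front keeps each comparison to a single dot product, which matters when the same query is compared against thousands of study nodes.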
Top-10 most similar condition nodes.
| Similarity | Condition node 1 | Condition node 2 | Possible evidence |
|---|---|---|---|
| 0.997 | Nephritis | Lupus Nephritis | Lupus Nephritis is a common sub-type of Nephritis |
| 0.997 | Hepatitis | Hepatitis A | Hepatitis A is a special sub-type of Hepatitis |
| 0.996 | Rhinitis | Rhinitis, Allergic | Rhinitis, Allergic is a sub-type of Rhinitis caused by allergy |
| 0.996 | Urinary Bladder Disease | Urologic Disease | Urinary Bladder Disease is a special sub-type of Urologic Disease |
| 0.996 | Arthritis | Arthritis, Rheumatoid | Arthritis, Rheumatoid is a chronic inflammatory Arthritis |
| 0.995 | Neovascularization, Pathologic | Choroidal Neovascularization | Both of the conditions are sub-types of Neovascularization |
| 0.995 | Diabetes Mellitus | Diabetes Mellitus, Type 2 | Diabetes Mellitus, Type 2 is a common sub-type of Diabetes Mellitus |
| 0.994 | Alopecia | Alopecia Areata | Alopecia Areata is a sub-type of Alopecia |
| 0.994 | Depression | Depressive Disorder | Depression is also known as major Depressive Disorder in clinics |
| 0.993 | Keratosis | Keratosis, Actinic | Keratosis, Actinic is a sub-type of Keratosis |
The average cosine similarity among condition nodes is 0.311.
Top-10 most similar drug-term nodes.
| Similarity | Drug-term node 1 | Drug-term node 2 | Possible evidence |
|---|---|---|---|
| 0.997 | ABT-267 | Macrocyclic Compounds | Both ABT-267 and Macrocyclic Compounds could be used to treat Hepatitis C Virus (HCV) infection |
| 0.996 | Pulmonary Surfactants | Surface-Active Agents | Pulmonary Surfactants are a type of Surface-Active Agents |
| 0.995 | Phenylethyl Alcohol | LY2216684 | Phenylethyl Alcohol and LY2216684 are studied together in studies NCT00922636, NCT01243957 and NCT01380691 |
| 0.994 | Thioguanine | Mercaptopurine | Thioguanine is a substitute for Mercaptopurine in treating childhood lymphoblastic leukaemia |
| 0.993 | Cilastatin | Imipenem | Cilastatin and Imipenem are commonly used together as a treatment for serious infections |
| 0.985 | Metylperon | Butyrophenones | Metylperon is an atypical antipsychotic of the Butyrophenone chemical class |
| 0.983 | Ubiquinone | Coenzyme Q10 | Ubiquinone is a form of Coenzyme Q10 |
| 0.982 | PHiD-CV Vaccine | VAXELIS | Both of the drug terms are vaccines for diphtheria |
| 0.982 | Propafenone | Sotalol | Both Propafenone and Sotalol could maintain sinus rhythm in patients with recurrent symptomatic atrial fibrillation |
| 0.980 | SNAP25 Protein | Acetylcholine | SNAP25 Protein could block the release of Acetylcholine at the neuromuscular junction |
The average cosine similarity among drug-term nodes is 0.254.
Top-10 most similar adverse-event nodes.
| Similarity | Adverse-event node 1 | Adverse-event node 2 | Possible evidence |
|---|---|---|---|
| 0.998 | Blood Luteinising Hormone Increased | Uterus Myomatosus | Luteinising Hormone (LH) can affect the growth of Uterus Myomatosus by controlling the level of estrogen |
| 0.997 | Inpatient Hospitalization | Ulceration | Excess length of inpatient hospitalization can lead to ulceration |
| 0.997 | Major Bleeding Event | Infection with Unknown Anc, Catheter-Related | Patients receiving hemodialysis are at risk for major bleeding events and catheter-related infection |
| 0.996 | Blood Luteinising Hormone Increased | Major Bleeding Event | The level of LH is related to uterine bleeding |
| 0.995 | Blood Luteinising Hormone Increased | Skin Procedural Complication | LH may regulate skin functions via LH receptors on skin |
| 0.995 | Skin Procedural Complication | Uterus Myomatosus | Both are similar to the |
| 0.995 | Infection with Unknown Anc, Catheter-Related | Prostatic Obstruction | Patients with prostatic obstruction often receive urinary catheters, and are at risk for catheter-related infection |
| 0.994 | Gi Tract Perforation | Latent Autoimmune Diabetes in Adults | Diabetes can induce Gi Tract Perforation |
| 0.994 | Cervix Carcinoma Stage III | Vanishing Twin Syndrome | Both of the adverse events are related to the uterus |
| 0.994 | Major Bleeding Event | Uterus Myomatosus | Uterus Myomatosus can be associated with major bleeding events |
The average cosine similarity among adverse-event nodes is 0.329.
Top-10 most similar standard-outcome nodes.
| Similarity | Standard-outcome node 1 | Standard-outcome node 2 | Possible evidence |
|---|---|---|---|
| 0.986 | Aspartate Aminotransferase | Alanine Aminotransferase | Both are enzymes that are tested to check for liver damage |
| 0.955 | Swollen Joint Count | Tender Joint Count | Both are used to assess patients with rheumatoid arthritis |
| 0.952 | Calcium | Potassium | Both are electrolytes that can be tested to monitor a range of medical conditions |
| 0.952 | Incomplete Response | Partial Response | Both are used to assess the response to treatment |
| 0.946 | Aspartate Aminotransferase | Blood Urea Nitrogen | Both can be tested to check for liver damage |
| 0.941 | Potassium | Blood Urea Nitrogen | Both are included in the basic metabolic panel blood test |
| 0.940 | Calcium | Blood Urea Nitrogen | Both are included in the basic metabolic panel blood test |
| 0.930 | Alanine Aminotransferase | Blood Urea Nitrogen | Both can be tested to check for kidney damage |
| 0.930 | Hemoglobin A1c | Hemoglobin | Hemoglobin A1c represents the hemoglobin in the blood that has glucose attached to it |
| 0.923 | Erythrocyte Sedimentation Rate | Disease Activity Score 28 | Disease Activity Score 28 can be calculated based on Erythrocyte Sedimentation Rate |
The average cosine similarity among standard-outcome nodes is 0.315.
Figure 2. Flow chart of CTKG construction.
AACT tables included and not included in CTKG.
| Included in CTKG | Not included in CTKG |
|---|---|
| Baseline counts | Brief summaries |
| Baseline measurements | Calculated values |
| Browse conditions | Central contacts |
| Browse intervention | Countries |
| Conditions | Design group interventions |
| Designs | Design groups |
| Drop withdrawals | Design outcomes |
| Eligibilities | Detailed descriptions |
| Interventions | Documents |
| Id information | Facilities |
| Milestones | Facilities contacts |
| Outcome measure | Facility investigator |
| Outcome analyses | Intervention other names |
| Outcome analysis groups | Ipd information types |
| Outcomes | Keywords |
| Participant flows | Links |
| Reported events | Overall officials |
| Result groups | Pending results |
| Studies | Provided documents |
| Study references | Responsible parties |
| | Result agreements |
| | Result contacts |
| | Sponsors |