Feichen Shen, Hongfang Liu, Sunghwan Sohn, David W Larson, Yugyung Lee.
Abstract
In the current biomedical data movement, numerous efforts have been made to convert and normalize large volumes of traditional structured and unstructured data (e.g., EHRs, reports) into semi-structured data (e.g., RDF, OWL). As the amount of semi-structured data entering the biomedical community grows, data integration and knowledge discovery across heterogeneous domains have become important research problems. At the application level, detecting related concepts among medical ontologies is an important goal of life science research. It is even more crucial to determine how different concepts are related within a single ontology or across multiple ontologies by analysing the predicates in different knowledge bases. However, given today's information explosion, it is extremely difficult for biomedical researchers to find existing or potential predicates that link cross-domain concepts without support from schema pattern analysis. There is therefore a need for a mechanism that performs predicate-oriented pattern analysis to partition heterogeneous ontologies into smaller, more closely related topics and generates queries to discover cross-domain knowledge from each topic. In this paper, we present such a model: it performs predicate-oriented pattern analysis based on the close relationships among predicates and generates a similarity matrix. Based on this similarity matrix, we apply a novel unsupervised learning algorithm to partition large data sets into smaller, more closely related topics and generate meaningful queries to fully discover knowledge over a set of interlinked data sources. We have implemented a prototype system named BmQGen and evaluated the proposed model with a colorectal surgical cohort from the Mayo Clinic.
Keywords: Biomedical Knowledge Discovery; Pattern Analysis; Predicate; Query Generation
Year: 2016 PMID: 28983419 PMCID: PMC5626454 DOI: 10.4236/iim.2016.83006
Source DB: PubMed Journal: Intell Inf Manag ISSN: 2160-5912
Figure 1. The BmQGen framework.
Figure 2. The BmQGen dataflow.
Mapping among Clinical Free Text, MedTagger terms and RDF triples.
| # | Clinical Free Text | MedTagger Terms | RDF Triples |
|---|---|---|---|
| 1 | Patient’s abdominal wound was exacerbated by dressing changes | Abdominal wound exacerbated by dressing change | {abdominal_wound, exacerbated_by, dressing_change} |
| 2 | Any problems including increased erythema around the wound | Problems including erythema around wound | {problems, erythema, wound} |
| 3 | The residual urine levels drop below certain level | Urine drop below level | {urine, drop, below_level} |
| 4 | There is substantial further elevation in patient’s troponins | Place has further elevation in troponins | {place, elevation, troponins} |
| 5 | More hypotension requiring initiation of pressor, to achieve satisfactory blood pressure | Hypotension requiring blood | {hypotension, requiring, blood} |
Predicate sharing patterns.
| Patterns | Sharing with S and O through P |
|---|---|
| Subject-Object Share | Si = Sj && Oi = Oj |
| Subject Share | Si = Sj |
| Object Share | Oi = Oj |
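The sharing conditions in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Triple` type, the `sharing_pattern` function, and the example triples `t2` and `t3` are hypothetical and do not come from BmQGen. The table also does not say whether the two predicates must differ, so the sketch does not check that.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """An RDF-style statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


def sharing_pattern(a: Triple, b: Triple) -> Optional[str]:
    """Classify how the predicates of two triples share a subject and/or object."""
    same_subject = a.subject == b.subject
    same_object = a.obj == b.obj
    if same_subject and same_object:
        return "Subject-Object Share"
    if same_subject:
        return "Subject Share"
    if same_object:
        return "Object Share"
    return None  # the two predicates share neither endpoint


# t1 is taken from the mapping table; t2 and t3 are invented for illustration.
t1 = Triple("abdominal_wound", "exacerbated_by", "dressing_change")
t2 = Triple("abdominal_wound", "treated_with", "dressing_change")
t3 = Triple("abdominal_wound", "located_in", "abdomen")

print(sharing_pattern(t1, t2))  # Subject-Object Share
print(sharing_pattern(t1, t3))  # Subject Share
```

Checking the combined condition first matters, because a Subject-Object Share also satisfies both of the weaker conditions.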
Predicate connectivity patterns (connecting between S and O through P).
| Patterns | Symbol | Condition |
|---|---|---|
| Path Connectivity | Si → P1 → Oi → P2 → Oj | P1 ≠ P2 && Oi = Sj |
| Cycle Connectivity | Si → P1 → Oi → P2 → Oj | P1 ≠ P2 && Oi = Sj && Si = Oj |
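The two connectivity conditions can be checked in the same way. The sketch below is a minimal illustration, assuming each statement is a (subject, predicate, object) triple; the `Triple` type, the `connectivity_pattern` function, and all example triples are hypothetical.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """An RDF-style statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


def connectivity_pattern(a: Triple, b: Triple) -> Optional[str]:
    """Classify the connection a -> b, where a = (Si, P1, Oi) and b = (Sj, P2, Oj)."""
    # Both patterns require distinct predicates and Oi = Sj.
    if a.predicate == b.predicate or a.obj != b.subject:
        return None
    # A cycle additionally requires Si = Oj, i.e. the chain returns to its start.
    if a.subject == b.obj:
        return "Cycle Connectivity"
    return "Path Connectivity"


# Invented triples for illustration only.
a = Triple("hypotension", "requiring", "pressor")
b = Triple("pressor", "achieves", "blood_pressure")
c = Triple("pressor", "treats", "hypotension")

print(connectivity_pattern(a, b))  # Path Connectivity
print(connectivity_pattern(a, c))  # Cycle Connectivity
```

Unlike the sharing patterns, connectivity is directional: `connectivity_pattern(b, a)` returns `None`, because the object of `b` is not the subject of `a`.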
Figure 3. Predicate neighboring level and weighted similarity.
Hierarchical fuzzy C-Means clustering (HFCM).
[Algorithm listing garbled in the source: the expressions in its 31 numbered steps were lost in extraction. Surviving fragments: an "Input:" and an "Output:" line, with a comment describing "a hierarchy with a set of clusters"; a step that finds an optimal value (step 6); Boolean flags Change1 (steps 7 and 9) and Change2 (steps 16 and 27); a call to Update Cluster (step 20); and a step that computes the silhouette width (step 29).]
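The listing above builds on fuzzy C-Means and includes a silhouette-width step. As a rough illustration only, the sketch below shows the standard (non-hierarchical) fuzzy C-Means membership update and the usual per-point silhouette formula. It is not the authors' HFCM procedure, and the function names and the fuzzifier default `m = 2` are assumptions.

```python
from typing import List


def fcm_memberships(distances: List[List[float]], m: float = 2.0) -> List[List[float]]:
    """Standard fuzzy C-Means membership update.

    distances[i][j] is the distance from point i to cluster center j.
    Returns u[i][j] = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for row in distances:
        zero_hits = [d == 0 for d in row]
        if any(zero_hits):
            # The point coincides with one or more centers: split membership among them.
            n = sum(zero_hits)
            memberships.append([1.0 / n if hit else 0.0 for hit in zero_hits])
            continue
        inverse = [d ** -exponent for d in row]
        total = sum(inverse)
        memberships.append([v / total for v in inverse])
    return memberships


def silhouette_width(a: float, b: float) -> float:
    """Silhouette of one point: a = mean intra-cluster distance,
    b = mean distance to the nearest other cluster."""
    if a == 0 and b == 0:
        return 0.0
    return (b - a) / max(a, b)


u = fcm_memberships([[1.0, 3.0]])  # one point, two centers
print(u)
print(silhouette_width(1.0, 3.0))
```

Each row of memberships sums to 1, so a point can belong to several clusters to different degrees. A silhouette width near 1 indicates a well-separated assignment, which makes it a common criterion for choosing the number of clusters.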
Figure 4. Clustering predicate graph for query generation.
Definition of colorectal postsurgical complication.
| Postsurgical Complication | Description |
|---|---|
| Abscess/Leak (ABSCESS) | An abscess is a painful collection of pus, usually caused by a bacterial infection. Coloanal anastomoses have the highest rates. |
| Bleeding (BLEED) | Minor and major bleeding are common anastomotic complications. Epinephrine and saline retention enemas are used to manage serious bleeding. Surgical intervention is necessary if the situation worsens. |
| Deep vein thrombosis (DVT)/pulmonary embolism (PE) (DVTPE) | DVT is a condition in which a blood clot forms in a vein of the deep system. A piece of the clot can break off and travel to the lung, which is known as PE and can cause heart failure. |
| Ileus (ILEUS) | Ileus is defined as bowel obstruction. For small bowel obstruction, 90–100% sensitivity can be achieved by a CT scan of the abdomen and pelvis. |
| Myocardial infarction (MI) | Myocardial infarction is commonly known as a heart attack. It occurs during surgery or within 30 days after surgery. |
| Wound infection (INFECTION) | Wound infections commonly present around the fifth postsurgical day, and 5–15% of patients develop this complication after colorectal surgery. |
Figure 5. Visualization of 6 complication ontologies.
Colorectal surgical cohort.
| Complication | # of Subjects | # of Predicates | # of Objects | # of Unique Triples |
|---|---|---|---|---|
| ABSCESS | 63 | 13 | 89 | 220 |
| BLEED | 58 | 13 | 73 | 142 |
| DVTPE | 19 | 10 | 26 | 32 |
| ILEUS | 227 | 21 | 204 | 624 |
| MI | 52 | 12 | 53 | 132 |
| INFECTION | 26 | 14 | 37 | 60 |
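Counts like those in the cohort table can be derived from a list of triples. The sketch below assumes that each column counts distinct values, which the source does not state explicitly; `cohort_stats` is a hypothetical helper. The example triples are the five from the mapping table, with one deliberately duplicated.

```python
from typing import Dict, Iterable, Tuple

TripleT = Tuple[str, str, str]  # (subject, predicate, object)


def cohort_stats(triples: Iterable[TripleT]) -> Dict[str, int]:
    """Count distinct subjects, predicates, objects, and triples."""
    unique = set(triples)  # duplicates collapse here
    return {
        "subjects": len({s for s, _, _ in unique}),
        "predicates": len({p for _, p, _ in unique}),
        "objects": len({o for _, _, o in unique}),
        "unique_triples": len(unique),
    }


triples = [
    ("abdominal_wound", "exacerbated_by", "dressing_change"),
    ("problems", "erythema", "wound"),
    ("urine", "drop", "below_level"),
    ("place", "elevation", "troponins"),
    ("hypotension", "requiring", "blood"),
    ("hypotension", "requiring", "blood"),  # duplicate, counted once
]

stats = cohort_stats(triples)
print(stats)
```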
Figure 6. Visualization of hierarchical fuzzy C-Means clustering.
Figure 7. Detailed information for Topics 1–4.
Figure 8. Detailed information for Topics 5–8.
Figure 9. Predicate oriented clustering decision making on different levels.
Figure 10. Correlation matrices for gold standard.
Figure 11. Cross complications queries.
Figure 12. Single complication queries.