Yang Zhang, Simon Fong, Jinan Fiaidhi, Sabah Mohammed.
Abstract
This research describes a new design for a data stream mining system that can analyze medical data streams and make real-time predictions. The work is motivated by growing interest in combining software technology with medical functions to build applications for chronic disease prognosis and diagnosis, children's healthcare, diabetes diagnosis, and similar areas. Most existing software technologies are case-based data mining systems. They can analyze only finite, structured data sets; they worked well in their early years but can hardly meet today's medical requirements. In this paper, we describe a clinical support system based on data stream mining technology; the design takes into account the shortcomings of existing clinical support systems.
Year: 2012 PMID: 22851884 PMCID: PMC3407674 DOI: 10.1155/2012/580186
Source DB: PubMed Journal: J Biomed Biotechnol ISSN: 1110-7243
Existing clinical decision support systems.
| Name | Author/source | Based on |
|---|---|---|
| A decision tree for tuberculosis contact investigation | Gerald LB, Tang S, Bruce F et al., Am J Respir Crit Care Med 2002; 166: 1122–1127 | Traditional decision tree |
| Iliad | Developed by University of Utah School of Medicine's Department of Medical Informatics | Bayesian network |
| An artificial neural network ensemble to predict disposition and length of stay in children presenting with bronchiolitis | Walsh P, Cunningham P, Rothenberg SJ, O'Doherty S, Hoey H, Healy R. | Neural network |
| MYCIN | Developed at Stanford University by Dr. Edward Shortliffe in the 1970s | Rules |
| BioStream: a system architecture for real-time processing of physiological signals (data stream mining, focus on detection) | Amir Bar-Or, David Goddeau, Jennifer Healey, Leonidas Kontothanasis, Beth Logan, Alex Nelson, JM Van Thong | QRS detection on physiological data streams (the algorithm is not clearly described in the original paper) |
Defects of traditional implementations.
| Algorithm | Defect |
|---|---|
| Traditional decision tree | Can analyze only static, finite data sets; cannot handle data streams |
| Bayesian network | It is difficult to obtain the probability knowledge for possible diagnoses, and the approach is impractical for large, complex systems with multiple symptoms |
| Neural network | Training takes so long that users cannot use the system effectively |
| Rules | It is difficult for experts to express their knowledge as distinct rules, and many rules are needed for the system to be effective |
Figure 1. Logical structure of our design.
Figure 2. Classification by VFDT.
Figure 3. Leaf node structure.
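
Figure 2 refers to VFDT (Very Fast Decision Tree), which grows a tree incrementally and uses the Hoeffding bound, ε = sqrt(R² ln(1/δ) / (2n)), to decide when a leaf has seen enough examples to split. The following is a minimal Python sketch of that split test only. The function names, the choice of information gain as the split measure, and the example numbers are illustrative assumptions, not details taken from the paper.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean observed
    over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 n_classes: int, delta: float, n: int) -> bool:
    """Split a leaf when the best attribute beats the runner-up by more than
    the Hoeffding bound. Assumes information gain (in bits) as the measure,
    whose range is log2(number of classes)."""
    value_range = math.log2(n_classes)
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Hypothetical numbers: two classes, 1000 records seen at this leaf.
print(should_split(0.30, 0.15, n_classes=2, delta=1e-7, n=1000))
print(should_split(0.30, 0.25, n_classes=2, delta=1e-7, n=1000))
```

Because the bound shrinks as `n` grows, a leaf that cannot split yet may become splittable after more records arrive, which is what lets the tree be trained on a stream rather than on a finite data set.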
Mapping table example.
| Pointer | RID | Physical address |
|---|---|---|
| P1 | R1 | 00000C900000FFFF |
| P2 | R2 | 000B80000000FFFF |
| … | … | … |
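
The mapping table above can be read as a lookup from a pointer held in a leaf node (Figure 4) to a record ID and the physical address of that record. The sketch below shows one way this might look in Python. `MappingEntry`, `LeafNode`, and `resolve` are hypothetical names introduced for illustration, and the two entries simply repeat the rows shown in the table; the paper's actual storage layout is not specified here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class MappingEntry:
    rid: str               # record ID, e.g. "R1"
    physical_address: int  # storage address of the history record


@dataclass
class LeafNode:
    # Pointers to the history records that were classified into this leaf.
    pointers: List[str] = field(default_factory=list)


# The two rows shown in the mapping table example.
mapping_table: Dict[str, MappingEntry] = {
    "P1": MappingEntry("R1", 0x00000C900000FFFF),
    "P2": MappingEntry("R2", 0x000B80000000FFFF),
}


def resolve(leaf: LeafNode, table: Dict[str, MappingEntry]) -> List[MappingEntry]:
    """Follow each pointer in the leaf to its mapping entry, skipping unknown pointers."""
    return [table[p] for p in leaf.pointers if p in table]


leaf = LeafNode(pointers=["P1", "P2"])
for entry in resolve(leaf, mapping_table):
    print(entry.rid, f"{entry.physical_address:016X}")
```

Keeping only pointers in the leaf, rather than the records themselves, keeps the tree small while still letting a prediction reach the underlying history records.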
Figure 4. Pointer list in leaf node.
Figure 5. Training and searching process.
Figure 6. How mapping works in the initial training stage for VFDT.
Figure 7. Extract most frequently used treatment from similar history records.
Figure 8. Prediction process.
Figure 9. Feedback process.
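
The step named in Figure 7, extracting the most frequently used treatment from similar history records, can be sketched as a simple frequency count. The record layout (a dictionary with a `"treatment"` key), the function name, and the sample data are assumptions made for illustration; how the paper selects "similar" records is not reproduced here.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def most_frequent_treatment(records: Iterable[Dict[str, str]]) -> Optional[str]:
    """Return the treatment that occurs most often among the given history
    records, or None when there are no records."""
    counts = Counter(record["treatment"] for record in records)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical history records that reached the same leaf node.
similar_records = [
    {"rid": "R1", "treatment": "treatment A"},
    {"rid": "R2", "treatment": "treatment B"},
    {"rid": "R3", "treatment": "treatment A"},
]
print(most_frequent_treatment(similar_records))
```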
Comparison between IBM's system and our design.
| | IBM | Our design |
|---|---|---|
| Need offline analysis | Yes | No |
| System resources | Offline analysis (LSML) must compute distances between matrices, which consumes substantial resources | No complex calculations are needed; the most expensive operation is updating the VFDT |
| Need training | No, but the database must be analyzed before formal use (to cluster its records) | Yes, initial training is needed before formal use |