Morihito Takita, Yuji Tanaka, Yuko Kodama, Naoko Murashige, Nobuyo Hatanaka, Yukiko Kishi, Tomoko Matsumura, Yukio Ohsawa, Masahiro Kami.
Abstract
BACKGROUND: Allogeneic hematopoietic stem cell transplantation is a curative treatment for patients with advanced hematologic malignancies. However, the long-term mental health issues of siblings who were not selected as donors for the transplantation (non-donor siblings, NDSs) have not been well assessed. Data mining is useful for discovering new findings in a large, multidisciplinary data set, and Scenario Map analysis is a novel approach that extracts keywords linking different conditions or events from interview text data, even when those keywords appear infrequently. The aim of this study is to assess mental health issues in NDSs and to find keywords helpful for clinical follow-up using Scenario Map analysis.
Year: 2011 PMID: 21884635 PMCID: PMC3164612 DOI: 10.1186/2043-9113-1-19
Source DB: PubMed Journal: J Clin Bioinforma ISSN: 2043-9113
Conceptual differences of data mining approach.
| Research area | Electronic medical record | Genomics/Proteomics | This study: Mental health on NDS |
|---|---|---|---|
| Data source | Physicians'/nurses' descriptions, laboratory data and radiologic images in the medical record | Gene expression data from cDNA microarray/mass spectrometry | Interview with the subject |
| Expected results | Automatic and effective data extraction/sorting | Extraction of genes/proteins with statistical significance | Extraction of important and rarely appearing words |
| Concept* | Supervised/Unsupervised approach | Supervised/Unsupervised approach | Unsupervised approach |
| Representative algorithm of data mining technique | Data extraction matching prepared data criteria | To provide statistically meaningful analysis of high-throughput and multi-dimensional biological data in association with phenotype | To discover unanticipated, rarely appearing key elements by Scenario Map analysis |
| Aims | Linking between medical record description and research issues | To discover new biomarker or diagnostic method | For better clinical follow-up by understanding unanticipated individual concerns |
Conceptual differences of the data mining approach in representative medical research areas are shown. *A supervised approach aims at testing or validating a hypothesis, while an unsupervised approach is used for discovering unanticipated events or knowledge.
Figure 1. A working flow. The subject was interviewed using open-ended questions, and text data of the interview were generated. A KeyGraph was created and tuned by an information engineer in discussion with healthcare professionals. The final KeyGraph was interpreted in detail by healthcare professionals, who provided feedback to the subject. Scenario Map analysis includes an interactive framework between an information engineer, who produces the computer outputs, and healthcare professionals to obtain a comprehensive graph.
The list of words in frequency and co-occurrence order.
| Cluster | Word | Frequency |
|---|---|---|
| Pre-transplant | Sibling | 10 |
| | The most | 9 |
| | Next | 8 |
| | Place D* | 8 |
| | Doctor A* | 7 |
| | Word | 6 |
| | Results | 6 |
| Emotion | Child | 126 |
| | Mind | 15 |
| | Person A* | 11 |
| | Suffering | 10 |
| | Paralysis | 7 |
| | Absolute | 6 |
| Transplantation process | Place G* | 12 |
| | Telephone | 10 |
| | Doctor B* | 7 |
| Subject's life | Elder sister | 16 |
| | Leukemia | 9 |
| | Nursing | 8 |
| | University | 7 |
| Other** | Younger sister | 50 |
| | Myself | 48 |
| | Bone marrow | 46 |
| | Father | 44 |
| | Transplant | 43 |
| | Mother | 42 |
| | Previous | 24 |
| | Patient | 23 |
| | Kid | 21 |
| | Place A* | 21 |
| | Bank | 18 |
| | Place B* | 16 |
| | Donor | 15 |
| | Hospital | 15 |
| | Blastic crisis | 14 |
| | Mom | 12 |
| | Family | 10 |
| | HLA | 10 |
| | Home | 10 |
| | Together | 7 |
| | Book | 6 |
Words appearing more than 6 times in the interview were defined as high-frequency in this study. Words in the same cluster have high co-occurrence with each other. *Words replaced to protect personal information. **Words that were placed independently or had low levels of co-occurrence with the other words in the KeyGraph.
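The two quantities behind this table, word frequency and co-occurrence, can be illustrated with a minimal sketch. This is not the KeyGraph algorithm or the authors' implementation: the toy sentences, the `MIN_COUNT` threshold and the use of a sentence-level Jaccard coefficient as the co-occurrence measure are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already tokenized sentences (not taken from the interview).
sentences = [
    ["sister", "transplant", "donor"],
    ["sister", "transplant", "hospital"],
    ["sister", "donor"],
    ["hospital", "telephone"],
]

# Assumed cutoff for this toy example; the study used its own threshold.
MIN_COUNT = 2


def word_frequencies(tokenized):
    """Count how many times each word occurs across all sentences."""
    counts = Counter()
    for tokens in tokenized:
        counts.update(tokens)
    return counts


def jaccard_cooccurrence(tokenized, vocabulary):
    """Sentence-level Jaccard coefficient for every pair of words.

    For words a and b: (sentences containing both) / (sentences containing either).
    """
    containing = {word: set() for word in vocabulary}
    for index, tokens in enumerate(tokenized):
        for word in set(tokens):
            if word in containing:
                containing[word].add(index)

    scores = {}
    for a, b in combinations(sorted(vocabulary), 2):
        union = containing[a] | containing[b]
        if union:
            scores[(a, b)] = len(containing[a] & containing[b]) / len(union)
    return scores


freq = word_frequencies(sentences)
high_freq = {word for word, count in freq.items() if count >= MIN_COUNT}
scores = jaccard_cooccurrence(sentences, high_freq)

# Print pairs from strongest to weakest co-occurrence.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

In a sketch like this, pairs with high scores would be candidates for the same cluster, while a word with uniformly low scores would resemble the "Other" rows above.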
Figure 2. KeyGraph. Black and white nodes indicate words used with high and lower frequency in the interview, respectively. The solid, dashed and dotted lines indicate high, middle and low degrees of co-occurrence between nodes, respectively. Personal information was replaced with general words before submission of the manuscript. Abbreviations: NMDP, the National Marrow Donor Program; HLA, Human Leukocyte Antigen.
Figure 3. Interpretation of KeyGraph. The clusters and the keywords were extracted based on the interpretation of Figure 2. The clusters were named pre-transplant (A), emotion (B), transplant process (C) and subject's life (D). Keywords are shown as boxed text.