Atilla Ergüzen, Erdal Erdal.
Abstract
Digital medical images are widely used in health services and clinics. These data are vital for diagnosis and treatment; therefore, their preservation, protection, and archiving are a challenge. Rapidly growing file sizes, diverse data formats, and an increasing number of files constitute big data that traditional systems cannot process and store. This study investigates an efficient middle-layer platform based on a Hadoop and MongoDB architecture, using state-of-the-art technologies from the literature. We developed this system to improve our previously developed medical image compression method and to create a middle-layer platform that performs data compression and archiving operations. In this study, a scalable platform using the MapReduce programming model on Hadoop has been developed. MongoDB, a NoSQL database, has been used to satisfy the performance requirements of the platform. A four-node Hadoop cluster has been built to evaluate the developed platform and execute distributed MapReduce algorithms. Actual patient medical images have been used to validate the performance of the platform. The processing of test images takes 15.599 seconds on a single node, but on the developed platform, it takes 8.153 seconds. Moreover, owing to the medical image processing package used in the proposed method, the compression ratios produced for the non-ROI image regions are between 92.12% and 97.84%. In conclusion, the proposed platform provides a cloud-based integrated solution to the medical image archiving problem.
Year: 2018 PMID: 30034674 PMCID: PMC6033252 DOI: 10.1155/2018/3984061
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. The number of MRI by years.
Figure 2. Data amounts by years.
Figure 3. Flow chart of medical image processing package.
Figure 4. Dynamic file structure.
Figure 5. System overview.
Figure 6. System architecture.
Figure 7. Search engine flow chart.
Figure 8. MapReduce/Hadoop steps.
Figure 9. Sharded cluster method [28].
Figure 10. GridFS architecture.
Image segmentation performance comparison.
| Image | Image segment | Image region (%) | Region-based file size (MB) | Total file size (MB) |
|---|---|---|---|---|
| 1 | ROI | 29.39 | 2.560 | 8.712 |
| | Non-ROI | 70.61 | 6.152 | |
| 2 | ROI | 43.23 | 3.991 | 9.231 |
| | Non-ROI | 56.77 | 5.240 | |
| 3 | ROI | 32.12 | 2.777 | 8.647 |
| | Non-ROI | 67.88 | 5.870 | |
| 4 | ROI | 28.76 | 2.318 | 8.059 |
| | Non-ROI | 71.24 | 5.741 | |
| 5 | ROI | 27.33 | 2.318 | 8.482 |
| | Non-ROI | 72.67 | 6.164 | |
| 6 | ROI | 36.21 | 3.390 | 9.361 |
| | Non-ROI | 63.79 | 5.971 | |
| Average | ROI | 32.84 | 2.892 | 8.749 |
| | Non-ROI | 67.16 | 5.856 | |
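The segmentation figures can be cross-checked with a short script. The sketch below assumes the sizes are decimal megabytes (2.560 MB and 8.712 MB for the first image) and tests two things: that the ROI and non-ROI sizes add up to the total, and that the ROI region percentage matches the ROI share of the file size. The names `rows` and `roi_share_percent` are illustrative and do not come from the paper.

```python
# Per image: (ROI region %, ROI size MB, non-ROI size MB, total size MB),
# transcribed from the segmentation table.
rows = [
    (29.39, 2.560, 6.152, 8.712),
    (43.23, 3.991, 5.240, 9.231),
    (32.12, 2.777, 5.870, 8.647),
    (28.76, 2.318, 5.741, 8.059),
    (27.33, 2.318, 6.164, 8.482),
    (36.21, 3.390, 5.971, 9.361),
]

def roi_share_percent(roi_mb, total_mb):
    """ROI file size as a percentage of the total file size."""
    return 100.0 * roi_mb / total_mb

# Absolute difference between ROI + non-ROI and the reported total, per image.
sum_errors = [abs(roi + non_roi - total) for _, roi, non_roi, total in rows]

# Absolute difference between the reported region % and the size share, per image.
share_errors = [
    abs(roi_share_percent(roi, total) - pct) for pct, roi, _, total in rows
]

for index, (s_err, p_err) in enumerate(zip(sum_errors, share_errors), start=1):
    print(f"image {index}: sum error {s_err:.4f} MB, share error {p_err:.4f} points")
```

Under this reading, both differences stay within rounding error for all six images, which suggests the region percentages were derived from the file sizes.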
Segment-based compression ratios.
| Image | Image segment | Compression ratio |
|---|---|---|
| 1 | ROI | 3.006 |
| | Non-ROI | 92.122 |
| 2 | ROI | 2.255 |
| | Non-ROI | 97.322 |
| 3 | ROI | 3.916 |
| | Non-ROI | 96.451 |
| 4 | ROI | 3.542 |
| | Non-ROI | 95.853 |
| 5 | ROI | 2.851 |
| | Non-ROI | 97.842 |
| 6 | ROI | 3.246 |
| | Non-ROI | 94.739 |
File sizes.
| Image | Original image file size (MB) | Compressed image file size (MB) |
|---|---|---|
| 1 | 8.712 | 0.919 |
| 2 | 9.231 | 1.823 |
| 3 | 8.647 | 0.770 |
| 4 | 8.059 | 0.714 |
| 5 | 8.482 | 0.876 |
| 6 | 9.361 | 1.107 |
| Average | 8.749 | 1.035 |
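The size reduction implied by these figures can be worked out per image. This is a minimal sketch that takes the sizes as decimal megabytes (8.712 MB original and 0.919 MB compressed for the first image); `sizes_mb` and `reduction_percent` are illustrative names, not part of the described platform.

```python
# (original MB, compressed MB) per image, transcribed from the file size table.
sizes_mb = [
    (8.712, 0.919),
    (9.231, 1.823),
    (8.647, 0.770),
    (8.059, 0.714),
    (8.482, 0.876),
    (9.361, 1.107),
]

def reduction_percent(original_mb, compressed_mb):
    """Share of the original size removed by compression, in percent."""
    return 100.0 * (1.0 - compressed_mb / original_mb)

reductions = [reduction_percent(o, c) for o, c in sizes_mb]
mean_original = sum(o for o, _ in sizes_mb) / len(sizes_mb)
mean_compressed = sum(c for _, c in sizes_mb) / len(sizes_mb)

for index, value in enumerate(reductions, start=1):
    print(f"image {index}: {value:.2f}% smaller")
print(f"mean sizes: {mean_original:.3f} MB -> {mean_compressed:.3f} MB")
```

The computed means agree with the reported Average row when rounded to three decimals.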
Figure 12. Processing speeds.
Total time.
| | Single node (sec) | Proposed method (sec) |
|---|---|---|
| Image average | 15.5993 | 13.6269 |
| Total time | - | 8.1537 |
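The abstract compares the single-node time with the total time on the developed platform, so the speedup can be computed directly. The sketch below assumes the timings are 15.5993 s and 8.1537 s and uses the four-node cluster size stated in the abstract. Parallel efficiency (speedup divided by node count) is a standard derived measure added here for illustration; the source does not report it.

```python
# Timings in seconds, read from the total time table.
single_node_s = 15.5993
platform_total_s = 8.1537

# Cluster size stated in the abstract (four-node Hadoop cluster).
nodes = 4

# Speedup: how many times faster the platform is than the single node.
speedup = single_node_s / platform_total_s

# Parallel efficiency: fraction of ideal linear scaling achieved.
efficiency = speedup / nodes

print(f"speedup: {speedup:.3f}x")
print(f"parallel efficiency on {nodes} nodes: {efficiency:.1%}")
```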
Test queries.
| Query number | Query description |
|---|---|
| I | To write data (MIPP data package and indexes) |
| II | To retrieve results containing one numerical value (“HIMS ID = 52721”) |
| III | To retrieve results matching search engine criteria (“PatientName = Erdal”, “PatientSurname = Erdal”, “StartDate = 01.01.2016”, “EndData = 01.01.2017”) |
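Queries II and III could be expressed as MongoDB filter documents along the following lines. This is only a sketch: the source does not give the stored field names, so `HIMS_ID` and `StudyDate` are hypothetical, and the date criteria are assumed to be an inclusive range in day.month.year format. The filters are plain dictionaries using the standard `$gte` and `$lte` operators, so the example runs without a database; with a driver such as PyMongo they would be passed to `Collection.find`.

```python
from datetime import datetime

DATE_FORMAT = "%d.%m.%Y"  # matches values such as 01.01.2016

def hims_id_filter(hims_id):
    """Query II: match documents on one numerical value."""
    # "HIMS_ID" is a hypothetical field name.
    return {"HIMS_ID": hims_id}

def search_engine_filter(name, surname, start_date, end_date):
    """Query III: match on patient name, surname, and a date range."""
    start = datetime.strptime(start_date, DATE_FORMAT)
    end = datetime.strptime(end_date, DATE_FORMAT)
    return {
        "PatientName": name,
        "PatientSurname": surname,
        # "StudyDate" is a hypothetical field name for the searched date.
        "StudyDate": {"$gte": start, "$lte": end},
    }

query_2 = hims_id_filter(52721)
query_3 = search_engine_filter("Erdal", "Erdal", "01.01.2016", "01.01.2017")
print(query_2)
print(query_3)
```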
Write data.
| Dummy record number | Query 1 response time (ms), SQL Server | Query 1 response time (ms), sharded MongoDB |
|---|---|---|
| 1,000 | 1.37 | 1.28 |
| 10,000 | 14.98 | 8.11 |
| 100,000 | 143.08 | 76.28 |
| 1,000,000 | 1409.32 | 778.94 |
Select-indexed data.
| Dummy record number | Query 2 response time (ms), SQL Server | Query 2 response time (ms), sharded MongoDB |
|---|---|---|
| 1,000 | 1.48 | 4.83 |
| 10,000 | 6.63 | 3.56 |
| 100,000 | 33.69 | 25.72 |
| 1,000,000 | 318.50 | 276.97 |
Select nonindexed data.
| Dummy record number | Query 3 response time (ms), SQL Server | Query 3 response time (ms), sharded MongoDB |
|---|---|---|
| 1,000 | 3.23 | 7.56 |
| 10,000 | 13.66 | 13.95 |
| 100,000 | 79.97 | 92.44 |
| 1,000,000 | 814.42 | 650.97 |
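The three response-time tables (write, select-indexed, and select-nonindexed) are easier to compare as ratios. The sketch below assumes the timings are decimal milliseconds (1.37 ms and 1.28 ms for the first write row) and divides the SQL Server time by the sharded MongoDB time, so a ratio above 1 means MongoDB responded faster. The dictionary layout and names are illustrative only.

```python
# record count -> (SQL Server ms, sharded MongoDB ms), transcribed from the tables.
timings_ms = {
    "write": {
        1_000: (1.37, 1.28),
        10_000: (14.98, 8.11),
        100_000: (143.08, 76.28),
        1_000_000: (1409.32, 778.94),
    },
    "select_indexed": {
        1_000: (1.48, 4.83),
        10_000: (6.63, 3.56),
        100_000: (33.69, 25.72),
        1_000_000: (318.50, 276.97),
    },
    "select_nonindexed": {
        1_000: (3.23, 7.56),
        10_000: (13.66, 13.95),
        100_000: (79.97, 92.44),
        1_000_000: (814.42, 650.97),
    },
}

# Ratio > 1: sharded MongoDB was faster; ratio < 1: SQL Server was faster.
ratios = {
    query: {count: sql / mongo for count, (sql, mongo) in rows.items()}
    for query, rows in timings_ms.items()
}

for query, by_count in ratios.items():
    for count, ratio in by_count.items():
        faster = "MongoDB" if ratio > 1 else "SQL Server"
        print(f"{query:18s} {count:>9,d} records: ratio {ratio:.2f} ({faster} faster)")
```

Read this way, SQL Server is faster for both select queries at 1,000 records, while sharded MongoDB is faster in all three tables at 1,000,000 records.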