Liang Zhang1, Johann Li1, Ping Li2, Xiaoyuan Lu2, Maoguo Gong1, Peiyi Shen1, Guangming Zhu1, Syed Afaq Shah3, Mohammed Bennamoun4, Kun Qian5, Björn W Schuller6,7. 1. Xidian University, Xi'an, China. 2. Data and Virtual Research Room, Shanghai Broadband Network Center, Shanghai, China. 3. College of Science, Health, Engineering and Education, Murdoch University, Perth, Australia. 4. School of Computer Science and Software Engineering, The University of Western Australia, Crawley, Australia. 5. School of Medical Technology, Beijing Institute of Technology, Beijing, China. 6. GLAM - Group on Language, Audio & Music, Imperial College London, London, UK. 7. Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany.
Abstract
In the past decade, deep learning (DL) has achieved unprecedented success in numerous fields, such as computer vision and healthcare. In particular, DL is increasingly used in advanced medical image analysis applications for segmentation, classification, detection, and other tasks. On the one hand, researchers with medical, clinical, and informatics backgrounds have a pressing need to share their knowledge, skills, and experience in order to leverage DL for medical image analysis. On the other hand, barriers between disciplines stand in their way and often hamper full and efficient collaboration. To this end, we propose a novel open-source platform, MEDAS, the MEDical open-source platform As Service. To the best of our knowledge, MEDAS is the first open-source platform that provides collaborative and interactive services which let researchers from a medical background use DL-related toolkits easily and let scientists and engineers from informatics build models faster. Based on the idea of RINV (Rapid Implementation aNd Verification), the platform implements tools for pre-processing, post-processing, augmentation, visualization, and other phases needed in medical image analysis. Five tasks, concerning lung, liver, brain, chest, and pathology, are validated and shown to be efficiently realizable with MEDAS. MEDAS is available at http://medas.bnc.org.cn/.
Deep learning is the current cutting-edge technique in computer vision, natural language processing, and other areas, particularly healthcare. Thanks to its power, researchers can use a regular pipeline to process and analyze data and obtain excellent results. For instance, many recent studies apply deep learning, especially in medical image analysis [20, 38, 49, 87, 97, 100]. However, most researchers who use deep learning for medical image-related tasks are professionals in computer science, not medicine. Lacking computer-related knowledge, medical researchers find it hard to understand and apply deep learning on their own for tasks such as tumor segmentation and nuclei classification. Computer science researchers, in turn, cannot fully analyze their results without the help of medical researchers. This gap between computer science and the medical field creates a bottleneck for the use of deep learning in medical image analysis.
A program, designed by a programmer, is a series of instructions that operate the hardware. Programming is the skill of converting what one wants to do into such instructions. However, directly operating the hardware with instructions is very difficult for most people. Thus, many concepts have been created, such as subroutines, dynamically linked libraries, shared objects, compilers, and frameworks. These concepts capture the importance of wrapping and reuse: programmers can avoid building everything from scratch and focus instead on the problem at hand. TensorFlow [1], ITK [33], and OpenCV [65] are typical examples that help researchers simplify the implementation of their programs in deep learning and image analysis.
In medical areas, ITK [33], ANTs [4], FSL [35], DeepNeuro [6], and NiftyNet [24] are the prevalent toolkits, libraries, and frameworks that help researchers develop programs to analyze medical images and data. They can help to register, process, visualize, and analyze images. However, interested medical researchers still need a significant level of programming skill to apply them in their research.
Generally, when using the computer as a tool to solve a problem, the approaches can be categorized into three levels. The first level is to control the computer by programming from the bottom up; the second level is to solve the problem by combining existing libraries via programming; the third level is to interact with out-of-the-box software via a user interface. The first two levels require expert programming skills, which limits the use of the computer by non-professionals. This creates a challenging situation for these studies.
However, a closer look at most use cases of deep learning-based medical image analysis shows that pre-processing, augmentation, neural networks, post-processing, visualization, and debugging are the commonly used pipeline stages, and that medical researchers are not the ones who create these tools but the ones who set their parameters and use them. Furthermore, with visualization programming, those researchers could simply combine these tools and assemble their models without programming when applying deep learning in their studies.
Nevertheless, these frameworks and toolkits are not integrated as a system. Researchers need to assemble their programs piece by piece with their own programming skills, unlike with out-of-the-box tools such as Microsoft Excel and IBM’s SPSS.
To help researchers build deep learning models easily, we propose the MEDical open-source platform As Service (MEDAS). MEDAS provides a collaborative and interactive platform that allows researchers to work together to build their algorithms by coding or visualization programming.
The main idea of MEDAS is to provide a scalable platform and integrate a set of tools that cover the implementation of deep learning models for medical image analysis. Moreover, MEDAS not only provides tools, utilities, functions, and modules commonly used in deep learning but can also help researchers manage their computing resources and refine their models. Currently, MEDAS primarily provides tools and components for classification, detection, and segmentation tasks on MRI, CT, and pathology images.
The remainder of this paper is organized as follows. Section 2 introduces the related work on deep learning, medical image analysis, Docker, and other technologies. Section 3 expounds on our main idea of rapid implementation and verification, i.e., RINV. Section 4 discusses the main components that MEDAS provides for users to implement their algorithms and models. Section 5 introduces the utilities that MEDAS provides to simplify programming, management, and refining. Section 6 introduces several case studies of MEDAS, including pulmonary nodule detection and attribute classification, liver contour segmentation, multi-organ segmentation, Alzheimer’s disease classification, and nuclei segmentation. Finally, Section 7 provides a discussion of open questions, and Section 8 concludes this paper.
Related work
In this section, we introduce (1) the related toolkits and software for medical image analysis, (2) the deep learning frameworks used in most relevant works, and (3) other related technologies and software.
Toolkits for medical image analysis
For the analysis of medical images, many institutions, companies, and researchers have created toolkits; we list some commonly used ones.
ANTs: Advanced Normalization Tools [4] is a toolkit for brain images that provides functions to visualize, process, and analyze multi-modal images of the brain.
FreeSurfer: FreeSurfer [22] is an open-source toolkit for processing and analyzing MR images. It includes functions for skull stripping, image registration, subcortical segmentation, cortical surface reconstruction, cortical segmentation, cortical thickness estimation, longitudinal processing, fMRI analysis, tractography, and GUI-based visualization.
ITK: The Insight Segmentation and Registration Toolkit [52] is a popular toolkit widely used in medical image analysis. The functions provided by ITK include basic operations on medical images, visualization, pre-processing, registration, and segmentation. It is implemented in C++ and offers templates and bindings for Python, Java, and other languages.
Deep learning-based medical image toolkits
We list some deep learning-based toolkits and software focused on medical image analysis.
DeepNeuro: DeepNeuro [6] is an open-source toolkit that provides out-of-the-box algorithm modules and applications based on deep learning.
MIScnn: Medical Image Segmentation with Convolutional Neural Networks [62] is a recently released toolkit that targets medical image segmentation based on convolutional neural networks and deep learning. It provides pipelines and programming-based methods to help users create their dedicated models.
NiftyNet: NiftyNet [24] is another open-source toolkit, similar to DeepNeuro, that provides a series of components such as dataset splitting, data augmentation, data processing, pre-designed networks, and evaluation metrics. NiftyNet targets medical image analysis with deep learning.
Deep learning frameworks
Deep learning frameworks help researchers avoid spending time on the implementation and verification of low-level algorithms. Here, we list the most popular deep learning frameworks.
Caffe: Jia et al. created the Caffe framework [36], whose name is an abbreviation of Convolutional Architecture for Fast Feature Embedding. It provides an open-source deep learning framework that bridges the gap between different devices and platforms.
PyTorch: Torch is a scientific computing framework with wide support for machine learning algorithms on the GPU. Later, Facebook released a deep learning framework named PyTorch [65, 66], which puts Python first. It is now one of the most popular deep learning frameworks among researchers.
TensorFlow: Google released a deep learning framework named TensorFlow [1], aimed at tensor-based deep learning. TensorFlow is based on dataflow graphs and can run on different devices, including CPUs, GPUs, and Google’s TPUs. It is widely used in both research and industry because it scales from a personal computing device to server clusters. Moreover, Google has also open-sourced several tools for TensorFlow, for example, TensorBoard.
Docker and visual programming
Docker [32] is a container platform and an industrial-level resource management solution. Docker takes on the task of managing computing resources, which frees its users to focus on their research. It allows containers to be launched in a short time, and it allows a large number of applications to run on a host without affecting the host itself.
NVIDIA released nvidia-docker [78] in 2015, which makes it possible to use a CUDA-enabled GPU in Docker containers. In this way, researchers can use the GPU to accelerate algorithms in Docker.
Kubernetes [90] is one of the best-known cluster management systems for Docker containers; it saves users from managing many workstations or servers by hand. Users can simply submit their tasks, run them on a machine, and supervise them through a web-based user interface.
Visualization programming allows users to create programs by manipulating program pipelines graphically or by dragging and dropping elements, as in Unreal Engine’s Blueprints Visual Scripting [56] and Scratch [55]. It allows novice programmers or researchers unfamiliar with programming to build deep learning models quickly by drag-and-drop operations.
Rapid implementation and verification
The basic motivation behind MEDAS is to make the application of deep learning easier for both computer and medical researchers in their work. Applying deep learning-based methods requires many computer-related skills and much knowledge. To use these methods, researchers need to know how to configure the software and hardware, how to program with the libraries and frameworks, and how to carry out other advanced operations. Thus, we provide MEDAS as out-of-the-box software that offers a way to implement and verify models while hiding the details of configuring low-level software and hardware.
We call this idea of implementing and verifying a model “Rapid Implementation aNd Verification” (RINV). RINV covers the workflow from the sketch to the final program and results. Based on this idea, MEDAS provides tools and utilities that simplify implementation and verification so that researchers can focus on the model and algorithm. We introduce RINV in this section, and Sects. 4 and 5 present the tools and utilities based on RINV.
Just as medical researchers do not have to build a CT scanner before they can scan, they should not be required to spend unnecessary time implementing deep learning algorithms before using them. Most algorithms and mathematical models are combinations of sub-algorithms, sub-pipelines, and other models. For example, as shown in Fig. 1, one type of “convolution block”, which is widely used in deep learning, is composed of a series of sub-layers. Thus, for medical researchers without knowledge of deep learning and computer science, combining existing models and setting their parameters is the best way to apply deep learning methods in their research, for example, by simply dragging and dropping with visualization programming.
Fig. 1
A version of the “convolution block”. It combines convolution layers, ReLU activation functions, and a batch normalization layer
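As a rough illustration of how such a block is assembled from sub-layers, the following sketch chains a 1-D convolution, batch normalization, and ReLU in plain Python. It is not MEDAS code: the function names (`conv1d`, `batch_norm`, `relu`, `conv_block`), the 1-D setting, the absence of learned parameters, and the layer order are all assumptions made for brevity.

```python
import math


def conv1d(signal, kernel):
    """Valid (no padding) 1-D cross-correlation, the 'convolution' of deep learning layers."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def batch_norm(values, eps=1e-5):
    """Normalize values to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def relu(values):
    """Replace negative values by zero."""
    return [max(0.0, v) for v in values]


def conv_block(signal, kernel):
    """Hypothetical block: convolution, then batch normalization, then ReLU."""
    return relu(batch_norm(conv1d(signal, kernel)))


features = conv_block([1.0, 2.0, 4.0, 8.0, 16.0], [1.0, 1.0])
```

A user of a visual programming interface would only choose the block and its parameters (here, the kernel); the sub-layers stay hidden inside `conv_block`.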
Assembling algorithms and models from existing pieces only simplifies the implementation. To make them work, the hardware and software must also be configured and managed correctly. We call this “resources auto-management”.
Generally, many steps are involved in converting an idea, a formula, or a model into a program or even a basic system. This transformation can be split into four tiers, as shown in Fig. 2. At tier one, researchers need to do everything themselves: implement and verify the algorithms with C++ and assembly, convert mathematical formulas into a program, make sure the program runs on the correct device, manage computing resources, visualize results, and so on. At tier two, researchers can use basic algorithm toolkits to implement complex programs but still need to manage the device resources manually. At tier three, the management of computing resources is handled automatically. Tier four aims to convert mathematical formulas to results directly.
Fig. 2
There are four tiers of deep learning model development. Tier one is the implementation with C/C++ and assembly, such as for cuDNN [13]. The next tier is the combination of the basic blocks, for example, by using TensorFlow or PyTorch. Tier three includes the management of resources to help users focus on the model itself. Tier four aims to convert formulas to a program directly, meaning the implementation and verification are automatically completed by the software
Tier four is a moonshot and remains a utopian design. However, researchers mostly prefer tier four because it requires no coding and yields results easily. Our aim for MEDAS is to achieve the functions of tier three: to provide efficient tools with which users can implement and verify their models and algorithms, and to help them manage their resources efficiently.
Verification, in this context, differs from implementation and testing in software development. The verification of deep learning-based methods and applications focuses on two points: (1) the evaluation of results by metrics and visualization and (2) interpretability. Therefore, we add visualization, analysis, and interpretability functions to MEDAS.
Why does RINV work?
MEDAS focuses on the application of deep learning-based algorithms to medical image analysis, primarily the classification, detection, and segmentation tasks. The code for these tasks shares a similar architecture with three key parts: (1) how to process medical data, (2) how to design and train the model, and (3) how to optimize parameters.
For non-computer researchers, the first two challenges in running their code are (1) configuring the environment of their computer and (2) writing the code, particularly the non-deep-learning parts. MEDAS can configure the environment for them, and, for both computer and non-computer researchers, it avoids repetitive coding through reuse of the tools it provides. MEDAS models the program that trains a deep learning algorithm as seven parts: dataset management, pre-processing, data augmentation, neural network, post-processing, visualization, and training components. MEDAS provides components for these parts so that researchers can reuse them and thereby simplify the implementation and verification of their algorithms. The details of these components are given in the following sections.
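The idea of reusing components can be sketched as a pipeline of named, interchangeable steps. The example below is a minimal illustration under our own assumptions, not the MEDAS interface: `run_pipeline` and the toy lambda components are hypothetical stand-ins for four of the seven parts.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable]


def run_pipeline(steps: List[Step], data):
    """Apply each named step in order and record which steps were executed."""
    log = []
    for name, func in steps:
        data = func(data)
        log.append(name)
    return data, log


# Toy stand-ins for reusable components (hypothetical, for illustration only).
steps = [
    ("pre-processing", lambda xs: [x / 255.0 for x in xs]),            # scale intensities
    ("augmentation", lambda xs: xs + xs[::-1]),                        # append mirrored copy
    ("neural network", lambda xs: [1.0 if x > 0.5 else 0.0 for x in xs]),  # dummy classifier
    ("post-processing", lambda xs: [int(x) for x in xs]),              # cast to labels
]

result, log = run_pipeline(steps, [0, 128, 255])
```

Because every step has the same shape (data in, data out), a component can be swapped or reordered without touching the others, which is the property that drag-and-drop composition relies on.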
Core: tools of deep learning
Similar to existing toolkits and frameworks, MEDAS provides a series of tools that allow users to create algorithms and models, following the idea of “rapid implementation and verification”, by combining bricks. We introduce these tools in this section; the whole architecture and other utilities of MEDAS are presented in Sect. 5.
After analyzing the deep learning pipelines in our own and others’ research [10, 15, 29, 47, 71, 72, 85, 102, 103], we found that the pipelines in these studies share similarities. The workflow of medical image processing is relatively fixed. For most algorithms and methods, for example, graph cut, the workflow usually includes: (1) dataset management; (2) pre-processing [5, 38, 64, 91, 99]; (3) augmentation [38, 50, 67, 83]; (4) the kernel algorithm; (5) post-processing [38]; and (6) visualization [101] and other operations.
Each step or component has its own processing purpose, and the importance of these pipelines is obvious. Figure 3 shows the workflow of typical deep learning-based medical image processing pipelines.
Fig. 3
The general flow of an application of deep learning for MRI image analysis. The flow shows pipelines and components about pre-processing, post-processing, augmentation, evaluation, visualization, and others
MEDAS implements a series of tools to meet these requirements, including pre-processing, post-processing, data augmentation, artificial neural network, visualization, and other tools.
Pre-processing
As its name implies, pre-processing is the step before the training of neural networks; it includes feature processing and data processing. Typical examples of feature processing are feature extraction, noise reduction, data normalization, and modality registration, while typical data processing includes format conversion, annotation transformation, and others. We implement the necessary tools to help researchers process the data before they train their models.
Medical images usually exhibit data bias. For radiography, such as CT and PET, the images are noisy because of differences in equipment, operators, and protocols [73, 80]. Therefore, MEDAS implements the commonly used registration tool and the N4 bias field correction tool [94] to process data.
For pathology, differences in stain concentration and brand might cause different results in images [54, 74, 77]. Thus, the stain normalization tool [95] and the stain deconvolution tool [77] are provided.
Furthermore, for general purposes, the normalization tool, resample tool, rescale tool, mask generating tool, resize tool, and other tools are implemented. To process data files, MEDAS also implements a series of tools, including format conversion, annotation conversion, and others.
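To make two of the general-purpose tools concrete, the sketch below shows what a normalization step and a rescale step typically compute on a list of intensities. The function names `normalize` and `rescale` and their parameters are our assumptions for illustration; the actual MEDAS tools operate on image volumes and may differ.

```python
import math


def normalize(values, eps=1e-8):
    """Z-score normalization: shift to zero mean and scale to unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / (std + eps) for v in values]


def rescale(values, low=0.0, high=1.0):
    """Min-max rescaling of intensities into the range [low, high]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant image carries no contrast; map everything to the lower bound.
        return [low for _ in values]
    scale = (high - low) / (vmax - vmin)
    return [low + (v - vmin) * scale for v in values]


# Example: intensities in a CT-like range mapped into [0, 1].
scaled = rescale([-1000.0, 0.0, 1000.0])
```

Z-score normalization removes scanner-dependent offsets and gains, whereas min-max rescaling fixes the numeric range expected by a network; which one is appropriate depends on the modality.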
Augmentation
Usually, datasets in the medical areas are considerably smaller than those in other areas [7, 44, 81]. Public medical image datasets generally have 100 to 1000 cases, while other datasets, for example, for 3D object detection [98], usually feature thousands or even millions of samples. Medical image datasets “always” lack data compared to other areas because collecting and annotating medical images takes too much time, cost, and manpower. Therefore, augmentation is necessary to enlarge the dataset.
Augmentation is an efficient method to make the model more robust, not only in medical image analysis but also in other areas. Mirroring, rotating, cropping, and deep learning-based methods, for example, Generative Adversarial Networks, are frequently used. Augmentation diversifies the data by making it look “different”, which can improve model performance [25, 60]. The key to augmentation is that the distribution of the data expands so that the robustness of the model increases.
MEDAS provides general transformation tools, Gaussian random noise, rescaling, and others. Gaussian random noise uses noise to enhance the robustness of the model, while other tools reduce the model’s sensitivity to scale and bias by resampling and transforming the distribution of the data.
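Two of the augmentations mentioned above, mirroring and Gaussian random noise, can be sketched on a small 2-D image stored as a list of rows. The helper names (`mirror`, `add_gaussian_noise`, `augment`) and the default noise level are assumptions chosen for illustration, not the MEDAS tools themselves.

```python
import random


def mirror(image):
    """Flip a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def add_gaussian_noise(image, sigma=0.05, seed=None):
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    rng = random.Random(seed)  # a private generator keeps the result reproducible
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]


def augment(image, sigma=0.05, seed=None):
    """Return the original image together with a mirrored and a noisy variant."""
    return [image, mirror(image), add_gaussian_noise(image, sigma, seed)]


image = [[0.0, 1.0], [2.0, 3.0]]
variants = augment(image, sigma=0.1, seed=42)
```

Each call triples the number of samples in this toy setting; in practice the transformations are usually applied on the fly during training rather than stored.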
Artificial neural network
The neural network is the most important part of deep learning. MEDAS provides several tools integrated with different types of neural networks for training and inference. In addition, MEDAS plans to integrate a neural architecture search tool, which aims at automatically designing neural networks for specific tasks.
Neural network (model) training is a fixed workflow that includes forward propagation, loss calculation, and backward propagation [46]. The network itself is built by connecting “blocks” such as the “max-pooling layer”, “convolution layer”, “fully connected layer”, “ResBlock”, “Dense Block”, and so on [2, 28, 31, 43, 76, 84, 88]. The loss function influences the search in the parameter space, and different loss functions suit different tasks. Because medical researchers intend to apply the neural network merely as a tool, they are considered users rather than developers. Therefore, tools with pre-designed models can be the best choice for researchers who want to focus on the application side.
Since a few neural networks have achieved significant success in many medical image analysis tasks, MEDAS implements those networks as tools for segmentation, classification, and other tasks. For instance, the 3D Mask RCNN [27] and 3D Dual-Path Net [12] are integrated for the detection and classification tasks on radiography images. The U-Net [76] and V-Net [58] are integrated for the segmentation task. In addition, the U-Net can also be used in classification tasks. Other similar neural networks are also integrated for these tasks.
Although the primary framework currently supported by MEDAS is PyTorch, MEDAS also supports other frameworks, such as TensorFlow. MEDAS implements a compatibility layer so that heterogeneous models can each be trained in their own framework. To reuse a trained model, users load the model parameters saved during training and execute the model. MEDAS manages models that are encapsulated as tools, and the model parameters are archived in the storage of MEDAS.
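The fixed training workflow described above (forward propagation, loss calculation, backward propagation) can be illustrated on the smallest possible model. The sketch below fits a linear model y = w·x + b with a mean squared error loss and hand-derived gradients; the function name `train_linear`, the learning rate, and the number of epochs are assumptions for illustration. In frameworks such as PyTorch or TensorFlow, the backward step is computed automatically.

```python
def train_linear(xs, ys, lr=0.05, epochs=1000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(epochs):
        # Forward propagation: compute predictions with the current parameters.
        preds = [w * x + b for x in xs]
        # Loss calculation: mean squared error between predictions and targets.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        losses.append(loss)
        # Backward propagation: gradients of the loss with respect to w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Parameter update.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


# Data generated from y = 2x + 1; training should recover w close to 2 and b close to 1.
w, b, losses = train_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Changing the loss expression changes which parameters are preferred, which is the sense in which the loss function steers the search in the parameter space.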
Post-processing
Post-processing is a strategy that can improve the result. For segmentation tasks, post-processing can make predictions “smoother”. For example, [26] employed an FCN-based neural network, which is simpler than U-Net and V-Net but achieves better performance than a pure U-Net. The key to its success is post-processing: it uses “horizontal and vertical gradient maps”, an “energy landscape”, and other features and then applies the watershed algorithm. Furthermore, the Conditional Random Field [11], Graph Cut [37], and other traditional algorithms can also be used as post-processing to optimize the results of a neural network.
In a few cases, the output of the neural network is a probability or a probability map. Tools such as binary normalization can be used for the classification and segmentation tasks and can reach better results than a simple threshold.
Besides the post-processing tools introduced above, MEDAS also provides another series of post-processing tools for the neural network itself. The model compression and pruning tools can help researchers generate smaller and faster models with better accuracy. MEDAS employs the following tools: (1) parameter pruning and sharing [14, 19, 45, 96]; (2) low-rank factorization [18, 89]; and (3) knowledge distillation [16, 48].
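As an illustration of the first family, the sketch below implements magnitude-based pruning, one simple form of parameter pruning in which the weights with the smallest absolute values are set to zero. This is a generic textbook variant chosen by us for illustration; it is not necessarily the method used by the cited works or by MEDAS, and the name `prune_by_magnitude` is hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices of the weights, ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)  # copy, so the original weights stay untouched
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


pruned = prune_by_magnitude([0.5, -0.1, 0.05, -2.0], sparsity=0.5)
```

The zeroed weights can then be stored sparsely or skipped at inference time; in practice, pruning is usually followed by a short fine-tuning phase to recover accuracy.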
Visualization
Generally speaking, visualization can be categorized into result visualization, metric visualization, and analysis visualization.
Result visualization shows the neural network output and maintains important links between the model and the clinical side [30, 34, 51, 101, 104]. In medical image analysis, the input and output of the neural network are not color 2-D images. It is hard to show them directly, in particular when we want to analyze the relationship between the input and the output. Thus, well-designed visualization tools, such as segmentation visualization, mesh-based image visualization, point cloud-based lesion visualization, and others, can help users present and analyze their research in relation to its clinical aspects.
For metric visualization, MEDAS implements tools that record the metrics and visualize them as images, for example, the loss visualization tool.
For analysis visualization, MEDAS implements a series of tools for different kinds of tasks, including saliency visualization, attention visualization, feature visualization, gradient propagation visualization, t-SNE visualization, sensitivity analysis, and other visualization tools.
Others
MEDAS also includes other tools, for example, for dataset management. For instance, users who want to split their data into a training set and a testing set can use the dataset split tool.
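What a dataset split tool does can be sketched in a few lines. The function below is a hypothetical stand-in, not the MEDAS tool: the name `split_dataset`, the default test fraction of 0.2, and the fixed seed are assumptions made for illustration.

```python
import random


def split_dataset(cases, test_fraction=0.2, seed=0):
    """Shuffle the cases reproducibly and split them into (training set, testing set)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a reproducible split
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train_cases, test_cases = split_dataset(range(10), test_fraction=0.2)
```

For medical data, the split is normally made per patient rather than per image, so that images of the same patient never appear in both sets; the list passed in here would then contain patient identifiers.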
Architecture of MEDAS
Unlike traditional toolkits or frameworks, MEDAS is a system rather than just a collection of functions and tools. Researchers can use traditional toolkits and frameworks only via programming, whereas MEDAS provides visualization programming to help researchers implement their algorithms and models intuitively and easily. In the following subsections, we discuss visualization-based programming, auto-machine learning, the Python API, resource management, and other features of MEDAS.
Figure 4 shows the architecture of MEDAS. The figure is organized from the bottom (machine) to the top (user) and depicts each component of MEDAS, including visualization programming, the Python API, tools, auto-machine learning, and resource management. Users can interact with MEDAS via the Python API or visualization programming; the latter provides more of the functions integrated by MEDAS, such as auto-machine learning. Resource management is the part of MEDAS that does not provide any application programming interface. It controls task scheduling and device allocation and interacts directly with the machine.
Fig. 4
The general architecture of MEDAS: from the bottom (machine) to the top (user). The user can use MEDAS and its tools (Sect. 4), auto-machine learning (Sect. 5.2), resources management (Sect. 5.4), and other components via Python API (Sect. 5.3) or visualization programming interface (Sect. 5.1).
Moreover, we introduce the technical details of the implementation of MEDAS.
Visualization programming
Before the invention of the graphical user interface (GUI), anyone who wanted to use a computer needed to operate the machine themselves or rely on a professional operator. At that time, operators were computer experts who dressed in formal attire and worked in a dedicated room to handle scientific problems for other scientists, and the interface of the computer was a teletypewriter-based or monitor-based terminal. The computer has its own rules, but these rules were broken with the rise of the GUI. Software developers convert instructions from “human rules” to “computer rules”, which is called “implementation”. GUI-based software can efficiently help non-professionals translate their ideas from “human rules” to “computer rules”, execute them, and show the results of the execution.
Usually, a GUI is considerably more intuitive than a command-line interface (CLI) or any other text-based interface, especially for people who are unfamiliar or less familiar with computers. If well designed, a GUI can make the operation of tools simpler, the visualization of results more accessible, and users more efficient in ways that a CLI cannot.
MEDAS provides a web-based interface for researchers to manage and browse their tasks and data. The interface includes a visualization programming module, where researchers can implement their models by dragging, dropping, and connecting components. Because it is web-based, researchers can access MEDAS from anywhere with an Internet connection; no dedicated client is needed, as a web browser suffices.
Auto-machine learning
Backbone design and hyper-parameter search are key to reaching the current state of the art with deep learning. However, the design and refinement of a model are not trivial. Therefore, MEDAS integrates auto-machine learning utilities.
Optimizing the hyper-parameters of deep learning models is not a straightforward task and requires in-depth expertise. Generally, the parameters of a deep learning model can be optimized by gradient descent. However, the hyper-parameters need to be tuned manually, and the model also needs to be designed by hand. The first challenge of hyper-parameter optimization is its search space. The hyper-parameters include discrete and continuous values, and the relationship between hyper-parameters and results cannot be formulated in closed form to obtain an analytic solution. Thus, the search space is too large to enumerate and search directly. The second challenge is that a hyper-parameter can only be changed after hours or days of training. Therefore, the optimization takes days or weeks and several attempts to find an acceptable hyper-parameter set. The third challenge is that the models for different medical image tasks differ, so the distribution of good hyper-parameters changes from model to model, which means there is no general search algorithm that finds the best ones. These challenges make hyper-parameter optimization difficult; it is a kind of alchemy that lacks regular rules. With that in mind, MEDAS employs automated hyper-parameter optimization based on Bayesian optimization.
When we optimize the hyper-parameter $\theta$ of the model $f$, we actually need to optimize another model $g(\theta)$, which represents the best score of the metric for the function $f$ with the hyper-parameter $\theta$, to obtain the optimal hyper-parameters. It is hard to deduce the analytical formula of $g$; hence, we use a set of functions $G$ to estimate the distribution of $g$, as Fig. 5 shows. After training the original model and obtaining the result $g(\theta)$ for a given hyper-parameter, we can remove the functions that do not fit the result. Then, we get a subset $G' \subset G$. After several iterations, the distribution of $G'$ approximates the final one. Ultimately, we can obtain an approximation of the optimal hyper-parameters.
Fig. 5
The principle of the provided Bayesian-based automatic hyper-parameter search. The figure shows the prediction over the hyper-parameter after two observations. Two blue points show the observations of $x$; the black line presents the posterior mean of the prediction; the dashed line is the objective function $g$; the green area represents the possible functions, while the blue area is the acquisition function. The maximum point of the acquisition function is the next value of the hyper-parameter to be evaluated. We use a set of sine functions to explain how Bayesian optimization searches the hyper-parameter. The key idea of Bayesian optimization is the iterative repetition of fitting and search. Methods such as the Gaussian process and the regression random forest are employed for fitting the data $(x, y)$, where $x$ denotes the hyper-parameter $\theta$ and $y$ denotes the performance of the model, i.e., $g(\theta)$. An acquisition function, such as Expected Improvement or Upper Confidence Bound, is employed for searching the next best $x$ of the model.
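The fit-and-search loop described above can be illustrated with a minimal sketch. It is not the MEDAS implementation: it assumes a single continuous hyper-parameter, a Gaussian process with a fixed squared-exponential kernel as the fitting method, Expected Improvement as the acquisition function, and a sine curve as a stand-in for "train the model and report the metric". All function names are hypothetical.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_on = rbf_kernel(x_obs, x_new)
    solved = np.linalg.solve(k_oo, k_on)           # K^-1 k(x_obs, x_new)
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_on * solved, axis=0)      # prior variance is 1
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mean, std, best):
    """Expected Improvement for a maximization problem."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def bayes_opt(objective, low, high, n_init=2, n_iter=10, seed=0):
    """Alternate between fitting the surrogate and maximizing the acquisition."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(low, high, 200)             # candidate hyper-parameters
    x_obs = rng.uniform(low, high, size=n_init)    # initial observations
    y_obs = np.array([objective(x) for x in x_obs])
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        ei = expected_improvement(mean, std, y_obs.max())
        x_next = grid[np.argmax(ei)]               # next point to evaluate
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best]


# Toy objective (an assumption): a sine curve instead of a real training run.
best_x, best_y = bayes_opt(lambda x: math.sin(3.0 * x), 0.0, 2.0)
```

Each iteration costs one evaluation of the objective, which in practice is a full training run; this is why the acquisition function is used to choose the next point carefully rather than sampling the search space blindly.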
Python API
Data, format, input, and output
Different modalities of medical images have different formats. Therefore, the tool employs SimpleITK and OpenSlide [79] to handle the different formats of medical images. Furthermore, MEDAS can load and save Portable Network Graphics images (both single images and series of images) and Numpy objects.
Plug and slot
The inputs to a tool might be all kinds of files, numbers, or just a Numpy array. Therefore, MEDAS employs a “plug” and a “slot” to handle these heterogeneous inputs and deliver them to the kernel function in a uniform form. The plug takes charge of processing the inputs, while the slot receives the processed inputs and passes them to the kernel function, where the computation takes place. The plug automatically converts the formats of inputs. For example, when the input is a string but the tool accepts a floating-point number, the plug will try to parse the string.
Constructor
Similar to the input, the output also comes in different formats. Therefore, MEDAS employs a “constructor” to process the result of the kernel function. The constructor converts the results to different formats, including DICOM, NIfTI, and Numpy array. A variable simply passes through the variable constructor to the following modules, while the image constructor saves the data to an image file or passes it to the following modules.
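The division of labor between plug, slot, kernel function, and constructor can be sketched as follows. This is only an illustration of the idea under our own assumptions; the class and function names (`Tool`, `float_plug`, `array_plug`, `variable_constructor`) are hypothetical and are not the MEDAS API.

```python
import numpy as np


def float_plug(raw):
    """Plug: convert a raw input (string, int, float) to a float."""
    return float(raw)


def array_plug(raw):
    """Plug: convert a raw input (list, tuple, array) to a NumPy array."""
    return np.asarray(raw, dtype=np.float64)


def variable_constructor(result):
    """Constructor: pass a plain value on to the following module unchanged."""
    return result


class Tool:
    """A tool whose slots map input names to plugs around a kernel function."""

    def __init__(self, kernel, slots, constructor=variable_constructor):
        self.kernel = kernel
        self.slots = slots              # slot name -> plug
        self.constructor = constructor

    def __call__(self, **raw_inputs):
        # Each slot runs its plug, then hands the converted value to the kernel.
        converted = {name: plug(raw_inputs[name]) for name, plug in self.slots.items()}
        return self.constructor(self.kernel(**converted))


def scale_kernel(image, factor):
    """Kernel function: the actual computation on clean, typed inputs."""
    return image * factor


scale = Tool(scale_kernel, {"image": array_plug, "factor": float_plug})
scaled = scale(image=[[1, 2], [3, 4]], factor="0.5")   # the string is parsed to 0.5
```

Keeping the conversion logic in the plugs means the kernel function can assume typed inputs, so the same kernel can be reused from a web form (strings) and from Python code (arrays and numbers).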
Computing backend
MEDAS implements its algorithms with Numpy, OpenCV, and other libraries rather than in C/C++. Low-level implementation of the algorithms is not a high priority, owing to limited time and manpower. However, there is a reserved feature, the “computing backend”, inspired by TensorFlow’s design. Faster implementations for CPU, GPU, FPGA, or other devices can be added to the system through the computing backend in later development, and a backend can be selected when an instance is initialized.
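One possible shape of such a reserved backend mechanism is a registry that is consulted when an instance is initialized. The sketch below is an assumption about how this could look, not the actual MEDAS design; `register_backend`, `Tools`, and the `normalize` operation are hypothetical names.

```python
import numpy as np

_BACKENDS = {}


def register_backend(name, operations):
    """Register a mapping from operation names to implementations."""
    _BACKENDS[name] = operations


class Tools:
    """Hypothetical tool collection bound to one backend at initialization."""

    def __init__(self, backend="numpy"):
        if backend not in _BACKENDS:
            raise ValueError(f"unknown backend: {backend}")
        self._ops = _BACKENDS[backend]

    def normalize(self, image):
        return self._ops["normalize"](image)


def _normalize_numpy(image):
    """Reference implementation: min-max normalization with NumPy."""
    image = np.asarray(image, dtype=np.float64)
    span = image.max() - image.min()
    return (image - image.min()) / span if span > 0 else np.zeros_like(image)


register_backend("numpy", {"normalize": _normalize_numpy})

tools = Tools(backend="numpy")
normalized = tools.normalize([0, 5, 10])
```

A faster GPU or FPGA implementation would register its own operations under a different name, and user code would only change the `backend` argument.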
Continuous programming
Inspired by the Either monad in Haskell [57, 70], MEDAS implements an abstract class named “Either”, which aims at processing results and errors. Like its Haskell counterpart, the “Either” of MEDAS has two states: success and failure. The tools execute one by one, and the current tool can execute only if the previous execution was successful. For example, setting up parameters must succeed before the calculation starts.
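The behavior of such an “Either” can be sketched in a few lines of Python. This is a minimal illustration of the concept, assuming a `bind` method for chaining; the class and function names are hypothetical and do not reproduce the MEDAS class.

```python
class Either:
    """Base class: a result that is either a success or a failure."""

    def bind(self, func):
        raise NotImplementedError


class Success(Either):
    def __init__(self, value):
        self.value = value

    def bind(self, func):
        # Run the next step; turn an exception into a Failure.
        try:
            return func(self.value)
        except Exception as exc:
            return Failure(exc)


class Failure(Either):
    def __init__(self, error):
        self.error = error

    def bind(self, func):
        # Skip every later step and keep the first error.
        return self


def set_up(params):
    """Setup step: validate the parameters before any calculation."""
    if params.get("learning_rate", 0) <= 0:
        return Failure(ValueError("learning_rate must be positive"))
    return Success(params)


def calculate(params):
    """Calculation step: runs only if the setup step succeeded."""
    return Success(params["learning_rate"] * 10)


ok = Success({"learning_rate": 0.01}).bind(set_up).bind(calculate)
bad = Success({"learning_rate": -1}).bind(set_up).bind(calculate)
```

In the second chain, `calculate` is never called: the failure produced by `set_up` is carried to the end, so the caller checks for errors once instead of after every step.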
Others
Logging MEDAS employs a flexible logging system, which can output to a terminal or store the logs in the system. Such a logging system helps users to monitor, diagnose, and debug models flexibly.
Testing suite MEDAS provides a small kit for testing, with which modules included in MEDAS or provided by third parties can be tested. At the same time, we employ tools to test MEDAS automatically, a practice known as continuous integration.
Resource management
Resource management is important in deep learning, medical image analysis, and other similar tasks. Consider the following situations. When a researcher uses one computer with one GPU, management simply means that the researcher starts and terminates tasks. When two researchers share one computer with a GPU, they need to communicate to schedule their individual tasks. When several users share GPU clusters, the situation rapidly becomes complicated. One may easily imagine a typical scenario where every user wants to use more resources and complete their tasks as quickly as possible.
The computing resources include not only GPUs, but also storage, memory, bandwidth, software, and even energy. Cloud computing, grid computing, IaaS, PaaS, SaaS, and CaaS are concepts presented to solve the problem of resource management. Task-based scheduling can meet the resource management demands of deep learning when the GPU, CPU, memory, and disk are considered the main resources.
The management of resources usually includes task management and device management, as shown in Fig. 4. Task management takes charge of scheduling, while device management is in charge of controlling and organizing the hardware.
Implementation details
In this section, we introduce the technical details of MEDAS so that its design ideas are more reproducible.
User Interfaces
MEDAS is a web-based system, so all interaction is based on web pages. We use vue.js, a front-end framework, to create the web-based interface.
Back-end
The programs that manage the data, tasks, and resources are mainly written in Java with SpringBoot and MyBatis, while the non-core microservices are written in Go with Iris. The programs related to deep learning and medical imaging are implemented in Python, following the relevant papers, with PyTorch, Numpy, SimpleITK, OpenCV, and other software.
Tasks and Resources Management
We set the basic units of task scheduling as containers with resource limitations, and the management assigns the containers to users to execute their programs. The management employs Docker and Kubernetes to manage containers and resources, and the back-end of MEDAS communicates with Kubernetes to allocate containers. Docker containers use the “control group” to establish a sandbox with resource limitations. The number of GPUs is controlled by a set of plans chosen according to the scale of the calculation. Docker and Kubernetes control the device management. For storage, we employ NFS so that containers managed by Kubernetes can store data.
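To make the notion of a container with resource limitations concrete, the following sketch builds a Kubernetes Pod manifest as a Python dictionary. It is an illustration under stated assumptions, not the manifest used by MEDAS: the image name, NFS server, paths, and the function `build_pod_manifest` are hypothetical, and the GPU resource name `nvidia.com/gpu` assumes that the NVIDIA device plugin is installed in the cluster.

```python
import json


def build_pod_manifest(task_id, image, cpus, memory_gib, gpus, nfs_server, nfs_path):
    """Build a Kubernetes Pod manifest (as a dict) with resource limits."""
    limits = {"cpu": str(cpus), "memory": f"{memory_gib}Gi"}
    if gpus > 0:
        # Assumes the NVIDIA device plugin exposes GPUs under this resource name.
        limits["nvidia.com/gpu"] = str(gpus)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"task-{task_id}"},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "worker",
                "image": image,
                "resources": {"limits": limits},
                "volumeMounts": [{"name": "data", "mountPath": "/data"}],
            }],
            # Datasets live on NFS and are mounted into the container.
            "volumes": [{"name": "data", "nfs": {"server": nfs_server, "path": nfs_path}}],
        },
    }


manifest = build_pod_manifest("42", "example/trainer:latest", cpus=4, memory_gib=16,
                              gpus=1, nfs_server="nfs.example.internal", nfs_path="/datasets")
print(json.dumps(manifest, indent=2))
```

The limits in the manifest are what the container runtime ultimately enforces through control groups, so a task cannot consume more CPU or memory than its plan allows.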
Application case studies
In the previous sections, we introduced the tools and systems of MEDAS. In this section, we present case studies on varying themes, all executed on the MEDAS system. Deep learning-based methods are employed throughout. The case studies include pulmonary nodule detection and attribute classification, liver contour segmentation, multi-organ segmentation, Alzheimer’s disease classification, and nuclei segmentation.
To foster comparability and reproducibility, we chose public datasets for these case studies. Each case study introduces the workflow of the model, and the pipeline is implemented either with visualization programming via simple drag and drop or by programming via the Python API. The results of the model are shown in each case study. These case studies are executed with MEDAS via the container.
Case study 1:pulmonary nodule detection and attribute classification
The detection and attribute classification of pulmonary nodules is a common medical image analysis task and is important for lung cancer diagnosis and clinical treatment. In this case study, we employ a neural network based on DeepLung [106] to detect and classify pulmonary nodules. The dataset used to train the neural network model is the LUNA 16 dataset, which is based on the LIDC-IDRI dataset [3].
Workflow
Figure 6 shows the basic workflow of this case study, which includes five parts:
Fig. 6
Workflow and data flow of the pulmonary nodule detection and the attribute classification (case study 1). The workflow includes five parts: input, pre-processing, dataset management, neural network, and visualization. A 3D Mask RCNN is employed to detect, while a 3D Dual-path net is employed for attribute classification
Input The input of the whole workflow includes the CT images of the chest and the annotations. These data are stored in the network attached storage (NAS) and can be mounted to the container when needed.
Pre-processing Pre-processing tools convert the formats of the image and annotation, mask the lung area on CT, and rescale the value of the image to [0, 1].
Dataset management Dataset management splits the dataset into a training set and a testing set to train and evaluate the models.
Neural network We employ 3D Mask RCNN [27] for pulmonary nodule detection, while the 3D Dual-Path Net [12, 106] is used for attribute classification.
Visualization We employ a point cloud-based nodule visualization tool to display the pulmonary nodules detected by the 3D Mask RCNN and a loss visualization tool to show the training loss of the model.
Implementation
The workflow described above can be implemented in MEDAS in a few simple steps by dragging and dropping. We then launch the Docker container, mount the data from NAS, and execute the task.
Result and visualization
We train the 3D Mask RCNN model and the 3D Dual-Path Net with the training set and test them with the testing set. Figure 7 presents the training loss of the 3D Dual-Path Net. The left plot shows the total loss, while the right plot presents the loss for each classifier.
Fig. 7
The total loss and separate classification loss. The left plot shows the training loss (blue line) and testing loss (orange line). The right plot shows the loss of different classifiers
3D Point cloud-based visualization
MRI and CT are dense 3D images. When viewing such 3D images, we can only see one cross-section at a time. We therefore developed a point cloud-based visualization tool to visualize segmentation results in MRI and CT. Figure 8, which is rendered with this tool, shows the result of the 3D Mask RCNN, i.e., the detected pulmonary nodules.
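The core step behind a point cloud-based view is the conversion of a dense 3D mask into a list of coordinates that a renderer can draw. The sketch below shows this step under our own assumptions; the function name `mask_to_point_cloud` and the spacing values are hypothetical, and the actual MEDAS tool may work differently.

```python
import numpy as np


def mask_to_point_cloud(mask, spacing=(1.0, 1.0, 1.0)):
    """Return an (N, 3) array of physical coordinates of the foreground voxels."""
    indices = np.argwhere(np.asarray(mask) > 0)        # (N, 3) voxel indices
    return indices * np.asarray(spacing, dtype=np.float64)


# A tiny synthetic mask with two foreground voxels.
mask = np.zeros((4, 4, 4), dtype=np.uint8)
mask[1, 2, 3] = 1
mask[2, 2, 2] = 1
points = mask_to_point_cloud(mask, spacing=(2.0, 1.0, 1.0))
```

Multiplying by the voxel spacing keeps the proportions of the anatomy correct when the slice thickness differs from the in-plane resolution, which is common in CT.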
Fig. 8
The pulmonary nodules were detected in two subjects. The red marks are the detected pulmonary nodules, while the blue points are the edges of the lung
Case study 2: liver contour segmentation
Liver-related radiographic analysis is also a focus of deep learning-based research. The first step of the analysis is usually the segmentation of the liver contour, so in this case study, we employ VNet [58] to segment liver contours. The public dataset LiTS [7], which is aimed at the detection and segmentation of the liver and tumors, is used to train the model.
As shown in Fig. 9, the workflow of this case study includes six parts:
Fig. 9
Workflow and data flow for case study 2. The workflow includes six parts: input, pre-processing, dataset management, neural network, visualization, and analysis
Input The input part is the source of data.
Pre-processing We employ the pre-processing tool to convert the formats of the images.
Dataset management The dataset is split into a training set and a testing set by a dataset management tool.
Neural network The VNet is employed to segment the liver contours from the images and is trained with the training set. Then, we use the trained model to initialize the prediction tool of the model for testing.
Visualization The training loss is visualized with the loss visualization tool, while the segmentation results are presented with the segmentation visualization tool.
Analysis The prediction and ground truth are analyzed by computing the Dice score.
The algorithm can be implemented by using MEDAS’s visualization programming. However, in this case study, we show the alternative option available to users: programming in MEDAS via the Python API. The setup, execution, and result checking of the training tool serve as an example.
Using the tool takes four steps: initializing instances, setting up the tool, executing the tool, and checking the results. With continuous programming, these four steps can equivalently be chained into a single expression.
The network for liver contour segmentation is trained on the LiTS dataset, and the model obtains a Dice score of 0.92 on the testing set. Figure 10 presents the results of the segmentation task, while Fig. 11a visualizes the training loss.
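The Dice score used in the analysis step measures the overlap between a predicted mask $P$ and the ground truth $T$ as $2|P \cap T| / (|P| + |T|)$. A minimal NumPy sketch is given below; the function name and the small smoothing term `eps` are our own assumptions and not necessarily what the MEDAS analysis tool uses.

```python
import numpy as np


def dice_score(prediction, ground_truth, eps=1e-8):
    """Dice overlap between two binary masks of the same shape."""
    prediction = np.asarray(prediction).astype(bool)
    ground_truth = np.asarray(ground_truth).astype(bool)
    intersection = np.logical_and(prediction, ground_truth).sum()
    # eps avoids a division by zero when both masks are empty.
    return (2.0 * intersection + eps) / (prediction.sum() + ground_truth.sum() + eps)


pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_score(pred, truth)   # 2 * 2 / (3 + 3)
```

A score of 1 means perfect overlap and 0 means no overlap, so the reported 0.92 indicates that most of the liver voxels are shared between prediction and ground truth.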
Fig. 10
The segmentation of the liver of three subjects. The window width and level of the CT images are 400 and 0. The red area belongs to the ground truth but not to the segmentation result; the green area belongs to the segmentation result but not to the ground truth; the yellow area is the area correctly segmented by the model
Fig. 11
Visualization of the training loss for case study 2 (left) and 3 (right)
Case study 3: multi-organ segmentation
Multi-organ segmentation can help machines understand the structure of the human body, which is very important for all the relevant tasks. Therefore, some researchers have focused on single- or multi-organ segmentation tasks, such as the liver [21, 53] and the pancreas [9, 105]. In this case study, we use a VNet-based neural network for the multi-organ segmentation task of the SegTHOR challenge [93]. The SegTHOR challenge focuses on the segmentation of 4 organs at risk: heart, aorta, trachea, and esophagus. The dataset provides about 40 CT images of the chest.
Workflow and implementation
As shown in Fig. 12, the workflow of this case study includes six parts:
Fig. 12
The workflow of multi-organ segmentation (case study 3). The workflow includes the pre-processing of data and annotations, the training, the evaluation, and the visualization
Input The input includes the images and annotations of the chest and is stored in NAS as a dataset.
Pre-processing Pre-processing tools rescale the range of the image values with a window width and a window level, and resample the images to change their size.
Dataset management The dataset management tool splits the dataset into a training and a testing set randomly.
Neural network We employ a VNet-based neural network to segment organs from the chest CT images, and the model is trained and tested with the SegTHOR dataset.
Visualization & analysis The segmented images can be visualized via the segmentation visualization tool, and the result analysis tool analyzes the results and generates a report in the MS-Excel format.
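The rescaling with a window width and a window level mentioned in the pre-processing step can be sketched as follows. The function name `apply_window` is hypothetical, and the width of 400 and level of 0 are borrowed from the liver case study purely for illustration; the values used for this case study are not stated.

```python
import numpy as np


def apply_window(image, width, level):
    """Clip intensities to [level - width/2, level + width/2] and rescale to [0, 1]."""
    low = level - width / 2.0
    high = level + width / 2.0
    clipped = np.clip(np.asarray(image, dtype=np.float64), low, high)
    return (clipped - low) / (high - low)


ct = np.array([-1000.0, -200.0, 0.0, 200.0, 1000.0])   # Hounsfield units
windowed = apply_window(ct, width=400, level=0)
```

Everything below the window becomes 0 and everything above becomes 1, so the network sees the full dynamic range spent on the soft-tissue intensities of interest.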
Task management
After the user sets up and submits the task, MEDAS prepares and launches a Docker container to execute the user’s task. First, the scheduler of MEDAS checks the resource limits of the user and the system. A task will be executed only if the required computing resources are ready and the resources currently used by the user have not reached the limit of the user’s account. Then, MEDAS packages the code and mounts the code archive and datasets to the container. Finally, the scheduler of MEDAS allocates the computing resources required by the user, such as GPU, CPU, memory, and storage, and launches the Docker container.
When there is more than one user and one GPU (or other computing resource) in the system, the scheduling strategy becomes complex. MEDAS will reject a task if it requires interaction, but will queue it otherwise. For hyper-parameter searching, new tasks are queued only after the old ones finish. Hyper-parameter searching tasks are rejected only after all parameters have been reached or the number of executions has hit its limit.
Figure 13 shows the obtained visualization results, and Fig. 11b shows the training loss.
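The admission rule described under task management can be summarized in a small sketch. It is a simplified illustration, assuming that GPUs are the only scheduled resource and that interactive tasks are rejected only when they cannot start immediately; the `Task` and `Scheduler` classes are hypothetical and are not the MEDAS scheduler.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    user: str
    gpus: int
    interactive: bool = False


class Scheduler:
    """Run a task if resources and quota allow; otherwise reject or queue it."""

    def __init__(self, total_gpus, per_user_limit):
        self.free_gpus = total_gpus
        self.per_user_limit = per_user_limit
        self.in_use = {}                 # user -> GPUs currently held
        self.queue = deque()

    def submit(self, task):
        used = self.in_use.get(task.user, 0)
        within_quota = used + task.gpus <= self.per_user_limit
        if within_quota and task.gpus <= self.free_gpus:
            self.free_gpus -= task.gpus
            self.in_use[task.user] = used + task.gpus
            return "running"
        if task.interactive:
            return "rejected"            # interactive tasks cannot wait
        self.queue.append(task)
        return "queued"


scheduler = Scheduler(total_gpus=2, per_user_limit=2)
states = [
    scheduler.submit(Task("alice", gpus=2)),
    scheduler.submit(Task("bob", gpus=1, interactive=True)),
    scheduler.submit(Task("bob", gpus=1)),
]
```

Here the first task takes both GPUs, the interactive task is rejected because it cannot start at once, and the batch task waits in the queue until resources are released.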
Fig. 13
Visualization of case study 3. The green area is the esophagus; the red area is the heart; the blue area is the aorta; the orange area is the trachea
Case study 4: Alzheimer’s disease classification
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that impairs memory and cognition [59]. To date, there is no approach to cure the disease or even significantly slow down its progression, but there are methods to tell the difference between AD and normal control (NC) subjects, e.g., [39-41]. In this section, we employ U-Net [76] and modify it for classification tasks, for example, AD versus NC.
In this case study, all the subjects are selected from a public AD dataset, the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [61]. We select scans of AD and NC subjects to train a classifier.
As shown in Fig. 14, the workflow of this case study includes five parts:
Fig. 15
The heat maps generated by the tool in MEDAS with block-based and contour-based occlusions. The first two rows show the analysis with the color space split into different ranges. The third row shows the analysis results with occlusion. The fourth row shows the activation heat map. The last row depicts the sensitivity analysis result
Input The input loads the data from the dataset.
Pre-processing The pre-processing tool generates two images from one original image by selecting two voxels from a box region that includes 8 voxels.
Dataset management The dataset management tool splits the dataset into a training set and a testing set.
Neural network We employ a UNet-based neural network for the classification task to distinguish AD from NC scans.
Visualization & analysis The sensitivity analysis tool helps to identify what is relevant for the neural network by generating a heat map that shows how the neural network behaves when a patch of the image is occluded.
Fig. 14
The workflow of Alzheimer’s disease classification (case study 4). The workflow includes the pre-processing of data and annotations, the training, the evaluation, and the visualization
Result
We trained the model on MEDAS with the default parameters. The average accuracy of the classification task on the testing set is 0.95.
Interpretable visualization
Generally, deep learning is considered a black box. It is difficult for researchers to understand what the neural network has learned and why the algorithm works so well. With traditional algorithms, researchers can build models from clear reasons and targets, but with deep learning, only a general target is selected and gradient descent is left to optimize the model. A neural network for a complex task might include millions of parameters, which are hard to optimize and whose individual effects are difficult to determine.
MEDAS employs many tools to help researchers analyze and visualize their models and results. In Fig. 15, we employ three methods to analyze and visualize the attention of our network. Such tools can easily be used for similar tasks to generate heat map-based interpretable images. Block-based and contour-based occlusions are employed to interpret our model.
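Block-based occlusion can be illustrated with a short sketch: each block of the input is replaced by a constant, and the drop in the model output is recorded as the importance of that block. The code below works on a 2D array with a toy scoring function standing in for a trained classifier; the function names and the toy model are assumptions, not the MEDAS sensitivity analysis tool.

```python
import numpy as np


def occlusion_heat_map(image, predict, block=2, fill=0.0):
    """Score drop when each block of a 2-D image is replaced by `fill`."""
    image = np.asarray(image, dtype=np.float64)
    baseline = predict(image)
    heat = np.zeros_like(image)
    for row in range(0, image.shape[0], block):
        for col in range(0, image.shape[1], block):
            occluded = image.copy()
            occluded[row:row + block, col:col + block] = fill
            # A large drop means the block mattered for the prediction.
            heat[row:row + block, col:col + block] = baseline - predict(occluded)
    return heat


def toy_predict(image):
    # Stand-in for a trained classifier: only the top-left 2x2 block matters.
    return float(image[:2, :2].mean())


heat_map = occlusion_heat_map(np.ones((4, 4)), toy_predict, block=2)
```

For volumetric scans the same loop runs over 3D blocks, and contour-based occlusion replaces the regular blocks with anatomically defined regions.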
Case Study 5: Nuclei segmentation
Nuclei segmentation is one of the basic tasks in pathology image analysis, whether based on traditional [69] or deep learning-based methods [63, 86, 92]. The diagnosis of pathology images is based on many terms representing objects, such as nuclei, cells, and glands. Researchers extract features from these objects and use them in further diagnosis. For example, the mitosis analysis task is based on nuclei segmentation or detection. We use a U-Net-based model [76] to segment the nuclei on the MoNuSeg dataset [44].
As shown in Fig. 16, the workflow includes six parts:
Fig. 16
The workflow of nuclei segmentation (case study 5). The workflow includes the pre-processing of data and annotations, the training, the evaluation, and the visualization
Input The input loads the data from the dataset.
Pre-processing The pre-processing tools convert formats and normalize the stain of the pathology image.
Dataset management The dataset management tool splits the dataset into two sets; the neural network uses the training set to train the model and the testing set to validate it.
Neural network We employ UNet-like neural networks, including FCN, UNet, ResUNet, and DPUNet. A hyper-parameter controls which model is used.
Post-processing The post-processing tool handles the results of the segmentation. We employ the binary normalization tool to improve the segmentation results.
Visualization The visualization tool presents the final results to the user.
Once the general design of the workflow has been drafted, the user can drop selected tools into the editor and connect them according to the data and control flow to implement the workflow. Then, the data is uploaded to the platform from a locally hosted or online storage system, which is connected with the annotation systems. Finally, the task is launched on MEDAS with the given workflow, and the model is trained. Subsequently, the results and intermediate data are stored in the system.
Hyper-parameter optimization
This case study is an example of hyper-parameter optimization. Selected hyper-parameters of the neural network were picked for optimization.
The optimized hyper-parameters include the maximum epoch of training, the learning rate, the criterion function, and the model. The hyper-parameter “max epoch” is chosen within the range from 64 to 256, while the learning rate is searched from 0.0001 to 0.01. The criterion can be selected from dice loss (dice), binary cross-entropy (bce), and Lovász loss (lovasz), while the model can be selected from FCN, UNet, ResUNet, and DPUNet. “Mean AJI” is selected as the optimization objective.
We performed 100 iterations of search with the Bayesian optimization algorithm. The best result of the hyper-parameter optimization and the top five results of manual optimization are shown in Table 1 and Table 2, respectively. Further, Fig. 17 shows the relationship between the hyper-parameters and the metric “mean AJI”. Most of the combinations with DPUNet as the model and dice as the criterion function show better performance, i.e., a higher “mean AJI” score, and the scores of these combinations lie between 0.5925 and 0.6075. As shown in Fig. 17, the number of training epochs does not have a remarkable effect on the metric compared with the criterion function and the model. Further, a smaller learning rate generally proves to be the better choice.
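The mixed discrete and continuous search space of this case study can be written down explicitly, as in the sketch below. The dictionary layout and the `sample` function are hypothetical, and uniform random sampling is used here only as a stand-in for the proposal step; in MEDAS the candidates are proposed by the Bayesian optimization algorithm.

```python
import random

# Search space of case study 5: ranges and choices as stated in the text.
SEARCH_SPACE = {
    "max_epoch": ("int", 64, 256),
    "learning_rate": ("float", 1e-4, 1e-2),
    "criterion": ("choice", ["dice", "bce", "lovasz"]),
    "model": ("choice", ["FCN", "UNet", "ResUNet", "DPUNet"]),
}


def sample(space, rng):
    """Draw one configuration uniformly at random from the search space."""
    config = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "int":
            config[name] = rng.randint(spec[1], spec[2])
        elif kind == "float":
            config[name] = rng.uniform(spec[1], spec[2])
        else:
            config[name] = rng.choice(spec[1])
    return config


config = sample(SEARCH_SPACE, random.Random(0))
```

Each sampled configuration corresponds to one training run whose “mean AJI” is fed back to the optimizer as the observed score.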
Table 1
The best results of the optimization. The max epoch, criterion, learning rate, number of training epochs, and model are selected as parameters
| Epoch | Criterion | Learning rate | Model | Mean AJI |
| --- | --- | --- | --- | --- |
| 172 | Dice | 4.081e-3 | DPUNet | 0.6073 |
Table 2
The top-five results of manual optimization with different parameters
| Epoch | Criterion | Learning rate | Model | Mean AJI |
| --- | --- | --- | --- | --- |
| 200 | Dice | 0.5e-3 | ResUNet | 0.5855 |
| 200 | Dice | 0.5e-3 | DPUNet | 0.5854 |
| 256 | Lovasz | 0.25e-3 | ResUNet | 0.5832 |
| 128 | Bce | 1.0e-3 | DPUNet | 0.5828 |
| 500 | Lovasz | 1.0e-3 | FCN | 0.5821 |
Fig. 17
The visualization of hyper-parameters via the parallel coordinates. The top one shows all the hyper-parameters. The color of the lines is related to the metric AJI: the higher, the brighter. The bottom one shows the hyper-parameters, whose metric AJI ranges between 0.5925 and 0.6075
Table 2 shows the manual optimization result as a comparison. When we try to optimize these hyper-parameters manually, we usually face several problems.
The most important one is how to optimize the parameters, as it is difficult to find an analytical solution. As outlined, MEDAS employs the Bayesian optimization algorithm to find optimal hyper-parameters.
The second problem is time. Manual optimization needs a lot of time. After we launch a task, we need to wait for it to finish before testing another set of parameters. Moreover, we cannot launch the next task immediately after the latest one has finished, because we cannot estimate exactly when a task will finish.
The third problem is resources. Manual optimization usually needs more resources to reach a good result, since it tends to be slower and less efficient.
The best result of the hyper-parameter optimization is chosen as the final result. The DPUNet network is used as the model, and the dice loss is selected as the criterion function. The model is trained for 172 epochs, and the learning rate is 4.081e-3. The mean AJI score reaches 0.6073. The segmentation of the nuclei is shown in Fig. 18, while the AJI score of different organs is shown in Table 3.
Fig. 18
Result of case study 5. Each column shows the results of four different organs. The top row is the original image with pre-processing. The middle one is the segmentation with post-processing by binary normalization, while the bottom row is the ground truth
Table 3
The AJI metric of the validation set for different organs
| Organ | AJI |
| --- | --- |
| Breast | 0.6517 |
| Liver | 0.5310 |
| Bladder | 0.6543 |
| Colon | 0.5424 |
| Prostate | 0.6147 |
| Stomach | 0.6437 |
| Kidney | 0.6135 |
| Mean | 0.6073 |
Discussion
MEDAS
Deep learning-based medical image analysis is an interdisciplinary task that combines knowledge from computer science and medicine. On the one hand, medical researchers treat deep learning merely as an approach applied in medical image analysis, because they usually do not know much about it and because the wall between computer science and medicine stands in the way. On the other hand, computer researchers should focus on algorithms and models, but in fact most of them spend part of their time on programming, fine-tuning, and other mechanical and repetitive tasks that should not take up so much of their time.
Targeting the problems above, we implement MEDAS around the idea of rapid implementation and verification. MEDAS provides a set of tools for medical image analysis, and with wrapping and reusing, researchers can simply and rapidly implement their algorithms without wasting their time on mechanical, repetitive tasks. MEDAS also provides a platform including visualization programming, hyper-parameter optimization, resource management, and other components that further simplify implementation and verification.
However, MEDAS cannot solve all the problems in the process of applying deep learning to medical image analysis. MEDAS can remove the barriers that stop medical researchers from applying deep learning in their research and simplify the implementation of algorithms for computer researchers, but it cannot remove all barriers between medical and computer knowledge. To reduce such knowledge asymmetry, we plan to create an application, named “Knowledge Base”, that lets researchers share their knowledge.
Outlook
The combination of deep learning and medical image analysis will remain an active topic in the next few years, and a key problem at present is to break down the wall between medicine, deep learning, and computer science. Accessible technologies and methods, such as MEDAS, will help progress in this area.
Automatic DL in medical informatics
Automatic DL can help researchers design models and search for the best hyper-parameters automatically. Neural architecture search and hyper-parameter optimization are problems that every deep learning researcher faces, because neither the choice of hyper-parameters nor the design of a neural network follows specific rules. The rules for designing a neural network cannot be expressed as a formula or another mathematical form that can be optimized directly. Skillful design of a neural network is therefore time-consuming, difficult, and dependent on expertise.
Fortunately, it is now possible for medical researchers to input their data into a system that builds and optimizes models automatically and then to retrieve the best model, hyper-parameters, and results. Algorithms that search for the best neural network architecture have been proposed [23, 68]. Neural architecture search and hyper-parameter optimization can help overcome the difficulty of manual neural network design and refinement.
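As a minimal illustration of hyper-parameter optimization, the following sketch runs a plain random search over a small discrete space. It is not the search used in MEDAS or in [23, 68]; the function names, the search space, and the toy objective (a stand-in for training a model and returning a validation score) are assumptions made for illustration.

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Sample configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(n_trials):
        # Draw one value per hyper-parameter from its candidate list.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config


def toy_objective(config):
    """Hypothetical stand-in for 'train a model, return a validation score'."""
    return -abs(config["lr"] - 1e-3) * 100 - abs(config["batch_size"] - 16) / 100


# Hypothetical search space: higher scores are better.
space = {"lr": [1e-4, 1e-3, 1e-2], "batch_size": [8, 16, 32]}
best_score, best_config = random_search(toy_objective, space, n_trials=50)
```

In a real setting, the objective would train and validate a network, so each trial is expensive; this is why more sample-efficient strategies than random search are often preferred.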
Knowledge
Medical knowledge can help researchers understand what the machine has learned and explain it at the medical and clinical level. Interpretability analysis of deep learning can also reveal features that humans have overlooked.
Surveys, papers, and even blog posts concerning medicine and deep learning can be collected into a knowledge base. Medical researchers can then quickly find the deep learning knowledge used in their research, while deep learning researchers can rapidly retrieve medical knowledge. With MEDAS and such a knowledge base, both deep learning and medical researchers can accelerate their research.
Lacking data
One difference between general computer vision and medical image analysis in deep learning is that the latter usually lacks data. First, most datasets are small. Compared with many other computer vision datasets, such as ImageNet [17], most medical datasets include only tens or hundreds of subjects. Second, each group or laboratory might have its own private datasets, but these are also mostly small. Such small, isolated datasets are difficult to use alone but become useful when combined.
Federated learning and decentralized learning
Federated learning [8, 42, 82] and other decentralized learning methods can share what machines have learned without sharing the data. Based on platforms such as MEDAS and on decentralized learning such as federated learning, researchers from different institutions can collaborate efficiently.
Few-shot learning
Few-shot learning addresses a critical weak spot in medical image analysis: the lack of data, which is one of the big challenges when researchers apply deep learning to medical images. Because humans can learn quickly from a few examples, many researchers focus on few-shot learning for medical image analysis. One example is the work of Rezaei et al. [75], which reviews zero-shot learning in applications ranging from autonomous vehicles to COVID-19 diagnosis.
Active learning
Active learning is another way to address the lack of data, by reducing the cost of annotation. An active learning method learns from a small set of training data at the beginning and then obtains labels for unlabeled data by interacting with experts.
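The selection step of active learning can be sketched with entropy-based uncertainty sampling, one common query strategy: the samples on which the current model is least certain are sent to the expert first. This is a minimal sketch under assumptions, not the annotation workflow of MEDAS; the function names and the predicted probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, k):
    """Return the indices of the k most uncertain samples (highest entropy)."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs for four unlabeled images (two classes).
predictions = [[0.98, 0.02], [0.55, 0.45], [0.80, 0.20], [0.50, 0.50]]
queried = select_for_annotation(predictions, k=2)  # indices 3 and 1
```

After the expert labels the queried samples, they are added to the training set, the model is retrained, and the selection is repeated on the remaining unlabeled data.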
Platform in software engineering
Besides the development of algorithms, topics related to software engineering in medical image analysis are also important. There are two such topics in which MEDAS needs to improve in the future.
Link to PACS/RIS
Hospitals and medical centers usually have their own PACS or RIS. Compared with typical data access methods, direct access to PACS and RIS can give researchers access to more data and, at the same time, make it easier to apply AI-based medical image analysis algorithms in the clinical environment. However, direct access risks privacy disclosure, and the strict privacy and security policies that guard against it impede access to data. Therefore, designing a strategy for accessing PACS and RIS is a future improvement for MEDAS and other platforms.
AI Models Evaluation
AI-based medical image analysis is typically evaluated with metrics such as accuracy. However, these metrics cannot tell users whether a model is safe or dependable, which matters especially when a platform “markets” the model to users. Users might care about whether the model can be easily attacked by changing a single pixel or whether it can be easily trained for their new tasks. Therefore, it is essential for MEDAS and other platforms to investigate how to evaluate a model’s safety, usability, and performance.
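The single-pixel concern above can be probed with a simple check that goes beyond accuracy. The sketch below exhaustively sets each pixel to an extreme value and records the edits that change the predicted label. It is a brute-force probe written for illustration, not the optimization-based one-pixel attack from the literature and not a MEDAS feature; the classifier and image are hypothetical toys.

```python
def one_pixel_flips(predict, image, values=(0.0, 1.0)):
    """Return (row, col, value) edits that change the predicted label.

    The image is modified in place during the search and restored afterwards.
    """
    original = predict(image)
    flips = []
    for r, row in enumerate(image):
        for c, old in enumerate(row):
            for v in values:
                if v == old:
                    continue
                row[c] = v  # perturb exactly one pixel
                if predict(image) != original:
                    flips.append((r, c, v))
            row[c] = old  # restore the pixel before moving on
    return flips


def toy_predict(image):
    """Hypothetical classifier: label 1 if the mean intensity exceeds 0.5."""
    pixels = [p for row in image for p in row]
    return int(sum(pixels) / len(pixels) > 0.5)


image = [[0.4, 0.5], [0.5, 0.5]]
flips = one_pixel_flips(toy_predict, image)
```

A model for which many single-pixel edits flip the label, as with this toy classifier, would be reported as fragile even if its accuracy on clean data were high.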
Plan
MEDAS is still at an early stage, and most of our users are researchers from our partner institutions. In other words, MEDAS is currently self-sufficient. MEDAS does not yet meet all the needs of medical image analysis, and the current focus is primarily on detection, classification, and segmentation tasks for MRI, CT, and pathology images.
The development plan depends on suggestions from the community. Our platform can be used to assist medics and to learn from medics with the help of technologies such as federated learning, active learning, and life-long learning. Based on meta-learning and active learning, MEDAS can reduce the annotation workload. We therefore plan to integrate more useful tools to meet most needs of the community, and the long-term plan currently includes an annotation tool with active learning and other algorithms to reduce the labeling workload.
Summary
In this work, we introduced our platform, named MEDAS, which makes the application of deep learning in medical image analysis more user-friendly, easier, and hence more accessible. We designed the pipeline and user interface based on our experience of development and analysis. The pipeline includes pre-processing, post-processing, augmentation, neural network, and visualization & debugging modules. We also performed several case studies to demonstrate the efficient operation of MEDAS.