Thitirat Siriborvornratanakul
Abstract
Introduction: The emergence of automated machine learning, or AutoML, has raised an interesting trend of no-code and low-code machine learning, in which most tasks in the machine learning pipeline can potentially be automated without support from human data scientists. While it sounds reasonable to leave repetitive trial-and-error tasks, such as designing complex network architectures and tuning many hyperparameters, to AutoML, leading research that uses AutoML is still scarce. The overall purpose of this case study is therefore to investigate the gap between current AutoML frameworks and practical machine learning development.
Case description: First, this paper confirms the increasing trend of AutoML via an indirect indicator: the numbers of search results in Google Trends, IEEE Xplore, and ACM Digital Library during 2012-2021. Then, the three most popular AutoML frameworks (i.e., Auto-Sklearn, AutoKeras, and Google Cloud AutoML) are inspected as AutoML's representatives; the inspection includes six comparative aspects. Based on the features available in the three AutoML frameworks investigated, our case study then surveys recent machine learning research in image-based machine learning. This field was chosen because computer vision spans several levels of machine learning, from basic to advanced, and has lately been one of the most popular fields for studying machine learning and artificial intelligence. Our study is specific to the context of image-based road health inspection systems, as this topic has a long history in computer vision, allowing us to observe solution transitions from past to present.
Discussion and evaluation: After confirming the rising numbers of AutoML search results in the three search engines, our study of the three AutoML representatives further reveals that many of their features can be used to automate the development pipeline of image-based road health inspection systems. Nevertheless, we find that recent works in image-based road health inspection have not used any form of AutoML. Digging into these recent works, we identify two main problems that best explain why most researchers do not yet use AutoML in their image-based road health inspection systems. First, AutoML's trial-and-error decisions involve much extra computation compared to human-guided decisions. Second, using AutoML adds another layer of non-interpretability to a model. As these two problems are major pain points in modern neural networks and deep learning, they may require years to resolve, delaying the mass adoption of AutoML in image-based road health inspection systems.
Conclusions: Although AutoML is not yet in mainstream use, we believe that the trend of AutoML will continue to grow. This is because demand for AutoML already exists, and demand for no-code or low-code machine learning development alternatives will grow together with the expansion of machine learning solutions. Nevertheless, this case study focuses on selected papers whose authors are researchers who can publish their works in academic conferences and journals. In the future, the study should be extended to observe novice users, non-programmer users, and machine learning practitioners in order to discover more insights from non-research perspectives.
Entities:
Keywords: Artificial Intelligence; AutoML; Automated Machine Learning; Human Behavior; Machine Learning; Road Health Inspection
Year: 2022 PMID: 35879937 PMCID: PMC9299412 DOI: 10.1186/s40537-022-00646-8
Source DB: PubMed Journal: J Big Data ISSN: 2196-1115
Fig. 1 The rising trend of AutoML in Google Trends (worldwide), IEEE Xplore, and ACM Digital Library from 2012 to 2021. The vertical axis represents the number of search results for the “AutoML” search keyword in each year. Note that the values plotted in this figure were manually collected from each search engine on December 15, 2021, without any coding or scraping software involved.
The ten most popular computer science sub-categories, ranked by the number of arXiv search results (first column) from highest to lowest
| N | Category code | Category detail |
|---|---|---|
| 108,709 | cs.LG | Machine learning |
| 74,677 | cs.CV | Computer vision and pattern recognition |
| 46,390 | cs.AI | Artificial intelligence |
| 38,974 | cs.IT | Information theory |
| 35,122 | cs.CL | Computation and language |
| 22,034 | cs.CR | Cryptography and security |
| 19,403 | cs.DS | Data structures and algorithms |
| 18,573 | cs.RO | Robotics |
| 17,775 | cs.NI | Networking and internet architecture |
| 16,121 | cs.DC | Distributed, parallel, and cluster computing |
Note that the numbers shown in this table were manually collected by searching each category code (the second column) in https://arxiv.org/search/ on May 9, 2022, without any coding or scraping software involved.
Fig. 2 The conceptual framework
Fig. 3 Our decision diagram regarding the three AutoML frameworks discussed in “AutoML’s background” section
Fig. 4 An example of a road scanning vehicle. This image was retrieved on May 15, 2022 from https://www.roadscanners.com/products/road-clinic-rdsv/full-rdsv-system-road-data-collection/