Manzura Jorayeva, Akhan Akbulut, Cagatay Catal, Alok Mishra.
Abstract
Software defect prediction studies aim to identify defect-prone components before the testing stage of the software development process. The main benefit of these prediction models is that testing resources can be allocated to fault-prone modules more effectively. While a few software defect prediction models have been developed for mobile applications, a systematic overview of these studies is still missing. Therefore, we carried out a Systematic Literature Review (SLR) to evaluate how machine learning has been applied to predict faults in mobile applications. This study defined nine research questions, and 47 relevant studies were selected from scientific databases to answer them. Results show that most studies focused on Android applications (48%), supervised machine learning was applied in most studies (92%), and object-oriented metrics were mainly preferred. The five most frequently used machine learning algorithms are Naïve Bayes, Support Vector Machines, Logistic Regression, Artificial Neural Networks, and Decision Trees. Only a few studies applied deep learning algorithms, including Long Short-Term Memory (LSTM), Deep Belief Networks (DBN), and Deep Neural Networks (DNN). This is the first study to systematically review software defect prediction research focused on mobile applications. It will pave the way for further research in mobile software fault prediction and help both researchers and practitioners in this field.
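As an illustration of the supervised setup the abstract describes, the sketch below trains a Gaussian Naïve Bayes classifier (the most frequently used algorithm in the reviewed studies) on synthetic object-oriented metrics. This is not the setup of any specific reviewed paper: the metric names (WMC, CBO, RFC, LCOM), the data, and the labeling rule are invented for demonstration.

```python
# Minimal sketch: supervised defect prediction with Gaussian Naive Bayes
# on synthetic object-oriented metrics. All data here is toy data.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 200
# Columns stand in for WMC, CBO, RFC, LCOM (illustrative OO metrics)
X = rng.normal(loc=[10, 5, 20, 3], scale=[3, 2, 5, 1], size=(n, 4))
# Toy labeling rule: modules with high complexity + coupling are defect-prone
y = (X[:, 0] + X[:, 1] > 16).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```

In practice the features would come from static-analysis tools computing OO metrics on the application's source code, with labels derived from bug-tracking data.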
Keywords: deep learning; machine learning; mobile application; review; software defect prediction; software fault prediction; systematic literature review
Year: 2022 PMID: 35408166 PMCID: PMC9003321 DOI: 10.3390/s22072551
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Systematic Literature Review process.
Research questions addressed in this review.
| RQ | Research Questions |
|---|---|
| RQ1 | Which platforms are addressed in mobile defect prediction? |
| RQ2 | Which datasets are used in mobile defect prediction studies? |
| RQ3 | Which machine learning types are used in mobile defect prediction studies? |
| RQ4 | Which machine learning algorithms are applied in mobile defect prediction? |
| RQ5 | Which evaluation metrics are used in mobile defect prediction? |
| RQ6 | Which validation approaches were used in mobile defect prediction? |
| RQ7 | Which software metrics were adopted in mobile defect prediction? |
| RQ8 | Which ML algorithm works best for mobile defect prediction? |
| RQ9 | What are the challenges and research gaps in mobile defect prediction? |
Exclusion criteria [32].
| ID | Exclusion Criteria |
|---|---|
| 1. | The paper includes only an abstract (this criterion does not concern accessibility; both open-access and subscription-based papers were included) |
| 2. | The paper is not written in English |
| 3. | The article is not a primary study paper |
| 4. | The content does not provide any experimental results |
| 5. | The study does not describe in detail how machine learning is applied |
Figure 2. Distribution of the selected papers.
Quality evaluation questions. “Yes” scores 2; “partial” scores 1; “no” scores 0.
| ID | Questions |
|---|---|
| Q1 | Are the aims of the study clearly declared? |
| Q2 | Are the scope and context of the study clearly defined? |
| Q3 | Is the proposed solution clearly explained and validated by an empirical study? |
| Q4 | Are the variables used in the study likely to be valid and reliable? |
| Q5 | Is the research process documented adequately? |
| Q6 | Are all study questions answered? |
| Q7 | Are the negative findings presented? |
| Q8 | Are the main findings stated clearly in terms of credibility, validity, and reliability? |
Figure 3. Quality score distribution of selected papers (x-axis: paper score; y-axis: number of papers).
Figure 4. Selected publications per year.
Figure 5. Distribution of publication types.
Platforms.
| Platforms | Total |
|---|---|
| Android | 21 |
| Windows Phone | 1 |
| Web Applications | 20 |
| Mobile Applications | 5 |
Repositories.
| Repositories | Datasets | Web Address |
|---|---|---|
| GIT Repository, Sharejar, and Source Forge | Contact, MMS, Bluetooth, Email, Calendar, Gallery2, and Telephony | |
| GitHub Repository | Bootstrap, Avaya Communicator, The K-9 Mail client, Space Blaster game, K-9 issue report | |
| Mobile Applications | Connectbot, Boardgame Geek, AnkiDroid, Android Wallpaper, Quiksearchbox, … | |
Figure 6. Repositories.
Figure 7. Distribution of machine learning types.
Figure 8. Machine learning algorithms.
Figure 9. Evaluation metrics.
Figure 10. Distribution of validation approaches.
Figure 11. Distribution of metric types.
Figure 12. Machine learning algorithm performance.
Figure 13. Distribution of deep learning algorithms.
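Figures 9 and 10 above summarize the evaluation metrics and validation approaches used across the reviewed studies. A typical pairing in defect prediction research is k-fold cross-validation with precision, recall, and F1; the sketch below shows that setup on invented data with a stand-in Logistic Regression model, not any particular study's pipeline.

```python
# Illustrative only: 10-fold cross-validation reporting precision, recall,
# and F1 for a toy defect-prediction classifier on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] - X[:, 1] > 0).astype(int)  # toy defect labels

scores = cross_validate(LogisticRegression(), X, y, cv=10,
                        scoring=("precision", "recall", "f1"))
for m in ("precision", "recall", "f1"):
    print(f"{m}: {scores['test_' + m].mean():.2f}")
```

Reporting precision and recall separately matters here because accuracy alone is misleading when defect-prone modules are the minority class.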
Challenges and possible solutions.
| Challenges | Proposed Solutions |
|---|---|
| Metric selection limitations for mobile software | Use alternative code and process metrics |
| Faults in Android data | Remove faults |
| Limited mobile app repositories | Use public repositories |
| Repeated data/code in the project | Domain adaptation |
| Small dataset problem | Not mentioned |
| Different programming languages | Defect prediction only on open-source GIT projects (Android, Java, and C++) |
| Modeling problem | Not mentioned |
| Different platforms and languages | Not mentioned |
| Extensive datasets | Not mentioned |
| Not fully automated | Manual code, log, bug, and review control |
| Class imbalance problem | Sampling methods (e.g., undersampling) |
| Manual feature engineering | Not mentioned |
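For the class-imbalance challenge listed above (defect-prone modules are usually the minority class), one of the proposed remedies is undersampling. A minimal sketch of random undersampling on toy data, using plain NumPy rather than any specific library the reviewed studies used:

```python
# Sketch of random undersampling: drop majority-class rows until both
# classes are equally sized. Toy data; function name is our own.
import numpy as np

def random_undersample(X, y, seed=0):
    """Return a class-balanced subsample of (X, y)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    return X[keep], y[keep]

X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)      # 8 clean vs. 2 defective modules
Xb, yb = random_undersample(X, y)
print(np.bincount(yb))               # balanced: [2 2]
```

The trade-off is that undersampling discards data; oversampling the minority class is the complementary option mentioned in the sampling-methods family.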