Manzura Jorayeva, Akhan Akbulut, Cagatay Catal, Alok Mishra.
Abstract
Smartphones have enabled the widespread use of mobile applications. However, undetected defects in mobile applications can harm businesses by degrading the user experience. To avoid this, defects should be detected and removed before release. This study aims to develop a defect prediction model for mobile applications. We performed cross-project and within-project experiments and used deep learning algorithms, such as convolutional neural networks (CNN) and long short-term memory (LSTM), to develop a defect prediction model for Android-based applications. In our within-project experiments, the CNN-based model provided the best performance for mobile application defect prediction, with an average area under the ROC curve (AUC) of 0.933. For cross-project mobile application defect prediction, there is still room for improvement when deep learning algorithms are used.
Keywords: Android applications; deep learning; machine learning; mobile application; software defect prediction; software fault prediction
Year: 2022 PMID: 35808230 PMCID: PMC9268998 DOI: 10.3390/s22134734
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Methodology of SLR.
Datasets.
| Datasets | Repository | Lines | Downloads |
|---|---|---|---|
| Afwall | | 1025 | 500,000 |
| Alfresco | | 1004 | 50,000 |
| androidSync | | 209 | 100,000 |
| androidWallpaper | | 588 | 5,000,000 |
| anySoftKeyboard | | 2971 | 25,271 |
| Apg | | 3780 | N/A |
| atmosphere | | 5474 | 1,000,000 |
| chatSecure | | 2579 | N/A |
| | | 548 | 5,000,000,000 |
| flutter | | 10,405 | 100,000 |
| kiwis | | 1373 | 1,000,000 |
| owncloudandroid | | 3700 | 100,000 |
| Pageturner | | 164 | 50,000 |
| | | 222 | 50,000,000 |
Figure 2. Dataset class distribution.
Figure 3. Correlation between features.
Figure 4. Data balancing after SMOTE.
Figure 5. Artificial neural networks.
Figure 6. Convolutional neural networks.
Figure 7. Long short-term memory cells.
Figure 8. Long short-term memory architecture.
Figure 9. Long short-term memory model: (a) accuracy and (b) loss.
Figure 10. Confusion matrix example.
Figure 11. Confusion matrix of the ANN model.
Figure 12. Confusion matrix of the CNN model.
Figure 13. Confusion matrix of the LSTM model.
Figure 14. ROC curve and AUC.
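The confusion matrices and ROC curves in the figures above underlie the accuracy, precision, recall, F1, and AUC values reported in the result tables. The following is a minimal sketch of how these metrics are commonly derived for a binary defect classifier. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions, not taken from the study; AUC is computed here through its rank interpretation (the probability that a randomly chosen defective module scores higher than a randomly chosen clean one), which may differ from the tooling the authors used.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 means 'defective'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(classification_metrics(*confusion_counts(y_true, y_pred)))
print(roc_auc(y_true, scores))
```

Unlike the other four metrics, AUC does not depend on the decision threshold, because it is computed from the ranking of the scores alone.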
ANN defect prediction results (within-project analysis).
| Datasets | Accuracy (%) | Precision | Recall | F1 Score | AUC |
|---|---|---|---|---|---|
| Afwall | 65 | 0.73 | 0.69 | 0.80 | 0.91 |
| Alfresco | 68 | 0.74 | 0.70 | 0.77 | 0.92 |
| androidSync | 70 | 0.70 | 0.70 | 0.83 | 0.96 |
| androidWallpaper | 84 | 0.71 | 0.84 | 0.77 | 0.94 |
| anySoftKeyboard | 75 | 0.93 | 0.80 | 0.79 | 0.90 |
| Apg | 70 | 0.76 | 0.83 | 0.75 | 0.89 |
| atmosphere | 71 | 0.73 | 0.72 | 0.81 | 0.93 |
| chatSecure | 69 | 0.82 | 0.68 | 0.73 | 0.91 |
| | 72 | 0.74 | 0.70 | 0.70 | 0.84 |
| flutter | 70 | 0.80 | 0.76 | 0.72 | 0.90 |
| kiwis | 67 | 0.71 | 0.90 | 0.69 | 0.93 |
| owncloudandroid | 70 | 0.72 | 0.80 | 0.71 | 0.96 |
| Pageturner | 68 | 0.73 | 0.76 | 0.73 | 0.92 |
| | 72 | 0.70 | 0.71 | 0.77 | 0.90 |
| Average | 70.79 | 0.75 | 0.76 | 0.755 | 0.915 |
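The Average row is the unweighted mean over the 14 datasets, so each project counts equally regardless of its size. As a small worked check, the sketch below recomputes two of the ANN averages from the per-dataset values in the table; the variable and function names are illustrative assumptions.

```python
from statistics import mean
from typing import Sequence

# Accuracy (%) and AUC columns copied from the ANN within-project table,
# in row order (14 datasets).
ann_accuracy = [65, 68, 70, 84, 75, 70, 71, 69, 72, 70, 67, 70, 68, 72]
ann_auc = [0.91, 0.92, 0.96, 0.94, 0.90, 0.89, 0.93,
           0.91, 0.84, 0.90, 0.93, 0.96, 0.92, 0.90]


def column_average(values: Sequence[float], digits: int) -> float:
    """Unweighted mean of one table column, rounded for reporting."""
    return round(mean(values), digits)


print(column_average(ann_accuracy, 2))  # 70.79
print(column_average(ann_auc, 3))
```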
CNN defect prediction results (within-project analysis).
| Datasets | Accuracy (%) | Precision | Recall | F1 Score | AUC |
|---|---|---|---|---|---|
| Afwall | 67 | 0.73 | 0.70 | 0.69 | 0.95 |
| Alfresco | 70 | 0.67 | 0.80 | 0.71 | 0.96 |
| androidSync | 64 | 0.94 | 0.70 | 0.65 | 0.94 |
| androidWallpaper | 66 | 0.82 | 0.66 | 0.70 | 0.96 |
| anySoftKeyboard | 72 | 0.75 | 0.70 | 0.83 | 0.90 |
| Apg | 69 | 0.75 | 0.77 | 0.70 | 0.93 |
| atmosphere | 70 | 0.68 | 0.90 | 0.80 | 0.91 |
| chatSecure | 67 | 0.70 | 0.75 | 0.73 | 0.96 |
| | 71 | 0.75 | 0.69 | 0.81 | 0.90 |
| flutter | 68 | 0.84 | 0.70 | 0.72 | 0.94 |
| kiwis | 73 | 0.76 | 0.74 | 0.69 | 0.92 |
| owncloudandroid | 70 | 0.80 | 0.72 | 0.69 | 0.96 |
| Pageturner | 69 | 0.75 | 0.82 | 0.70 | 0.90 |
| | 70 | 0.73 | 0.69 | 0.85 | 0.93 |
| Average | 69 | 0.76 | 0.738 | 0.734 | 0.933 |
LSTM defect prediction results (within-project analysis).
| Datasets | Accuracy (%) | Precision | Recall | F1 Score | AUC |
|---|---|---|---|---|---|
| Afwall | 69 | 0.77 | 0.80 | 0.69 | 0.95 |
| Alfresco | 70 | 0.71 | 0.77 | 0.82 | 0.93 |
| androidSync | 73 | 0.83 | 0.79 | 0.65 | 0.86 |
| androidWallpaper | 68 | 0.80 | 0.93 | 0.77 | 0.94 |
| anySoftKeyboard | 71 | 0.73 | 0.69 | 0.80 | 0.91 |
| Apg | 72 | 0.74 | 0.80 | 0.77 | 0.90 |
| atmosphere | 69 | 0.70 | 0.72 | 0.71 | 0.89 |
| chatSecure | 73 | 0.80 | 0.75 | 0.72 | 0.95 |
| | 70 | 0.72 | 0.74 | 0.83 | 0.90 |
| flutter | 67 | 0.70 | 0.76 | 0.74 | 0.92 |
| kiwis | 70 | 0.71 | 0.90 | 0.73 | 0.96 |
| owncloudandroid | 72 | 0.83 | 0.70 | 0.69 | 0.90 |
| Pageturner | 69 | 0.75 | 0.71 | 0.70 | 0.93 |
| | 70 | 0.73 | 0.80 | 0.77 | 0.90 |
| Average | 70.21 | 0.751 | 0.775 | 0.742 | 0.917 |
Cross-project analysis results (AUC).
| Projects | ANN | CNN | LSTM |
|---|---|---|---|
| Afwall | 0.65 | 0.67 | 0.69 |
| Alfresco | 0.68 | 0.70 | 0.70 |
| androidSync | 0.70 | 0.64 | 0.73 |
| androidWallpaper | 0.84 | 0.66 | 0.68 |
| anySoftKeyboard | 0.75 | 0.72 | 0.71 |
| Apg | 0.70 | 0.69 | 0.72 |
| atmosphere | 0.71 | 0.70 | 0.69 |
| chatSecure | 0.69 | 0.67 | 0.73 |
| | 0.72 | 0.71 | 0.70 |
| flutter | 0.70 | 0.68 | 0.67 |
| kiwis | 0.67 | 0.73 | 0.70 |
| owncloudandroid | 0.70 | 0.70 | 0.72 |
| Pageturner | 0.68 | 0.69 | 0.69 |
| | 0.72 | 0.70 | 0.70 |
| Average | 0.71 | 0.69 | 0.70 |
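The difference between the within-project and cross-project settings lies in where the training data comes from. The sketch below illustrates one common way to build the two kinds of splits. The leave-one-project-out scheme, the function names, the sample layout, and the toy data are assumptions for illustration; the abstract does not state the exact protocol used in the study.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical sample layout: (feature vector, defect label).
Sample = Tuple[List[float], int]


def within_project_split(samples: List[Sample],
                         test_fraction: float = 0.25) -> Tuple[List[Sample], List[Sample]]:
    """Train and test on the same project: hold out the tail of its samples.

    A real experiment would typically shuffle or stratify first; this sketch
    keeps the original order so the result is deterministic.
    """
    cut = int(round(len(samples) * (1 - test_fraction)))
    return samples[:cut], samples[cut:]


def cross_project_splits(
    projects: Dict[str, List[Sample]]
) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Leave-one-project-out: train on all other projects, test on the held-out one."""
    for held_out, test in projects.items():
        train = [s for name, rows in projects.items() if name != held_out for s in rows]
        yield held_out, train, test


# Toy data keyed by three project names from the tables (values are made up).
projects = {
    "Afwall": [([1.0, 0.0], 1), ([0.2, 0.1], 0), ([0.9, 0.3], 1), ([0.1, 0.1], 0)],
    "Alfresco": [([0.5, 0.5], 0), ([0.8, 0.2], 1), ([0.3, 0.7], 0)],
    "kiwis": [([0.6, 0.4], 1), ([0.2, 0.9], 0)],
}

train, test = within_project_split(projects["Afwall"])
print(len(train), len(test))

for name, train, test in cross_project_splits(projects):
    print(name, len(train), len(test))
```

In the cross-project case the model never sees the target project during training, which is one plausible reason the cross-project AUC values are lower than the within-project ones.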