Bilal I Al-Ahmad, Ala' A Al-Zoubi, Md Faisal Kabir, Marwan Al-Tawil, Ibrahim Aljarah.
Abstract
Software engineering is a significant discipline that is used extensively in education and industry. Software engineering education plays an essential role in keeping students up to date with the software technologies, products, and processes commonly applied in the software industry. The software development project is one of the most important parts of a software engineering course because it covers the practical side of the course. This type of project helps strengthen students' ability to collaborate as a team on software projects. A software project comprises a software product part and a software process part. The product part represents the software deliverables at each phase of the Software Development Life Cycle (SDLC), while the process part captures team activities and behaviors during the SDLC. Low-expectation teams face challenges at different stages of a software project. Consequently, predicting the performance of such teams is one of the most important tasks for the learning process in software engineering education. Early prediction of performance for low-expectation teams would help instructors address the difficulties and challenges of such teams at the earliest possible phases of the project and so avoid project failure. Several studies have attempted early prediction of the performance of low-expectation teams at different phases of the SDLC. This study introduces a swarm intelligence-based model that aims to improve prediction performance for low-expectation teams at the earliest possible phases of the SDLC by implementing Particle Swarm Optimization-K Nearest Neighbours (PSO-KNN). It also attempts to reduce the number of selected software product and process features, reaching higher accuracy while identifying fewer than 40 relevant features. Experiments were conducted on the Software Engineering Team Assessment and Prediction (SETAP) project dataset.
The proposed model was compared with related studies and with state-of-the-art Machine Learning (ML) classifiers: Sequential Minimal Optimization (SMO), Simple Linear Regression (SLR), Naïve Bayes (NB), Multilayer Perceptron (MLP), standard KNN, and J48. The proposed model provides superior results compared to the traditional ML classifiers and state-of-the-art studies in the investigated phases of software product and process development. ©2022 Al-Ahmad et al.
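
The PSO-KNN idea described in the abstract can be sketched as a wrapper feature selector: each particle is a feature mask, and its fitness is the accuracy of a KNN classifier restricted to the masked-in features. The sketch below is an illustration under stated assumptions, not the authors' implementation. The binary encoding with a sigmoid transfer function, the leave-one-out fitness, the velocity clamp, the linear inertia decay, and the synthetic `toy_data` helper are all assumptions; only the default swarm size, iteration count, acceleration constants, and inertia range come from the parameter settings reported in this record.

```python
import math
import random

def knn_predict(train_X, train_y, x, k, mask):
    """Majority vote among the k nearest rows, using only masked-in features."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b, m in zip(row, x, mask) if m), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def loo_accuracy(X, y, mask, k=1):
    """Leave-one-out accuracy of KNN on the selected features (assumed fitness)."""
    if not any(mask):
        return 0.0  # an empty feature set cannot classify anything
    hits = 0
    for i in range(len(X)):
        rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k, mask) == y[i]
    return hits / len(X)

def pso_select(X, y, k=1, swarm=30, iters=100, c1=2.1, c2=2.1,
               w_max=0.9, w_min=0.6, seed=0):
    """Binary PSO over feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(X[0])
    pos = [[rng.random() < 0.5 for _ in range(n)] for _ in range(swarm)]
    vel = [[0.0] * n for _ in range(swarm)]
    pbest = [p[:] for p in pos]
    pbest_fit = [loo_accuracy(X, y, p, k) for p in pos]
    g = max(range(swarm), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for t in range(iters):
        # Assumed schedule: inertia decays linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * t / max(iters - 1, 1)
        for i in range(swarm):
            for d in range(n):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, v))  # assumed velocity clamp
                # Sigmoid transfer: velocity becomes the probability of a 1 bit.
                pos[i][d] = rng.random() < 1.0 / (1.0 + math.exp(-vel[i][d]))
            fit = loo_accuracy(X, y, pos[i], k)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

def toy_data(rng, rows=20, noise_features=5):
    """Hypothetical data: feature 0 separates the classes, the rest is noise."""
    X, y = [], []
    for i in range(rows):
        label = i % 2
        X.append([label * 10 + rng.random()]
                 + [rng.random() for _ in range(noise_features)])
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = toy_data(random.Random(1))
    mask, fit = pso_select(X, y, k=1, swarm=10, iters=20)
    print("selected feature mask:", mask, "fitness:", fit)
```

Changing `k` to 5 or 7 would correspond to the PSO-5NN and PSO-7NN variants that appear in the result tables. A smaller swarm and fewer iterations are used in the usage example only to keep the pure-Python run short.
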
Keywords: Artificial intelligence; Data mining; Machine learning; Optimization; PSO; Software engineering
Year: 2022 PMID: 35174274 PMCID: PMC8802785 DOI: 10.7717/peerj-cs.857
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Figure 1. PSO initialization process.
Figure 2. Overview of the proposed assessment model.
PSO’s parameter settings.
| Algorithm | Parameter | Value |
|---|---|---|
| PSO | Acceleration constants | [2.1, 2.1] |
| | Inertia w | [0.9, 0.6] |
| | Swarm size | 30 |
| | Number of iterations | 100 |
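
For orientation, these settings plug into the standard inertia-weight form of the PSO velocity update. The record does not state how the inertia weight moves between 0.9 and 0.6, so the linear decay over the T = 100 iterations written below is an assumption. The symbols p (personal best), g (global best), and r1, r2 (uniform random numbers in [0, 1]) follow the usual PSO notation rather than anything defined in this record.

```latex
\begin{align}
v_{i,d}^{t+1} &= w^{t}\, v_{i,d}^{t}
  + c_1 r_1 \left(p_{i,d} - x_{i,d}^{t}\right)
  + c_2 r_2 \left(g_{d} - x_{i,d}^{t}\right),
  & c_1 &= c_2 = 2.1 \\
w^{t} &= 0.9 - (0.9 - 0.6)\,\frac{t}{T - 1},
  & t &= 0, \dots, T - 1, \quad T = 100
\end{align}
```
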
Performance of the classifiers in the first phase for software product.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.125 | 0.119 | 0.444 | 0.125 | 0.195 | 0.503 | 55.41 |
| SLR | 0.125 | 0.119 | 0.444 | 0.125 | 0.195 | 0.597 | 55.41 |
| NB | 0.063 | 0.095 | 0.333 | 0.063 | 0.105 | 0.442 | 54.05 |
| MLP | 0.375 | 0.238 | 0.545 | 0.375 | 0.444 | 0.641 | 59.46 |
| Standard KNN | 0.500 | 0.405 | 0.485 | 0.500 | 0.492 | 0.548 | 55.40 |
| PSO-1NN | | 0.238 | | | | | |
| PSO-5NN | | | 0.486 | | 0.507 | 0.557 | 55.41 |
| PSO-7NN | 0.375 | 0.405 | 0.414 | 0.375 | 0.393 | 0.519 | 50 |
Notes.
Numbers in bold indicate the best values.
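
The columns of these tables all derive from a binary confusion matrix, which the following sketch makes explicit. The record does not report confusion-matrix counts; the counts in the usage lines are hypothetical values chosen only because they reproduce the SMO row of the first-phase product table. The function name `binary_metrics` is likewise illustrative.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute the table columns from confusion-matrix counts.

    None marks an undefined ratio (the tables print such cells as "na").
    """
    def ratio(num, den):
        return num / den if den else None

    tp_rate = ratio(tp, tp + fn)      # same quantity as recall
    fp_rate = ratio(fp, fp + tn)
    precision = ratio(tp, tp + fp)
    if precision is None or tp_rate is None or precision + tp_rate == 0:
        f_measure = None
    else:
        f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    accuracy = 100.0 * (tp + tn) / (tp + fp + fn + tn)
    return {
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "precision": precision,
        "recall": tp_rate,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }

if __name__ == "__main__":
    # Hypothetical counts (assumption): 32 positive and 42 negative teams.
    m = binary_metrics(tp=4, fp=5, fn=28, tn=37)
    print({name: round(value, 3) for name, value in m.items()})
    # tp_rate 0.125, fp_rate 0.119, precision 0.444, f_measure 0.195, accuracy 55.405
```

Because TP rate and recall are the same ratio, the two columns always carry identical values in every table of this record.
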
Performance of the classifiers in the fifth phase for software product.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.813 | 0.048 | 0.929 | 0.813 | 0.867 | 0.882 | 89.19 |
| SLR | 0.813 | 0.167 | 0.788 | 0.813 | 0.8 | 0.935 | 82.43 |
| NB | 0.813 | 0.214 | 0.743 | 0.813 | 0.776 | 0.905 | 79.73 |
| MLP | 0.813 | 0.048 | 0.929 | 0.813 | 0.867 | 0.95 | 89.19 |
| Standard KNN | 0.938 | 0.119 | 0.857 | 0.938 | 0.896 | 0.909 | 90.54 |
| PSO-1NN | | 0.119 | 0.861 | | | 0.925 | |
| PSO-5NN | 0.781 | 0.071 | 0.893 | 0.781 | 0.833 | | 86.48 |
| PSO-7NN | 0.781 | | | 0.781 | 0.862 | 0.919 | 89.19 |
Notes.
Numbers in bold indicate the best values.
Figure 3. The AUC values obtained by the proposed model and the best values achieved by ML techniques for software product.
The most relevant features at the first phase of software product development.
| Number of selected features | 18 |
|---|---|
| Selected features | teamMemberCount |
| | teamDistribution |
| | helpHoursTotal |
| | leadAdminHoursAverage |
| | standardDeviationInPersonMeetingHoursTotalByWeek |
| | averageInPersonMeetingHoursAverageByWeek |
| | standardDeviationInPersonMeetingHoursAverageByWeek |
| | standardDeviationHelpHoursAverageByWeek |
| | averageGlobalLeadAdminHoursResponseCountByWeek |
| | standardDeviationMeetingHoursTotalByStudent |
| | averageHelpHoursAverageByStudent |
| | uniqueCommitMessageCount |
| | averageCommitCountByWeek |
| | standardDeviationUniqueCommitMessageCountByWeek |
| | averageUniqueCommitMessagePercentByWeek |
| | standardDeviationUniqueCommitMessagePercentByWeek |
| | averageCommitCountByStudent |
| | standardDeviationCommitMessageLengthAverageByStudent |
Performance of the classifiers in the first phase for software process.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0 | 0 | na | 0 | na | 0.5 | 66.21 |
| SLR | 0 | 0 | na | 0 | na | 0.578 | 66.21 |
| NB | 0 | 0.041 | 0 | 0 | na | 0.499 | 63.51 |
| MLP | 0 | 0 | na | 0 | na | 0.579 | 66.21 |
| Standard KNN | | | 0.662 | | 0.769 | 0.499 | 63.51 |
| PSO-1NN | | 0.76 | | | | | |
| PSO-5NN | 0 | 0.02 | 0 | 0 | 0 | 0.396 | 64.86 |
| PSO-7NN | 0 | 0.02 | 0 | 0 | 0 | 0.45 | 64.86 |
Notes.
Numbers in bold indicate the best values.
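
A row with TP rate 0 and FP rate 0 describes a classifier that never predicts the positive class, so its accuracy is simply the share of negative instances. The sketch below shows that arithmetic. The class split of 25 positive and 49 negative teams is an assumption inferred from the reported rates, not a figure stated in this record, and `accuracy_from_rates` is a hypothetical helper name.

```python
def accuracy_from_rates(tp_rate, fp_rate, positives, negatives):
    """Accuracy (in percent) implied by TP rate, FP rate, and class sizes."""
    true_positives = tp_rate * positives
    true_negatives = (1.0 - fp_rate) * negatives
    return 100.0 * (true_positives + true_negatives) / (positives + negatives)

if __name__ == "__main__":
    # Assumed class split: 25 positive and 49 negative teams (74 in total).
    always_negative = accuracy_from_rates(0.0, 0.0, positives=25, negatives=49)
    two_false_alarms = accuracy_from_rates(0.0, 2 / 49, positives=25, negatives=49)
    print(f"{always_negative:.3f}")   # 66.216, the level reported for SMO, SLR, and MLP
    print(f"{two_false_alarms:.3f}")  # 63.514, the level reported for NB
```

Under that assumed split, the 66.21 reported for SMO, SLR, and MLP is the majority-class baseline. An accuracy near this level therefore says little about how well low-expectation teams are detected in the first phase.
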
Performance of the classifiers in the fifth phase for software process.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.52 | 0.061 | 0.813 | 0.52 | 0.634 | 0.729 | 79.73 |
| SLR | 0.56 | 0.061 | 0.824 | 0.56 | 0.667 | 0.898 | 81.08 |
| NB | 0.44 | 0.163 | 0.579 | 0.44 | 0.5 | 0.754 | 70.27 |
| MLP | 0.8 | 0.204 | 0.667 | 0.8 | 0.727 | 0.837 | 79.73 |
| Standard KNN | 0.898 | 0.120 | 0.936 | 0.898 | 0.917 | 0.889 | 89.18 |
| PSO-1NN | | 0.12 | | | | 0.909 | |
| PSO-5NN | 0.72 | 0.061 | 0.857 | 0.72 | 0.783 | | 86.487 |
| PSO-7NN | 0.56 | | 0.824 | 0.56 | 0.667 | 0.901 | 81.08 |
Notes.
Numbers in bold indicate the best values.
Performance of the classifiers in the second phase for software process.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.56 | 0.204 | 0.583 | 0.56 | 0.571 | 0.678 | 71.62 |
| SLR | 0.6 | 0.143 | 0.682 | 0.6 | 0.638 | 0.745 | |
| NB | 0.2 | 0.082 | 0.556 | 0.2 | 0.294 | 0.712 | 67.57 |
| MLP | 0.28 | 0.082 | 0.636 | 0.28 | 0.389 | 0.769 | 70.27 |
| Standard KNN | | | 0.741 | | | 0.639 | 71.62 |
| PSO-1NN | 0.44 | 0.163 | 0.579 | 0.44 | 0.5 | 0.633 | 70.27 |
| PSO-5NN | 0.36 | 0.061 | | 0.36 | 0.486 | 0.761 | 74.32 |
| PSO-7NN | 0.36 | 0.061 | | 0.36 | 0.486 | | 74.32 |
Notes.
Numbers in bold indicate the best values.
Performance of the classifiers in the third phase for software process.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.52 | 0.082 | 0.765 | 0.52 | 0.619 | 0.719 | 78.38 |
| SLR | 0.44 | 0.061 | 0.786 | 0.44 | 0.564 | 0.764 | 77.027 |
| NB | 0.4 | | 0.385 | 0.4 | 0.392 | 0.642 | 58.11 |
| MLP | 0.84 | 0.102 | 0.808 | 0.84 | 0.824 | 0.819 | |
| Standard KNN | 0.755 | 0.240 | 0.860 | 0.755 | 0.804 | 0.758 | 75.67 |
| PSO-1NN | | 0.28 | | | | | 86.48 |
| PSO-5NN | 0.28 | 0.082 | 0.636 | 0.28 | 0.389 | 0.761 | 70.27 |
| PSO-7NN | 0.16 | 0.041 | 0.667 | 0.16 | 0.258 | 0.739 | 68.92 |
Notes.
Numbers in bold indicate the best values.
Performance of the classifiers in the fourth phase for software process.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.44 | 0.061 | 0.786 | 0.44 | 0.564 | 0.689 | 77.027 |
| SLR | 0.52 | 0.041 | 0.867 | 0.52 | 0.65 | 0.863 | 81.08 |
| NB | 0.24 | 0.122 | 0.5 | 0.24 | 0.324 | 0.53 | 66.22 |
| MLP | 0.56 | 0.061 | 0.824 | 0.56 | 0.667 | 0.794 | 81.08 |
| Standard KNN | 0.857 | 0.240 | 0.875 | 0.857 | 0.866 | 0.809 | 82.43 |
| PSO-1NN | | 0.12 | | | | | |
| PSO-5NN | 0.6 | 0.061 | 0.833 | 0.6 | 0.698 | 0.904 | 82.43 |
| PSO-7NN | 0.52 | | 0.813 | 0.52 | 0.634 | 0.863 | 79.7297 |
Notes.
Numbers in bold indicate the best values.
Figure 4. The AUC values obtained by the proposed model and the best values achieved by ML techniques for software process.
Accuracy average and standard deviation results of all investigated phases for software product.
| Datasets | PSO-1NN average accuracy | PSO-1NN standard deviation |
|---|---|---|
| First phase (Product) | 57.66 | 7.69 |
| Second phase (Product) | 60.81 | 4.05 |
| Third phase (Product) | 88.74 | 5.46 |
| Fourth phase (Product) | 93.69 | 1.56 |
| Fifth phase (Product) | 89.64 | 2.82 |
Accuracy average and standard deviation results of all investigated phases for software process.
| Datasets | PSO-1NN average accuracy | PSO-1NN standard deviation |
|---|---|---|
| First phase (Process) | 65.76 | 5.46 |
| Second phase (Process) | 70.72 | 0.78 |
| Third phase (Process) | 81.08 | 7.15 |
| Fourth phase (Process) | 90.54 | 1.36 |
| Fifth phase (Process) | 88.73 | 2.82 |
The most relevant features at the first phase of software process development.
| Number of selected features | 16 |
|---|---|
| Selected features | teamMemberCount |
| | femaleTeamMembersPercent |
| | teamMemberResponseCount |
| | meetingHoursAverage |
| | meetingHoursStandardDeviation |
| | nonCodingDeliverablesHoursAverage |
| | averageMeetingHoursTotalByWeek |
| | averageMeetingHoursAverageByWeek |
| | standardDeviationMeetingHoursAverageByWeek |
| | averageInPersonMeetingHoursTotalByWeek |
| | averageResponsesByStudent |
| | standardDeviationMeetingHoursAverageByStudent |
| | averageInPersonMeetingHoursAverageByStudent |
| | commitCount |
| | commitMessageLengthTotal |
| | averageCommitMessageLengthTotalByWeek |
Performance of the classifiers in the second phase for software product.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.438 | 0.167 | 0.667 | 0.438 | 0.528 | 0.635 | |
| SLR | | 0.214 | 0.64 | | | | |
| NB | 0.125 | 0.024 | | 0.125 | 0.216 | 0.647 | 60.81 |
| MLP | 0.375 | 0.19 | 0.60 | 0.375 | 0.462 | 0.687 | 62.16 |
| Standard KNN | 0.438 | 0.262 | 0.560 | 0.438 | 0.491 | 0.588 | 60.81 |
| PSO-1NN | 0.375 | 0.143 | 0.667 | 0.375 | 0.48 | 0.6167 | 64.86 |
| PSO-5NN | 0.406 | 0.31 | 0.5 | 0.406 | 0.448 | 0.515 | 56.76 |
| PSO-7NN | | | 0.533 | | 0.516 | 0.552 | 59.60 |
Notes.
Numbers in bold indicate the best values.
Performance of the classifiers in the third phase for software product.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.719 | 0.095 | 0.52 | 0.719 | 0.78 | 0.734 | 82.43 |
| SLR | 0.781 | 0.119 | 0.833 | 0.781 | 0.806 | 0.888 | 83.08 |
| NB | 0.5 | 0.095 | 0.8 | 0.5 | 0.615 | 0.813 | 72.97 |
| MLP | 0.625 | 0.048 | | 0.625 | 0.741 | 0.903 | 81.8 |
| Standard KNN | 0.875 | 0.119 | 0.848 | 0.875 | 0.862 | 0.878 | 87.83 |
| PSO-1NN | | 0.071 | 0.906 | | | | |
| PSO-5NN | 0.656 | | 0.778 | 0.656 | 0.712 | 0.841 | 77.03 |
| PSO-7NN | 0.625 | | 0.769 | 0.625 | 0.69 | 0.826 | 75.68 |
Notes.
Numbers in bold indicate the best values.
Performance of the classifiers in the fourth phase for software product.
| Algorithm | TP rate | FP rate | Precision | Recall | F-Measure | AUC | Accuracy |
|---|---|---|---|---|---|---|---|
| SMO | 0.438 | 0.095 | 0.778 | 0.438 | 0.56 | 0.792 | 70.27 |
| SLR | 0.813 | | 0.897 | 0.813 | 0.852 | 0.918 | 87.84 |
| NB | 0.438 | 0.095 | 0.778 | 0.438 | 0.56 | 0.792 | 70.27 |
| MLP | 0.906 | 0.214 | 0.763 | 0.906 | 0.829 | 0.926 | 83.78 |
| Standard KNN | 0.938 | 0.143 | 0.833 | 0.938 | 0.882 | 0.897 | 89.18 |
| PSO-1NN | | 0.071 | | | | 0.949 | |
| PSO-5NN | 0.875 | 0.143 | 0.824 | 0.875 | 0.848 | | 86.49 |
| PSO-7NN | 0.781 | 0.167 | 0.781 | 0.781 | 0.781 | 0.911 | 81.08 |
Notes.
Numbers in bold indicate the best values.