| Literature DB >> 32604814 |
Abstract
The emergence of new technologies for incorporating and analyzing data with high-performance computing has expanded our capability to make accurate predictions. Supervised machine learning (ML) can be utilized for fast and consistent prediction and for better uncovering the underlying patterns in data. We develop a prediction strategy, for the first time, using supervised ML to observe the possible impact of weak radiofrequency electromagnetic fields (RF-EMF) on human and animal cells without performing in-vitro laboratory experiments. We extracted laboratory experimental data from 300 peer-reviewed scientific publications (1990–2015) describing 1127 experimental case studies of human and animal cell responses to RF-EMF. We used domain knowledge, Principal Component Analysis (PCA), and the Chi-squared feature selection technique to select six optimal features for computation and cost efficiency. We then developed grouping or clustering strategies to allocate these selected features into five different laboratory experiment scenarios. The dataset was tested with ten different classifiers, and the outputs were estimated using the k-fold cross-validation method. Because the assessment of a classifier's prediction performance is critical for judging its suitability, we made a detailed comparison of the percentage of correctly classified instances (PCC, model accuracy), Root Mean Squared Error (RMSE), precision, sensitivity (recall), 1 − specificity, Area under the ROC Curve (AUC), and precision-recall (PRC Area) for each classification method. Our findings suggest that the Random Forest algorithm outperforms the other classifiers in all groups on all performance measures, reaching AUC = 0.903 where k-fold = 60. A robust correlation was observed between the specific absorption rate (SAR) and frequency, and between cumulative effect or exposure time and SAR×time (the impact of accumulated SAR within the exposure time) of RF-EMF. In contrast, the relationship between frequency and exposure time was not significant.
In the future, with more experimental data, the sample size can be increased, leading to more accurate results.
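As a minimal sketch of the pipeline the abstract describes — feature selection followed by classification with k-fold cross-validation — the steps can be reproduced with scikit-learn. This is an illustrative reconstruction on synthetic data, not the authors' code (the study used Weka and MATLAB), and every parameter choice below is an assumption:

```python
# Hypothetical sketch: chi-squared feature selection + Random Forest
# evaluated with 10-fold cross-validation. Synthetic data stands in
# for the 1127 experimental case studies.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           random_state=0)
pipe = Pipeline([
    ("scale", MinMaxScaler()),           # chi2 requires non-negative inputs
    ("select", SelectKBest(chi2, k=6)),  # keep six features, as in the study
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
scores = cross_val_score(pipe, X, y, cv=10)  # 10-fold cross-validation
print(f"mean accuracy: {scores.mean():.3f}")
```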
Keywords: Bioelectromagnetics; RF-EMF exposure assessment; human and animal cells; in-vitro studies; machine learning; supervised learning
Year: 2020 PMID: 32604814 PMCID: PMC7345599 DOI: 10.3390/ijerph17124595
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1Potential features, attributes, or variables of bioelectromagnetic experiments (in-vitro, in-vivo, and epidemiological studies) that could be utilized in ML algorithms.
Supervised machine learning algorithms for in-vitro studies in Bioelectromagnetics: effects of weak radiofrequency electromagnetic fields (RF-EMF) on living organisms.
| Study | Experimental Type for Data Collection | Species | Data Size (No of Experimental Observations) | Features/Attributes/Variables | Machine Learning Technique | Algorithms | Prediction Accuracy (Highest) | Computation Time/CPU Time (sec) | Programming Languages, Tools and Computer Details (System Information) |
|---|---|---|---|---|---|---|---|---|---|
| Study 1 | In vivo (RF-EMF directly exposed to whole plants) | Plant | 169 | Species, frequency, SAR, power flux density, electric field strength, exposure durations, and cellular response (presence or absence) | Supervised Machine Learning (classification) | Random Forest, J48, JRip, Random Tree, Bayes Net, Naive Bayes, Decision Table, OneR | 95.26% | 0.2 | MATLAB (MathWorks Inc., Natick, MA, USA) R2015b, one-way ANOVA procedure in SPSS Statistics (Version 23, IBM, Armonk, NY, USA) and Weka tool (Waikato Environment for Knowledge Analysis, Version 3.9, University of Waikato, Hamilton, New Zealand), on a computer with 1.7 GHz Intel Core i7 CPU, 4 GB 1600 MHz DDR3 RAM |
| Study 2 | In vivo (RF-EMF directly exposed to whole plants) | Plant | 169 | Species, frequency, SAR, power flux density, electric field strength, exposure durations, and cellular response (presence or absence) | Supervised Machine Learning (classification) | k-Nearest Neighbor (kNN), Random Forest | 91.17% | 3.38–408.84 | Python 3.6.0 on macOS Sierra (Version 10.12.6), on a computer with 1.7 GHz Intel Core i7 CPU, 4 GB 1600 MHz DDR3 RAM |
| Study 3 (this study) | In-vitro (RF-EMF directly exposed to human and animal cells/tissue) | Human and animal cells | 1127 | Species (year of study, human and animal cells/tissue), frequency, SAR, exposure durations, and cellular response (presence or absence) | Supervised Machine Learning (classification) | Random Forest, Bagging, J48, SVM (Linear Kernel), Jrip, Decision Table, BayesNet, Naive Bayes, Logistic Regression | 83.56% | 0.3 | MATLAB (MathWorks Inc., Natick, MA, USA) R2019b and Weka tool (Waikato Environment for Knowledge Analysis, Version 3.9, University of Waikato, Hamilton, New Zealand), on a computer running macOS High Sierra (Version 10.13.6, Apple, Cupertino, CA, USA) with a 1.7 GHz Intel Core i7 CPU and 4 GB 1600 MHz DDR3 RAM |
Supervised machine learning (classification) algorithms used in the analysis to generate the results.
| Algorithm/Classifier Name | Classifier Type | Description | Capabilities (Features/Attributes Allowed by the Algorithm) | Citation |
|---|---|---|---|---|
| K-nearest neighbours’ classifier (kNN) | Lazy | The kNN algorithm (k = number of neighbours) uses a nearest-neighbour search. An appropriate value of k can be selected via cross-validation: the algorithm chooses the best k between 1 and the value given as the kNN parameter | Numeric, nominal, binary, date, unary, missing values | Aha (1991) [ |
| Random Forest | Trees | The Random Forest algorithm builds a forest of random trees: a combination of tree predictors in which each tree depends on the values of a random vector sampled independently, with the same distribution for all trees in the forest. As the number of trees grows large, the generalization error for forests converges to a limit. The error of the forest depends on the strength of the individual trees and the correlation between them. The data do not need to be re-scaled or transformed, and Random Forest handles outliers primarily by binning them | Numeric, nominal, binary, date, unary, missing values | Breiman (2001) [ |
| Bagging | Meta | A Bagging classifier is a meta-estimator that fits base classifiers, each on a random subset of the original dataset, and then aggregates their predictions to form a final prediction. Such a meta-estimator can reduce the variance of a black-box estimator (e.g., a decision tree) by introducing randomization into its construction procedure | Numeric, nominal, binary, date, unary, missing values | Breiman (1996) [ |
| J48 | Trees | J48 is a classification algorithm that generates a pruned or unpruned C4.5 decision tree. The number of folds determines the amount of data used for reduced-error pruning: one fold is used for pruning and the rest for growing the tree | Numeric, nominal, binary, date, unary, missing values | Quinlan (1993) [ |
| Support-vector machines (SVM, Linear Kernel) | Function | The SVM classifier globally replaces all missing values and transforms nominal attributes into binary ones. By default it also normalizes all attributes, so the coefficients in the output are based on the normalized data rather than the original data, which is essential for interpreting the classifier. To obtain probability estimates, use the option that fits logistic regression models to the outputs of the support vector machine | Numeric, nominal, binary, unary, missing values | Platt (1998) [ |
| Jrip | Rules | The JRip class implements a propositional rule learner, Repeated Incremental Pruning to Produce Error Reduction (RIPPER). It is based on association rules with reduced-error pruning (REP), a popular and efficient technique found in decision tree algorithms. The algorithm operates in a few phases: initialization, a building stage (grow and prune phases), optimization, and selection | Numeric, nominal, binary, date, unary, missing values | Cohen (1995) [ |
| Decision Table | Rules | Decision Table is a class for building and using a simple decision-table classifier. A decision table can also be represented, as in a programming language or a decision tree, as a series of if-then-else and switch-case statements. Learning a decision table comprises choosing the right attributes to include; a table is balanced if it contains every possible combination of input variables | Numeric, nominal, binary, date, unary, missing values | Kohavi (1995) [ |
| Bayesian Network (BayesNet) | Bayes | A Bayes Network is a statistical model that uses a conditional probability approach with various search algorithms and quality measures. It provides the data structures (network structure and conditional probability distributions) and facilities common to Bayes Network learning algorithms. Since ADTrees are memory-intensive, computer memory restrictions may arise; turning that option off makes the structure learning algorithms slower but able to run with less memory | Numeric, nominal, binary, date, unary, missing values | Friedman et al. (1997) [ |
| Naive Bayes | Bayes | Naive Bayes is based on Bayes’ theorem. It chooses numeric estimator precision values based on analysis of the training data; for this reason it is not an updateable classifier, which would typically be initialized with zero training instances. It can use a kernel estimator for numeric attributes rather than a normal distribution | Numeric, nominal, binary, date, unary, missing values | John and Langley (1995) [ |
| Logistic Regression | Function | Logistic regression is a statistical technique for predicting binary classes; it estimates the probability of an event occurring. Missing values are replaced and nominal attributes are transformed into numeric ones using filters | Numeric, nominal, binary, date, unary, missing values | Cessie and Houwelingen [ |
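Several of the Weka classifiers in the table above have rough scikit-learn counterparts; the following hedged sketch compares them under 10-fold cross-validation on synthetic data. The stand-ins are assumptions (e.g., `DecisionTreeClassifier` approximates J48; JRip and Decision Table have no exact scikit-learn equivalent), not the study's setup:

```python
# Compare several classifiers from the table under 10-fold CV.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=6, random_state=1)
models = {
    "Random Forest": RandomForestClassifier(random_state=1),
    "kNN": KNeighborsClassifier(),
    "Bagging": BaggingClassifier(random_state=1),
    "J48 (C4.5-like tree)": DecisionTreeClassifier(random_state=1),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
results = {name: cross_val_score(m, X, y, cv=10).mean()
           for name, m in models.items()}
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```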
Descriptions of the six selected features (attributes or variables) of the analysis.
| Features | Symbol | Type | Feature Type | Description (Domain) |
|---|---|---|---|---|
| Species (human, animal) | | Nominal | Input | Different cell types have been grouped into two (human or animal cells) |
| Frequency of weak RF-EMF (MHz) | | Numeric | Input | 800–2450 (MHz) |
| Specific absorption rate, SAR (W/kg) | SAR | Numeric | Input | Up to 50 W/kg—Specific Absorption Rate (SAR) is a proportion of the rate at which energy is absorbed per unit mass by a living organism when exposed to a radiofrequency electromagnetic field (RF-EMF). |
| Duration of exposure time | T | Numeric | Input | 2 min–120 h |
| SAR×exposure time (Halgamuge et al., 2020) [ | | Numeric | Input | Cumulative effect or impact of accumulated SAR within the exposure period |
| Cellular response (presence or absence) | R | Binary | Output | Presence/Absence |
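The derived SAR×exposure-time feature in the table above is a simple product of two raw inputs. A small sketch, with invented values for illustration (not study data) and assumed field names:

```python
# Derive the cumulative-effect feature SAR x exposure time.
records = [
    {"species": "human",  "frequency_MHz": 900.0,  "sar_W_per_kg": 2.0, "time_min": 120.0},
    {"species": "animal", "frequency_MHz": 1800.0, "sar_W_per_kg": 0.5, "time_min": 30.0},
]
for r in records:
    # accumulated SAR over the exposure period (W/kg x min)
    r["sar_x_time"] = r["sar_W_per_kg"] * r["time_min"]

print([r["sar_x_time"] for r in records])  # -> [240.0, 15.0]
```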
An overview of the utilized laboratory experiments that reported a positive association (cellular response: presence) between weak RF-EMF and human cells.
| No | Affected Cells | Frequency (MHz) | Specific Absorption Rate, SAR (W/kg) | Exposed Time (min) | Radiation Exposure Facility Details |
|---|---|---|---|---|---|
| 1 | Human peripheral blood mononuclear cells (PBMC) | 900, 1800 | 0.024, 0.18, 0.4, 2, 5 | 15, 120, 880 | Waveguide, anechoic chamber, cavity resonator |
| 2 | Human Blood Lymphocytes | 800, 830, 895, 900, 905, 910, 915, 954, 1300, 1800, 1909.8, 1950, 2450 | 0.0054, 0.037, 0.05, 0.18, 0.21, 0.3, 0.5, 0.77, 1, 1.25, 1.5, 2, 2.5, 2.6, 2.9, 3, 3.6, 4.1, 4.3, 5, 6, 8.8, 9, 10, 12.3, 50 | | TEM cell, waveguide, horn antenna, wire patch cell (WPC), rectangular waveguide (R18), rectangular waveguide (WR 430), waveguide with cavity resonator, anechoic chamber with horn antenna, trumpet-like aerial |
| 3 | Human Monocytes, monocytic cells (U937), Human Mono Mac 6 cells (MM6) | 900, 1300, 1800 | 0.18, 0.77, 1, 2, 2.5 | 15, 20, 60, 880 | Rectangular waveguides (R18) with cavity resonator, anechoic chamber with horn antenna |
| 4 | Human B lymphoblastoid cell (TK6, CCRF-CEM) | 1800 | 2 | 40, 480 | Rectangular waveguides |
| 5 | Human T lymphoblastoid cells (Molt-4 T) | 813.5, 836.5, 900 | 0.0024, 0.0026, 0.0035, 0.024, 0.026, 3.2 | 120, 1260, 2880 | TEM cell |
| 6 | Human Leukocytes, human blood neutrophils, human white blood cells | 900, 1800, 1909.8 | 2, 5, 10 | 15, 160, 180, 1440 | TEM cell, waveguide, microstrip transmission line |
| 7 | Human leukemia cells (HL60), human erythroleukemic cells (K562) | 900, 1800, 2450 | 0.000025, 0.000041, 1.8, 2, 2.5, 10 | 120, 180, 240, 360, 480, 880, 1440 | GTEM cell, circular waveguide with cavity resonator, waveguide (TM01) |
| 8 | Human Whole Blood Samples, blood platelets, hemoglobin (HbA), human blood serum | 835, 900, 910, 940, 2375 | 0.24, 0.6, 1, 1.17, 2.4, 12 | 1, 3, 5, 7, 15, 30, 60, 90, 120 | Cavity resonator, spiral antenna setup |
| 9 | Glial cells: Astroglial (astrocytes) cells, astrocytoma cells and microglial cells | 835, 900, 1800 | 1.8, 2.4, 2.5, 12 | 420, 480, 880 | Waveguide with cavity resonator |
| 10 | Human glioma cells (LN71, MO54, H4, SHG44) | 900, 954, 2450 | 1.2, 1.5, 5, 10, 50 | 60, 120, 240, 480, 1056, 3000 | GTEM cell, circular waveguide with cavity resonator |
| 11 | Human glioblastoma cells (U87MG, U251MG, A172, T98, U87) | 835 | 2.4, 12 | 420 | |
| 12 | Human neuroblastoma cells (NB69, SK-N-SH, SH-SY5Y, NG108-15) | 872, 900, 1760, 1800, 2200 | 0.023, 0.086, 0.77, 1, 1.5, 1.8, 2.5, 5, 6 | 5, 15, 20, 30, 60, 120, 240, 480, 1440 | Waveguide, wire-patch cell (WPC), waveguide with cavity resonator, chamber with a monopole antenna |
| 13 | Human primary, epidermal keratinocytes, keratinocytes cells (HaCaT) | 900 | 2 | 2880 | Wire-patch antenna |
| 14 | Human fibroblasts, human diploid fibroblasts, human dermal fibroblasts, human skin fibroblasts | 900, 1800, 1950, 2450 | 0.05, 0.2, 1, 1.2, 2, 3 | 20, 60, 80, 320, 480, 580, 2880 | Waveguide, anechoic chamber, wire-patch antenna, rectangular waveguides |
| 15 | Jurkat Cells, Jurkat human T Lymphoma cells | 1800, 2450 | 2, 4 | 160, 2880 | Waveguide, antenna horn |
| 16 | Embryonic carcinoma (EC-P19), Epidermoid carcinoma | 1710, 1950 | 0.0036, 0.4, 1.5, 2 | 60, 120, 180, 480 | Waveguide, waveguide (R14) |
| 17 | Hepatocarcinoma cell line HepG2 | 900, 1800, 2200 | 0.023, 2 | 20, 40, 60, 80, 1440 | Waveguide, horn antenna |
| 18 | Human lens epithelial cells (HLECs), eye lens epithelial cells | 1800 | 1, 2, 3, 3.5, 4 | 10, 20, 30, 40, 120, 180, 480, 560, 1440 | Waveguide, rectangular waveguide (R18) |
| 19 | Human epithelial amnion cells (AMA), bronchial epithelial cells (BEAS-2B), human ovarian surface epithelial cells (OSE-80PC), epithelial carcinoma cells, Human HeLa, HeLa S3 | 960, 1800 | 0.0021, 1, 2.1, 3 | 20, 30, 540, 3900 | TEM cell, waveguide, dipole antenna |
| 20 | Human amniotic cell, amniotic epithelial cells (FL) | 960, 1800 | 0.0002, 0.002, 0.02, 0.1, 0.5, 1, 2, 4 | 15, 20, 30, 40, 240 | TEM cell, waveguide |
| 21 | Human breast carcinoma cells (MCF-7) | 900, 1800, 2450 | 0.00018, 0.00036, 0.00058, 0.36, 2 | 60 | Exposure chamber, antenna with falcon tube holder |
| 22 | Human breast epithelial cells (MCF10A), breast fibroblasts | 2100 | 0.607 | 240, 1440 | Horn antenna |
| 23 | Human Spermatozoa | 850, 900, 1800, 1950 | 0.0006, 0.4, 1, 1.3, 1.46, 2, 2.8, 3, 4.3, 5.7, 10.1, 27.5 | 4, 10, 60, 180, 960 | Waveguide, exposure chambers, omni-directional antenna, waveguide in TE10 mode with cavity resonator and monopole antenna |
| 24 | Human Endothelial cells (EA.hy926, EA.hy926v1 and EA.hy296) | 900, 1800 | 0.77, 1.8, 2, 2.2, 2.4, 2.5, 2.8 | 20, 60, 480 | Waveguide, exposure chamber, waveguide with resonator (TE10 mode), waveguide with cavity resonator |
| 25 | Human Trophoblast cells (HTR-8/SV neo cells)/Human lipid membrane (liposomes) | 1800, 1817, 2450 | 0.0028, 0.0056, 2, 38 | 3, 10, 60, 80, 160, 320, 480 | TEM cell, waveguide, dipole antenna, waveguide with cavity resonator |
| 26 | Mast cell lines (HMC-1)—mast cell leukemia | 864.3 | 7 | 140 | Resonant chamber |
| 27 | FC2 cells, human-hamster hybrid cells (AL) | 835, 900 | 0.0107, 0.0172, 2 | 30, 120 | TEM cell |
| 28 | Human adipose derived stem cells | 2450 | 0.24 | 3000 | |
| 29 | Human dendritic cells | 1800 | 4 | 20, 240, 480 | |
| 30 | Human embryonic kidney cells (HEK 293 T) | 940 | 0.09 | 15, 30, 45, 60, 90 | Waveguide |
| 31 | Human umbilical vein endothelial cells (HUVEC) | 1800 | 3 | 20, 500 | Waveguide |
| 32 | Human hair cell, human scalp hair follicle, human dermal papilla cells (hDPC) | 900, 1763 | 0.974, 2, 10 | 15, 30, 60, 180, 420 | Rectangular cavity-type chamber (TE102 mode) |
An overview of the utilized laboratory experiments that reported a positive association (cellular response: presence) between weak RF-EMF and animal cells.
| No | Affected Cells | Frequency (MHz) | Specific Absorption Rate, SAR (W/kg) | Exposed Time (min) | Radiation Exposure Facility Details |
|---|---|---|---|---|---|
| 1 | Rat primary microglial cells, mouse microglial cells (N9) | 1800, 2450 | 2, 6 | 20, 60, 120, 240 | Waveguide, rectangular horn antenna in an anechoic chamber |
| 2 | Rat glioblastoma cells (C6, C6BU-1) | 1950 | 5.36 | 720, 1440, 2880 | Dipole antenna |
| 3 | Rat astrocytes | 872, 900, 1800, 1950 | 0.3, 0.46, 0.6, 1.5, 2, 2.5, 3, 5.36, 6 | 5, 10, 20, 60, 120, 240, 480, 520, 720, 1440, 2880, 5760 | Waveguide, dipole antenna, horn antenna, rectangular waveguide |
| 4 | Rat brain capillary endothelial cells (BCEC) | 1800 | 0.3, 0.46 | 2880, 5760 | Rectangular waveguide |
| 5 | Mouse neuroblastoma cells (N2a, N18TG-2, NG108-15) | 915 | 0.001, 0.005, 0.01, 0.05, 0.1 | 30 | TEM cell |
| 6 | Rat neurons, murine cholinergic neurons (SN56) | 900, 1800 | 0.25, 1, 2 | 120, 480, 1440, 2880, 4320, 5760, 7200, 8640 | TEM cells, wire-patch cell, rectangular waveguides |
| 7 | Rat/mouse brain cells | 1600, 2450 | 0.00052, 0.23, 0.48, 1.19, 1.2, 2.99, 6.42, 11.21 | | Cylindrical waveguide (T11 mode) |
| 8 | Rat/mouse bone marrow | 2450 | 12 | 5, 10, 15 | Waveguide |
| 9 | Mouse spermatozoa, Murine spermatocyte-derived cells (GC-2) | 900, 1800 | 0.09, 1, 2, 4 | 20, 5040 | Waveguide, rectangular waveguide |
| 10 | Embryonic mouse fibroblasts cells (C3H10T1/2, NIH3T3, L929), Mouse embryonic skin cells (M5-S), Rat1 cells | 835.62, 847.74, 872, 875, 900, 915, 916, 950, 1800, 2450 | 0.0015, 0.024, 0.03, 0.1, 0.13, 0.24, 0.33, 0.6, 0.91, 1, 2, 2.4, 2.5, 4.4, 5 | 5, 10, 15, 20, 30, 40, 60, 80, 240, 480, 960, 1440, 5760 | Waveguide, radial transmission line, chamber with monopole antenna, magnetron, rectangular waveguide |
| 11 | Mouse embryonic carcinoma cells (P19), Mouse embryonic stem cells, Mouse embryonic neural stem cells (BALB/c) | 800, 1710, 1800 | 1, 1.5, 1.61, 2, 4, 5, 50 | 20, 60, 120 | Waveguide, rectangular waveguide (R18) |
| 12 | Mouse lymphoma cells (L5178Y Tk+/-), Rat basophilic leukemia cells (RBL-2H3), Murine Cytolytic T lymphocytes (CTLL-2) | 835, 915, 930, 2450 | 0.0081, 0.6, 1.5, 25, 40 | 5, 15, 30, 120, 240, 420 | Waveguide, GTEM cell, anechoic chamber, aluminium exposure chamber |
| 13 | Rat granulosa cells (GFSH-R17) | 1800 | 1.2, 2 | 80, 320, 480 | Rectangular waveguides |
| 14 | Rat pheochromocytoma cells (PC12) | 1800 | 2 | 80, 320, 480 | Waveguide |
| 15 | Chinese Hamster Cells (CHO), Ovary (CHO-K1), Chinese hamster lung cells (CHL) | 1800 | 3 | 20, 480 | Waveguide |
| 16 | Chinese hamster fibroblast cells (V79) | 864, 935, 2450 | 0.04, 0.08, 0.12, 0.51 | 15, 60, 120, 180 | TEM cell, GTEM cell |
| 17 | Melanoma cell membrane (B16) | 900 | 3.2 | 120 | Wire patch cell (WPC) |
| 18 | Rat chemoreceptors membranes | 900 | 0.5, 4, 12, 18 | 15 | Waveguide (TE10 mode) |
| 19 | Hamsters pineal glands cells | 1800 | 0.008, 0.08, 0.8, 2.7 | 420 | Radial wave guide |
| 20 | Chick embryos | 915, 2450 | 1.2, 1.75, 2.5, 8.4, 42.6 | 3, 120 | TEM cell, coaxial device |
| 21 | Rabbit lens, Rabbit lens epithelial cells (RLEC) | 2450 | 0.0026, 0.0065, 0.013, 0.026, 0.052 | 480 | TEM cell |
| 22 | Guinea pig cardiac myocytes, pig astrocytes | 900, 1300, 1800 | 0.001 | 8 | TEM cell |
| 23 | Isolated frog auricle | 885, 915 | 8, 10 | 10, 40 | Coplanar stripline slot irradiator |
| 24 | Isolated frog nerve cord | 915 | 20, 30 | | |
| 25 | Snail neurons | 2450 | 0.0125, 0.125, 85 | 30, 45 | Waveguide, waveguide in TE10 mode |
Grouping or clustering strategies that allocate the selected features into five different laboratory experiment scenarios, producing five feature groups (one per scenario).
| Group | Selected Features |
|---|---|
| Group A | Species, frequency of weak RF-EMF, SAR, exposure time, SAR×exposure time, cellular response (presence or absence) |
| Group B | Species, frequency of weak RF-EMF, SAR, exposure time, SAR×exposure time, cellular response (presence or absence) |
| Group C | Frequency of weak RF-EMF, SAR, exposure time, SAR×exposure time, cellular response (presence or absence) |
| Group D | Species, frequency of weak RF-EMF, exposure time, cellular response (presence or absence) |
| Group E | Species, SAR, exposure time, SAR×exposure time, cellular response (presence or absence) |
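The five feature groups can be encoded as column subsets over the same records. In this sketch the group letters and column names are assumptions for illustration (the table lists only feature names), and the cellular response is treated as the prediction target rather than an input:

```python
# Feature groups as column subsets; names are illustrative.
FEATURE_GROUPS = {
    "A": ["species", "frequency", "sar", "time", "sar_x_time"],
    "B": ["species", "frequency", "sar", "time", "sar_x_time"],
    "C": ["frequency", "sar", "time", "sar_x_time"],
    "D": ["species", "frequency", "time"],
    "E": ["species", "sar", "time", "sar_x_time"],
}
TARGET = "response"  # cellular response (presence or absence)

def select_group(record, group):
    """Return the input vector for one experiment under a given group."""
    return [record[f] for f in FEATURE_GROUPS[group]]

row = {"species": 1, "frequency": 900.0, "sar": 2.0, "time": 120.0,
       "sar_x_time": 240.0, "response": 1}
print(select_group(row, "D"))  # -> [1, 900.0, 120.0]
```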
Correctly classified instances (%) for each classification algorithm in all groups using k-fold cross-validation (Train 90% : Test 10%).
| Group | Model | Fold = 10 | Fold = 20 | Fold = 30 | Fold = 40 | Fold = 50 | Fold = 60 | Fold = 70 | Fold = 80 | Fold = 90 |
|---|---|---|---|---|---|---|---|---|---|---|
| Group A | Random Forest | 82.362 | 82.203 | 83.240 | 83.399 | 82.841 | 83.559 | 83.081 | 83.240 | 83.240 |
| Group A | kNN | 76.457 | 76.696 | 76.856 | 76.696 | 76.856 | 77.015 | 76.935 | 76.536 | 76.616 |
| Group A | Bagging | 79.090 | 79.649 | 80.766 | 79.888 | 79.968 | 80.367 | 79.729 | 80.048 | 81.165 |
| Group A | J48 | 78.532 | 78.851 | 78.133 | 79.649 | 78.931 | 78.611 | 79.729 | 79.249 | 78.691 |
| Group A | Decision Table | 75.579 | 75.658 | 75.419 | 75.020 | 75.738 | 75.579 | 75.977 | 74.940 | 75.179 |
| Group B | Random Forest | 80.447 | 81.484 | 80.607 | 81.006 | 80.766 | 81.165 | 81.804 | 81.405 | 80.926 |
| Group B | kNN | 79.888 | 80.607 | 80.607 | 80.766 | 80.527 | 80.686 | 80.447 | 80.686 | 80.447 |
| Group B | Bagging | 77.574 | 78.292 | 77.494 | 78.452 | 78.532 | 79.329 | 78.053 | 78.931 | 78.372 |
| Group B | J48 | 75.898 | 78.212 | 77.893 | 78.133 | 77.175 | 78.133 | 78.292 | 78.053 | 77.574 |
| Group B | Decision Table | 75.658 | 75.339 | 75.738 | 75.339 | 76.297 | 75.818 | 75.818 | 75.579 | 76.058 |
| Group C | Random Forest | 82.203 | 82.682 | 82.841 | 83.160 | 82.841 | 83.959 | 83.001 | 83.160 | 83.639 |
| Group C | kNN | 78.532 | 78.851 | 78.851 | 79.010 | 79.090 | 79.329 | 79.170 | 78.931 | 78.931 |
| Group C | Bagging | 79.090 | 79.569 | 79.809 | 79.489 | 79.888 | 80.048 | 79.649 | 79.569 | 79.729 |
| Group C | J48 | 76.377 | 77.175 | 78.053 | 77.095 | 77.095 | 78.452 | 77.334 | 77.813 | 78.212 |
| Group C | Jrip | 75.020 | 75.579 | 75.578 | 75.499 | 74.860 | 75.499 | 74.940 | 75.419 | 76.217 |
| Group D | Random Forest | 80.447 | 81.484 | 80.607 | 81.006 | 80.766 | 81.165 | 81.804 | 81.405 | 80.926 |
| Group D | kNN | 79.888 | 80.607 | 80.607 | 80.766 | 80.527 | 80.686 | 80.447 | 80.686 | 80.447 |
| Group D | Bagging | 77.574 | 78.292 | 77.494 | 78.452 | 78.532 | 79.329 | 78.053 | 78.931 | 78.372 |
| Group D | J48 | 75.898 | 78.212 | 77.893 | 78.133 | 77.175 | 78.133 | 78.292 | 78.053 | 77.574 |
| Group D | Decision Table | 75.658 | 75.339 | 75.738 | 75.339 | 76.297 | 75.818 | 75.818 | 75.579 | 76.056 |
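The fold sweep in the table above can be sketched by varying the `cv` argument of cross-validation; synthetic data and a default Random Forest stand in here for the study's dataset and Weka configuration (all settings assumed):

```python
# Accuracy (%) as a function of the number of CV folds.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=6, random_state=2)
acc_by_fold = {
    k: 100 * cross_val_score(RandomForestClassifier(random_state=2),
                             X, y, cv=k).mean()
    for k in (10, 20, 30)
}
for k, acc in acc_by_fold.items():
    print(f"fold = {k}: {acc:.3f}%")
```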
Figure 2Root-mean-square error (RMSE) values <0.42 for different classifiers.
Figure 3The area under the ROC Curve for all classifiers: excellent (0.9–1), good (0.8–0.9), fair (0.7–0.8), poor (0.6–0.7) and fail (0.5–0.6).
Area under the Receiver Operating Characteristic (ROC) curve (AUC) values in the excellent (0.9–1) and good (0.8–0.9) ranges for all groups (Train 90% : Test 10%).
| Group | Model | Fold = 10 | Fold = 20 | Fold = 30 | Fold = 40 | Fold = 50 | Fold = 60 | Fold = 70 | Fold = 80 | Fold = 90 |
|---|---|---|---|---|---|---|---|---|---|---|
| Group A | Random Forest | 0.899 | 0.901 | 0.902 | 0.901 | 0.900 | 0.903 | 0.902 | 0.902 | 0.902 |
| Group A | Bagging | 0.872 | 0.879 | 0.882 | 0.874 | 0.878 | 0.878 | 0.874 | 0.882 | 0.879 |
| Group A | BayesNet | 0.809 | 0.814 | 0.814 | 0.815 | 0.813 | 0.813 | 0.813 | 0.814 | 0.812 |
| Group A | J48 | 0.853 | 0.853 | 0.841 | 0.855 | 0.852 | 0.850 | 0.849 | 0.854 | 0.849 |
| Group A | Decision Table | 0.827 | 0.838 | 0.836 | 0.836 | 0.840 | 0.839 | 0.839 | 0.834 | 0.833 |
| Group B | Random Forest | 0.894 | 0.896 | 0.895 | 0.897 | 0.896 | 0.896 | 0.897 | 0.897 | 0.897 |
| Group B | kNN | 0.873 | 0.874 | 0.873 | 0.876 | 0.877 | 0.873 | 0.874 | 0.875 | 0.873 |
| Group B | Bagging | 0.872 | 0.872 | 0.870 | 0.872 | 0.873 | 0.875 | 0.870 | 0.877 | 0.873 |
| Group B | BayesNet | 0.807 | 0.810 | 0.810 | 0.810 | 0.808 | 0.807 | 0.806 | 0.808 | 0.807 |
| Group B | J48 | 0.834 | 0.841 | 0.838 | 0.841 | 0.838 | 0.837 | 0.832 | 0.837 | 0.834 |
| Group B | Decision Table | 0.822 | 0.819 | 0.818 | 0.815 | 0.815 | 0.820 | 0.813 | 0.812 | 0.822 |
| Group C | Random Forest | 0.895 | 0.898 | 0.902 | 0.899 | 0.900 | 0.903 | 0.897 | 0.902 | 0.901 |
| Group C | kNN | 0.800 | 0.802 | 0.808 | 0.804 | 0.808 | 0.811 | 0.811 | 0.806 | 0.808 |
| Group C | Bagging | 0.870 | 0.876 | 0.881 | 0.874 | 0.876 | 0.874 | 0.872 | 0.880 | 0.878 |
| Group C | BayesNet | 0.808 | 0.813 | 0.812 | 0.812 | 0.810 | 0.810 | 0.810 | 0.809 | 0.809 |
| Group C | J48 | 0.848 | 0.847 | 0.849 | 0.842 | 0.841 | 0.852 | 0.840 | 0.843 | 0.842 |
| Group C | Decision Table | 0.818 | 0.816 | 0.813 | 0.810 | 0.812 | 0.804 | 0.811 | 0.811 | 0.813 |
| Group D | Random Forest | 0.894 | 0.896 | 0.895 | 0.897 | 0.896 | 0.896 | 0.897 | 0.897 | 0.897 |
| Group D | kNN | 0.873 | 0.874 | 0.873 | 0.876 | 0.877 | 0.873 | 0.874 | 0.875 | 0.873 |
| Group D | Bagging | 0.872 | 0.872 | 0.870 | 0.872 | 0.873 | 0.875 | 0.870 | 0.877 | 0.873 |
| Group D | BayesNet | 0.807 | 0.810 | 0.810 | 0.810 | 0.808 | 0.807 | 0.806 | 0.808 | 0.807 |
| Group D | J48 | 0.834 | 0.841 | 0.838 | 0.841 | 0.838 | 0.837 | 0.832 | 0.837 | 0.834 |
| Group D | Decision Table | 0.822 | 0.819 | 0.818 | 0.815 | 0.815 | 0.820 | 0.813 | 0.812 | 0.822 |
Figure 4Top seven of the ten classification algorithms used in this study, ranked by area under the ROC curve and accuracy. Group details are shown in Table 5.
Figure 5The Random Forest algorithm outperforms all other algorithms across all groups (AUC = 0.903 when fold = 60).
Evaluation measures of binary classifiers: assessment of a classifier’s prediction performance where k-fold = 60 (Train 90% : Test 10%).
| Classification Model | PCC | RMSE | Precision | Sensitivity or Recall | (1 − Specificity) | Area under the ROC Curve | Precision-Recall (PRC Area) |
|---|---|---|---|---|---|---|---|
| Random Forest | 83.559 | 0.352 | 0.815 | 0.843 | 0.829 | 0.903 | 0.878 |
| kNN | 77.015 | 0.456 | 0.748 | 0.774 | 0.767 | 0.800 | 0.741 |
| Bagging | 80.367 | 0.375 | 0.783 | 0.809 | 0.799 | 0.878 | 0.845 |
| SVM | 52.514 | 0.689 | 0.496 | 0.319 | 0.709 | 0.514 | 0.480 |
| Naive Bayes | 51.317 | 0.563 | 0.313 | 0.025 | 0.950 | 0.521 | 0.472 |
| Bayes Net | 74.701 | 0.419 | 0.746 | 0.704 | 0.785 | 0.813 | 0.782 |
| J48 | 78.611 | 0.399 | 0.752 | 0.816 | 0.759 | 0.850 | 0.803 |
| Jrip | 75.020 | 0.428 | 0.745 | 0.716 | 0.781 | 0.785 | 0.772 |
| Decision Table | 75.579 | 0.403 | 0.731 | 0.764 | 0.749 | 0.839 | 0.792 |
| Logistic Regression | 52.993 | 0.498 | 0.505 | 0.275 | 0.758 | 0.545 | 0.486 |
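The evaluation measures in the table above (PCC, RMSE, precision, sensitivity/recall, 1 − specificity, AUC) can each be computed from a set of true labels and predicted probabilities. A worked sketch on a toy example, using scikit-learn metric functions (the toy labels and probabilities below are invented, not study data):

```python
# Compute the table's binary-classifier evaluation measures.
import math
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, roc_auc_score)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.8, 0.4, 0.1, 0.35, 0.7, 0.6]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

pcc = 100 * accuracy_score(y_true, y_pred)   # percent correctly classified
rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true))
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)        # sensitivity
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
one_minus_specificity = fp / (fp + tn)
auc = roc_auc_score(y_true, y_prob)

print(pcc, round(rmse, 3), precision, recall, one_minus_specificity, auc)
```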
Figure 6Correlations among attributes for RF-EMF on human and animal cells (maroon indicating strong correlation and blue signaling no correlation). Features that were selected for this analysis were frequency, SAR, exposure time, and SAR×exposure time (impact of accumulated SAR within the exposure period).
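The attribute-correlation analysis behind Figure 6 amounts to a Pearson correlation matrix over frequency, SAR, exposure time, and SAR×time. A sketch with synthetic values standing in for the study data (ranges taken from Table 3; by construction the derived SAR×time column correlates with SAR):

```python
# Pearson correlation matrix over the four numeric features.
import numpy as np

rng = np.random.default_rng(0)
n = 200
frequency = rng.uniform(800, 2450, n)    # MHz
sar = rng.uniform(0.001, 50, n)          # W/kg
time_min = rng.uniform(2, 7200, n)       # minutes
sar_x_time = sar * time_min              # derived cumulative-effect feature

X = np.column_stack([frequency, sar, time_min, sar_x_time])
corr = np.corrcoef(X, rowvar=False)      # 4x4 symmetric matrix
print(np.round(corr, 2))
```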
Figure 7Influence of computer processor speed (CPU) and memory capacity (random-access memory (RAM) size) on prediction accuracy and computation time for Study 1, Study 2, and Study 3 (this study) shown in Table 1.