Ricardo P Pinheiro, Sidney M L Lima, Danilo M Souza, Sthéfano H M T Silva, Petrônio G Lopes, Rafael D T de Lima, Jemerson R de Oliveira, Thyago de A Monteiro, Sérgio M M Fernandes, Edison de Q Albuquerque, Washington W A da Silva, Wellington P Dos Santos.
Abstract
Java vulnerabilities account for 91% of all exploits observed on the World Wide Web. This work aims to create antivirus software that uses machine learning and artificial intelligence and specializes in Java malware detection. In the proposed methodology, the suspect JAR sample is executed so that it intentionally infects a Windows OS monitored in a controlled environment. In all, our antivirus monitors, and treats statistically, 6824 actions that the suspect JAR file can perform when executed. Our antivirus achieved an average accuracy of 91.58% in distinguishing benign from malware JAR files. Different initial conditions, learning functions and architectures of our antivirus are investigated. Intelligent antiviruses can compensate for the limitations of commercial antiviruses. Instead of relying on blacklist-based models, our antivirus detects JAR malware preventively rather than reactively, unlike the modus operandi of Oracle's Java and traditional antiviruses.
Year: 2022 PMID: 35121776 PMCID: PMC8817023 DOI: 10.1038/s41598-022-05921-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Results of commercial antiviruses. Expanded results of 86 worldwide commercial antiviruses are in the authorial repository[11].
| Antivirus | Detection (%) | False negative (%) | Omission (%) |
|---|---|---|---|
| McAfee-GW-Edition | 99.10 | 0.90 | 0.00 |
| NANO-Antivirus | 97.70 | 2.20 | 0.10 |
| AegisLab | 97.60 | 2.10 | 0.30 |
| Kaspersky | 96.80 | 2.90 | 0.30 |
| ZoneAlarm | 96.70 | 2.90 | 0.40 |
| Avast | 96.60 | 3.30 | 0.10 |
| AVG | 96.60 | 3.30 | 0.10 |
| ESET-NOD32 | 95.90 | 4.10 | 0.00 |
| McAfee | 95.60 | 4.40 | 0.00 |
| Avira | 94.80 | 3.30 | 1.90 |
| Cylance | 0.20 | 0.00 | 99.80 |
| WhiteArmor | 0.20 | 91.30 | 8.50 |
| Alibaba | 0.20 | 98.50 | 1.30 |
| ALYac | 0.10 | 95.80 | 4.10 |
| Bkav | 0.10 | 97.60 | 2.30 |
| Paloalto | 0.00 | 0.00 | 100.00 |
| SentinelOne | 0.00 | 0.00 | 100.00 |
| Endgame | 0.00 | 0.00 | 100.00 |
| CrowdStrike | 0.00 | 0.00 | 100.00 |
| Agnitum | 0.00 | 0.00 | 100.00 |
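The three columns of the table above sum to 100% for each antivirus, which suggests that every submitted malware sample ends in exactly one of three outcomes. A minimal sketch of that bookkeeping follows, assuming each sample is known malware and each verdict is recorded as `'malware'`, `'benign'`, or `None` (no verdict returned). The function name, the verdict labels, and the batch in the usage lines are hypothetical and are not taken from the authors' repository.

```python
from collections import Counter


def verdict_rates(verdicts):
    """Return (detection, false_negative, omission) as percentages.

    Assumption: every sample is malware, so a 'benign' verdict is a
    false negative and a missing verdict (None) is an omission.
    """
    counts = Counter(verdicts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no verdicts supplied")

    def pct(n):
        return round(100.0 * n / total, 2)

    return pct(counts["malware"]), pct(counts["benign"]), pct(counts[None])


# Hypothetical batch of 1000 malware samples scanned by one antivirus.
batch = ["malware"] * 968 + ["benign"] * 29 + [None] * 3
detection, false_negative, omission = verdict_rates(batch)
print(detection, false_negative, omission)
```

Keeping omissions separate from false negatives matters when reading the table: an engine with 100% omission never issued a verdict on these files, which is different from wrongly calling them benign.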
Results of submitting three malware samples to VirusTotal. Expanded results of 86 worldwide commercial antiviruses are in the authorial repository[11].
| Antivirus | VirusShare | VirusShare | VirusShare |
|---|---|---|---|
| McAfee-GW-Edition | Artemis!Trojan | Artemis | PWS-Zbot.gen.jr |
| NANO-Antivirus | Trojan.Android.SMSSend.numyx | Trojan.Android.Opfake.oefcg | Trojan.Java.CVE20113544.cspflc |
| AegisLab | Troj.Sms.Androidos!c | SUSPICIOUS | Troj.W32.Generic!c |
| Kaspersky | HEUR:Trojan-SMS.AndroidOS.Fakelogo.a | HEUR:Trojan-SMS.AndroidOS.Fakelogo.a | HEUR:Trojan.Win32.Generic |
| ZoneAlarm | HEUR:Trojan-SMS.AndroidOS.Fakelogo.a | HEUR:Trojan-SMS.AndroidOS.Fakelogo.a | HEUR:Trojan.Win32.Generic |
| Avast | Android:RuFraud-I | Android:RuFraud-I | Java:CVE-2011-3544-BD |
| AVG | Android:RuFraud-I | Android:RuFraud-I | Java:CVE-2011-3544-BD |
| ESET-NOD32 | Android/TrojanSMS.Agent.K | Android/TrojanSMS.Agent.K | a variant of Java/Exploit.CVE-2011-3544.DF |
| McAfee | Artemis!9EF6966B98A5 | Artemis!BEE5A7C75B6A | RDN/Generic |
| Avira | ANDROID/SmsAgent.CQ.Gen | ANDROID/SmsAgent.CQ.Gen | EXP/CVE-2011-3544 |
| Sophos | Andr/Jifake-B | Andr/Opfake-A | Mal/Generic-S |
| Symantec | Android.Fakemini | Android.Fakemini | Trojan.MalJava |
| IkarusV | Trojan.AndroidOS.FakeInst | Trojan.AndroidOS.FakeInst | Java.CVE |
| MAX | Malware | malware | Malware |
| TrendMicro-HouseCall | Suspicious_GEN.F47V0322 | AndroidOS_OPFAKE.A | Suspicious_GEN.F47V0322 |
| Emsisoft | Android.Trojan.FakeInst.CB | Android.Trojan.FakeInst.CB | Gen:Variant.Barys.841 |
| GData | Android.Trojan.FakeInst.CB | Android.Trojan.FakeInst.CB | Gen:Variant.Barys.841 |
| BitDefender | Android.Trojan.FakeInst.CB | Android.Trojan.FakeInst.CB | Gen:Variant.Barys.841 |
| Tencent | Trojan.Android.FakeLogo.aa | Trojan.Android.FakeLogo.aa | Win32.Trojan.Jorik.Hvje |
| Arcabit | Android.Trojan.FakeInst.CB | Android.Trojan.FakeInst.CB | Trojan.Barys.841 |
Example of a statistical repository based on malware detection.
| Features | ||
|---|---|---|
| Check Wi-fi | Access the | Access Image |
| 1 | 0 | 1 |
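The row above reads as a binary feature vector: each monitored action is marked 1 if the suspect file performed it and 0 otherwise. A minimal sketch of that encoding follows. The names `Check Wi-fi` and `Access Image` come from the table, while the middle feature name is a placeholder because it is truncated in the source. The function name is hypothetical.

```python
# Feature order fixes the position of each action in the vector.
# The second name is a placeholder; the real one is truncated in the source.
FEATURES = ["Check Wi-fi", "placeholder action", "Access Image"]


def encode_actions(observed_actions, features=FEATURES):
    """Map the set of observed actions to a 0/1 vector in feature order."""
    seen = set(observed_actions)
    return [1 if name in seen else 0 for name in features]


# A sample that checked Wi-fi and accessed an image, but nothing else.
vector = encode_actions(["Check Wi-fi", "Access Image"])
print(vector)
```

With the 6824 monitored actions mentioned in the abstract, the same function would produce a vector of length 6824 per sample, given the full ordered list of action names.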
Figure 1. Diagram of the proposed methodology.
Figure 2. (a) Successful performance of the kernel compatible with dataset. (b) Inaccurate classification of the Linear kernel in a non-linearly separable distribution. (c,d) Successful performances by Dilation and Erosion kernels.
Results of ELM networks. The parameters vary according to the set . Only the best and worst cases are described.
| Kernel | (C, | Train rate (%) | Test rate (%) | Train time (sec.) | Test time (sec.) |
|---|---|---|---|---|---|
| Wavelets | | | | 3.11 ± 0.07 | 0.76 ± 0.03 |
| | | 67.40 ± 1.97 | 47.91 ± 3.76 | 3.12 ± 0.08 | 0.78 ± 0.03 |
Results of ELM networks. The number of neurons in the hidden layer is either 100 or 500.
| Kernel | Neurons | Train rate (%) | Test rate (%) | Train time (sec.) | Test time (sec.) |
|---|---|---|---|---|---|
| Hard limit | 100 | 50.03 ± 0.00 | 49.75 ± 0.00 | 0.48 ± 0.01 | 0.03 ± 0.01 |
| | 500 | 50.03 ± 0.00 | 49.75 ± 0.00 | 2.51 ± 0.03 | 0.13 ± 0.02 |
| Tribas | 100 | 50.00 ± 0.03 | 50.00 ± 0.26 | 0.49 ± 0.02 | 0.02 ± 0.01 |
| | 500 | 50.00 ± 0.03 | 50.00 ± 0.26 | 1.76 ± 0.05 | 0.14 ± 0.01 |
| Fuzzy-Dilation | 500 | 95.71 ± 0.28 | 87.28 ± 2.40 | 2.01 ± 0.05 | 0.14 ± 0.02 |
| | 100 | 84.37 ± 0.29 | 81.61 ± 2.32 | 0.55 ± 0.02 | 0.03 ± 0.01 |
| Fuzzy-Erosion | 500 | 95.70 ± 0.33 | 87.72 ± 2.55 | 2.16 ± 0.07 | 0.16 ± 0.01 |
| | 100 | 84.67 ± 0.37 | 82.16 ± 1.93 | 0.65 ± 0.01 | 0.03 ± 0.01 |
| Dilation | 500 | **97.63 ± 0.13** | **91.58 ± 1.77** | 52.36 ± 0.57 | 5.68 ± 0.05 |
| | 100 | 81.20 ± 0.35 | 78.86 ± 1.18 | 7.34 ± 0.18 | 0.77 ± 0.02 |
| Erosion | 500 | 78.38 ± 0.24 | 70.58 ± 2.81 | 53.23 ± 2.41 | 5.78 ± 0.23 |
| | 100 | 54.99 ± 0.11 | 53.05 ± 1.95 | 8.17 ± 0.15 | 0.86 ± 0.03 |
Significant values in bold.
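The tables above compare ELM (extreme learning machine) networks with different kernels and hidden-layer sizes. For readers unfamiliar with the technique, the sketch below shows the generic ELM recipe: hidden-layer weights are drawn at random and never trained, and only the output weights are fitted by regularized least squares. It is a minimal illustration under stated assumptions, not the authors' implementation. It uses a sigmoid activation rather than the Dilation or Erosion kernels, a small ridge term, labels of +1 (malware) and -1 (benign), and hypothetical function names.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hidden_layer(x, weights, biases):
    """Sigmoid response of every hidden neuron to one input vector."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(ws, x)) + b)))
            for ws, b in zip(weights, biases)]


def train_elm(samples, labels, n_hidden, ridge=1e-3, seed=0):
    """Random hidden layer; output weights from ridge least squares."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)]
               for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = [hidden_layer(x, weights, biases) for x in samples]
    # Normal equations: (H^T H + ridge * I) beta = H^T y
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    hty = [sum(row[i] * y for row, y in zip(h, labels))
           for i in range(n_hidden)]
    beta = solve_linear(hth, hty)
    return weights, biases, beta


def predict_elm(model, x):
    """Return +1 (malware) or -1 (benign) for one feature vector."""
    weights, biases, beta = model
    score = sum(b * v for b, v in zip(beta, hidden_layer(x, weights, biases)))
    return 1 if score >= 0 else -1
```

Because training reduces to one linear solve, the cost grows mainly with the number of hidden neurons, which is consistent with the longer train times reported for 500 neurons than for 100.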
Figure 3. Boxplots referring to the accuracies of the authorial antivirus and the state-of-the-art.
Figure 4. Boxplots regarding the processing times of the authorial antivirus and the state-of-the-art.
Comparison between the authorial antivirus and the state-of-the-art.
| Technique | Train rate (%) | Test rate (%) | Train time (sec.) | Test time (sec.) |
|---|---|---|---|---|
| Authorial Antivirus | 97.63 ± 0.13 | 91.58 ± 1.77 | 52.36 ± 0.57 | 5.68 ± 0.05 |
| Antivirus made by Lima et al. (2021), worst c. | 50.26 ± 0.89 | 50.26 ± 0.71 | 9.67 ± 1.76 | 0.44 ± 0.14 |
| Antivirus made by Lima et al. (2021), best c. | 96.71 ± 2.10 | 95.67 ± 1.85 | 580.07 ± 228.44 | 0.46 ± 0.19 |
| Antivirus made by Su et al. | 78.37 ± 0.47 | 78.31 ± 3.37 | 3337.15 ± 19.08 | 3.53 ± 0.14 |
| Antivirus made by Vinayakumar et al. | | | 149968.09 ± 33112.72 | 71.64 ± 23.13 |
| Antivirus made by Maniath et al. | 61.80 ± 20.01 | 60.01 ± 18.61 | 5483.06 ± 161.34 | 0.27 ± 0.01 |
| Deep Learning made by Wozniak et al. | 50.22 ± 0.19 | 48.05 ± 1.65 | 24102.06 ± 629.56 | 1.14 ± 0.02 |
| Antivirus made by Hou et al. | 50.00 ± 0.30 | 50.01 ± 2.63 | 153916.96 ± 725.12 | 0.08 ± 0.01 |
| Antivirus made by Hardy et al. | 99.57 ± 1.17 | 96.49 ± 1.89 | 6854.74 ± 300.59 | 0.14 ± 0.01 |
| Antivirus made by Kalash et al. | 52.93 ± 1.06 | 53.40 ± 2.84 | 897.77 ± 8.32 | 366.45 ± 4.99 |
| Deep Learning made by Santos et al. | 50.00 ± 0.00 | 50.00 ± 0.00 | 695.26 ± 29.67 | 7.90 ± 2.56 |
Significant values in bold.
Confusion matrix of the authorial antivirus and the state-of-the-art (%).
| Technique | Class | Train | Test |
|---|---|---|---|
| Authorial Antivirus | M. | 0.94 ± 0.25 | 5.49 ± 1.83 |
| | B. | 3.72 ± 0.16 | 10.79 ± 3.25 |
| Antivirus made by Lima et al. (2021), worst conf. | M. | 57.24 ± 49.24 | 57.44 ± 49.07 |
| | B. | 42.60 ± 49.51 | 42.64 ± 49.45 |
| Antivirus made by Lima et al. (2021), best conf. | M. | 5.73 ± 2.62 | 6.58 ± 1.99 |
| | B. | 0.84 ± 1.65 | 2.06 ± 1.77 |
| Antivirus made by Su et al. | M. | 25.14 ± 0.65 | 25.14 ± 4.09 |
| | B. | 16.86 ± 1.89 | 16.86 ± 4.49 |
| Antivirus made by Vinayakumar et al. | M. | 2.37 ± 7.39 | 3.40 ± 6.96 |
| | B. | 0.46 ± 1.33 | 3.12 ± 2.11 |
| Antivirus made by Maniath et al. | M. | 4.98 ± 15.67 | 11.46 ± 15.99 |
| | B. | 34.14 ± 23.38 | 35.52 ± 22.88 |
| Deep Learning made by Wozniak et al. | M. | 30.00 ± 48.30 | 30.00 ± 48.30 |
| | B. | 70.00 ± 48.30 | 70.00 ± 48.30 |
| Antivirus made by Hou et al. | M. | 0.00 ± 0.00 | 0.00 ± 0.00 |
| | B. | 0.00 ± 0.00 | 0.00 ± 0.00 |
| Antivirus made by Hardy et al. | M. | 0.08 ± 0.22 | 1.99 ± 2.03 |
| | B. | 0.75 ± 2.02 | 4.90 ± 2.30 |
| Antivirus made by Kalash et al. | M. | 43.82 ± 2.25 | 42.61 ± 5.97 |
| | B. | 43.08 ± 15.14 | 42.77 ± 15.12 |
| Deep Learning made by Santos et al. | M. | 0.00 ± 0.00 | 0.00 ± 0.00 |
| | B. | 100.00 ± 0.00 | 100.00 ± 0.00 |
Student's t-test and Wilcoxon hypothesis tests comparing the authorial antivirus with the state-of-the-art.
| Comparison | Student's t-test (parametric): hypothesis | Student's t-test: p-value | Wilcoxon (non-parametric): hypothesis | Wilcoxon: p-value |
|---|---|---|---|---|
| Authorial Antivirus vs. Antivirus made by Lima et al. (2021), worst conf. | 1 | 4.2134e−41 | 1 | 1.30487e−11 |
| Authorial Antivirus vs. Antivirus made by Lima et al. (2021), best conf. | 1 | 3.81625e−09 | 1 | 2.86398e−09 |
| Authorial Antivirus vs. Antivirus made by Su et al. (2018) | 1 | 6.621e−19 | 1 | 2.5046e−11 |
| Authorial Antivirus vs. Antivirus made by Vinayakumar et al. (2019) | 1 | 3.803e−06 | 1 | 8.83703e−08 |
| Authorial Antivirus vs. Antivirus made by Maniath et al. (2017) | 1 | 7.16622e−11 | 1 | 1.04946e−05 |
| Authorial Antivirus vs. Deep Learning made by Wozniak et al. (2015) | 1 | 9.26937e−41 | 1 | 2.14306e−11 |
| Authorial Antivirus vs. Antivirus made by Hou et al. (2016) | 1 | 2.98566e−36 | 1 | 2.37833e−11 |
| Authorial Antivirus vs. Antivirus made by Hardy et al. (2016) | 1 | 1.33009e−11 | 1 | 5.13671e−10 |
| Authorial Antivirus vs. Antivirus made by Kalash et al. (2018) | 1 | 1.68447e−35 | 1 | 2.5046e−11 |
| Authorial Antivirus vs. Deep Learning made by Santos et al. (2019) | 1 | 5.42386e−42 | 1 | 1.02645e−12 |
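To make the hypothesis column concrete, the sketch below implements a two-sided Wilcoxon rank-sum test with a normal approximation and returns a pair (hypothesis, p-value), where 1 means the null hypothesis of equal distributions is rejected. Several points are assumptions rather than facts from the source: that the unpaired rank-sum variant was used, that the significance level is 0.05, and that no tie correction is applied to the variance. The function names and the accuracy lists in the usage lines are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a, b, alpha=0.05):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (h, p): h is 1 if the null hypothesis is rejected at level alpha.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2.0          # expected rank sum under H0
    sd = (n1 * n2 * (n1 + n2 + 1) / 12.0) ** 0.5
    z = (w - mean) / sd
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return (1 if p < alpha else 0), p


# Hypothetical test accuracies (%) from ten runs of two classifiers.
ours = [91.0, 92.5, 90.1, 93.2, 91.8, 92.0, 90.7, 91.3, 92.9, 91.6]
other = [50.3, 49.8, 51.2, 50.0, 48.9, 50.7, 49.5, 50.9, 51.0, 49.2]
h, p = rank_sum_test(ours, other)
print(h, p)
```

A hypothesis value of 1 in every row of the table therefore indicates that, under both tests, each competing technique differs significantly from the authorial antivirus.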