Abstract
BACKGROUND: Lung cancer is one of the deadliest cancers in the world. Hundreds of studies are published annually on lung cancer treatment, diagnosis and early prediction. The current research focuses on the early prediction of lung cancer via analysis of the most dangerous risk factors.
Keywords: Cancer prevention; Computer science; Lung cancer; Lung symptoms; Prediction tool; Risk factors
Year: 2020 PMID: 32140577 PMCID: PMC7044659 DOI: 10.1016/j.heliyon.2020.e03402
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440
Figure 1. Analysis of Risk Factors of the Studied Database (colors represent risk levels: blue for high, cyan for medium and red for low). The numbers above the histogram bars give the case counts out of the 1000 patients in the database.
Figure 2. Analysis of Symptoms of the Studied Database (colors represent risk levels: blue for high, cyan for medium and red for low).
Figure 3. Statistics of symptoms and factors according to local medical questionnaires: (A) factors, (B) symptoms.
Examples of our LCPT Test Inputs and the Corresponding Results.
| A | B | C | D | E | F | G | H | I | J | K |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. | 0 | 0 | 0 | 5 | 0 | No | 0 | 0 | 30 | 10.1143% |
| 2. | 0 | 0 | 0 | 5 | 0 | No | 0 | 0 | 28 | 10.1143% |
| 3. | 0 | 0 | 0 | 5 | 0 | No | 0 | 0 | 63 | 15.1143% |
| 4. | 20 | 20 | 0 | 7 | 0 | No | 1 | 0 | 40 | 36.4325% |
| 5. | 20 | 20 | 0 | 7 | 7 | No | 1 | 0 | 60 | 50.5563% |
| 6. | 40 | 40 | 0 | 7 | 4 | Yes | 1 | 0 | 60 | 66.7183% |
| 7. | 15 | 20 | 0 | 7 | 0 | No | 1 | 0 | 45 | 30.1998% |
| 8. | 40 | 40 | 0 | 7 | 0 | Yes | 1 | 0 | 65 | 58.5606% |
| 9. | 40 | 40 | 0 | 7 | 0 | Yes | 1 | 1 | 70 | 63.5606% |
| 10. | 0 | 0 | 10 | 7 | 0 | Yes | 1 | 1 | 61 | 52.3547% |
| 11. | 10 | 35 | 0 | 5 | 0 | No | 0 | 0 | 35 | 22.4141% |
| 12. | 5 | 20 | 0 | 7 | 7 | No | 1 | 0 | 20 | 30.2949% |
| 13. | 0 | 0 | 0 | 7 | 0 | No | 0 | 1 | 45 | 19% |
| 14. | 50 | 40 | 0 | 7 | 0 | No | 0 | 0 | 67 | 40.7183% |
| 15. | 0 | 0 | 5 | 7 | 8 | No | 0 | 0 | 62 | 41.604% |
| 16. | 0 | 0 | 5 | 7 | 8 | Yes | 1 | 1 | 23 | 61.604% |
| 17. | 0 | 0 | 10 | 7 | 8 | Yes | 0 | 1 | 50 | 48.3547% |
| 18. | 0 | 0 | 0 | 5 | 0 | No | 1 | 0 | 61 | 20% |
| 19. | 35 | 20 | 0 | 5 | 5 | No | 1 | 1 | 55 | 45.8117% |
| 20. | 50 | 40 | 0 | 7 | 0 | No | 1 | 1 | 66 | 50.5469% |
| 21. | 0 | 0 | 0 | 2 | 0 | No | 1 | 0 | 25 | 4% |
| 22. | 10 | 40 | 0 | 0 | 0 | No | 1 | 0 | 33 | 12.7605% |
| 23. | 5 | 24 | 0 | 0 | 4 | No | 0 | 0 | 23 | 17.7024% |
| 24. | 17 | 20 | 0 | 3 | 0 | No | 0 | 0 | 37 | 19.5111% |
| 25. | 0 | 0 | 2 | 0 | 0 | No | 0 | 0 | 57 | 6.37287% |
| 26. | 0 | 0 | 10 | 2 | 0 | No | 0 | 0 | 41 | 12.3547% |
| 27. | 2 | 8 | 0 | 2 | 0 | Yes | 0 | 0 | 18 | 26.7227% |
| 28. | 0 | 0 | 0 | 0 | 0 | Yes | 0 | 1 | 62 | 10% |
| 29. | 0 | 0 | 0 | 0 | 5 | No | 1 | 0 | 27 | 15% |
| 30. | 5 | 12 | 0 | 0 | 0 | No | 1 | 0 | 46 | 13.7525% |
A: Individual's number, B: Period of smoking (years), C: Number of cigarettes (per day), D: Period of passive smoking (hours per day), E: Pollution degree (1–10), F: Exposure to radiation (hours per day), G: Genetic factor, H: Alcohol consumption (1 for yes, 0 for no), I: Lung inflammation (1 for very frequent, 0 for little), J: Age, K: LCPT output.
The number of occurrences through trees (NOTT) and degree of importance (DOI) of each risk factor and symptom.
| Risk Factor / Symptom | Number of Occurrences Through Trees (NOTT) | Degree of Importance (DOI) |
|---|---|---|
| Smoking | 5 | 0.76 |
| Age | 6 | 0.72 |
| Passive smoking | 8 | 0.67 |
| Balanced diet | 3 | 0.62 |
| Occupational hazards | 2 | 0.61 |
| Air pollution | 5 | 0.61 |
| Genetic risk | 12 | 0.55 |
| Alcohol consumption | 10 | 0.51 |
| Chronic lung disease | 5 | 0.33 |
| Gender | 0 | 0 |
| Obesity | 0 | 0 |
| Dry cough | 5 | 0.54 |
| Wheezing | 4 | 0.49 |
| Snoring | 3 | 0.45 |
| Coughing Blood | 3 | 0.44 |
| Fatigue | 7 | 0.41 |
| Swallowing difficulty | 5 | 0.39 |
| Shortness of breath | 6 | 0.38 |
| Clubbing of finger nails | 7 | 0.37 |
| Chest pain | 5 | 0.34 |
| Weight loss | 1 | 0.17 |
Figure 4. Final risk degree: (A) for factors, (B) for symptoms.
Figure 5. Random tree generated by the RF algorithm according to the analysis of the dataset.
The Sensitivity, Specificity and Accuracy of LCPT results.
| | TP | TN | FP | FN | Specificity | Sensitivity | Accuracy |
|---|---|---|---|---|---|---|---|
| Expert Opinion | 8 | 18 | 3 | 1 | 85.71% | 88.88% | 86.66% |
| Our LCPT | 9 | 19 | 2 | 0 | 90.47% | 100% | 93.33% |
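The specificity, sensitivity and accuracy columns can be recomputed from the TP, TN, FP and FN counts using the standard confusion-matrix definitions. The sketch below is only an illustration of that arithmetic; the function name `confusion_metrics` is hypothetical and does not come from the paper.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return (specificity, sensitivity, accuracy) as fractions in [0, 1]."""
    specificity = tn / (tn + fp)                 # true-negative rate
    sensitivity = tp / (tp + fn)                 # true-positive rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all cases classified correctly
    return specificity, sensitivity, accuracy


# Counts copied from the table above, in the order (TP, TN, FP, FN).
rows = {
    "Expert Opinion": (8, 18, 3, 1),
    "Our LCPT": (9, 19, 2, 0),
}

for name, counts in rows.items():
    spec, sens, acc = confusion_metrics(*counts)
    print(f"{name}: specificity={spec:.2%}, sensitivity={sens:.2%}, accuracy={acc:.2%}")
```

The printed values are rounded to two decimals, so they can differ from the table in the last digit (for example 88.89% against 88.88%), which suggests the table values were truncated rather than rounded.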
Average DOI for each group of factors.
| | Entire Factors | Dropped | Selected |
|---|---|---|---|
| Number of Factors | 12 (100%) | 4 (33.33%) | 8 (66.66%) |
| Average DOI | 0.448 | 0.155 | 0.595 |
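As a consistency check on the table above, the average DOI over all factors should equal the count-weighted mean of the dropped and selected group averages. The following sketch assumes only the numbers shown in the table; the variable names are illustrative.

```python
# (number of factors, average DOI) per group, copied from the table above.
groups = {
    "Dropped": (4, 0.155),
    "Selected": (8, 0.595),
}

total_factors = sum(count for count, _ in groups.values())

# Count-weighted mean of the group averages.
overall_doi = sum(count * doi for count, doi in groups.values()) / total_factors

print(f"factors={total_factors}, overall average DOI={overall_doi:.3f}")
```

The result agrees with the 0.448 reported for the entire set of 12 factors.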