Sayed Asaduzzaman1,2, Md Raihan Ahmed3, Hasin Rehana4,5, Setu Chakraborty6, Md Shariful Islam6, Touhid Bhuiyan3. 1. Department of Computer Science and Engineering, Rangamati Science and Technology University, Vedvedi, Rangamati, Bangladesh. s.asaduzzaman@rmstu.edu.bd. 2. Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Tangail, 1902, Bangladesh. s.asaduzzaman@rmstu.edu.bd. 3. Department of Software Engineering, Daffodil International University, Dhanmondi, Dhaka, Bangladesh. 4. Department of Computer Science and Engineering, Daffodil International University, Dhanmondi, Dhaka, Bangladesh. 5. Department of Computer Science and Engineering, Rajshahi University Engineering and Technology, Rajshahi, Bangladesh. 6. Department of Biotechnology and Genetic Engineering, Mawlana Bhashani Science and Technology University, Tangail, Bangladesh.
Abstract
BACKGROUND: In this research, an intelligent system was developed using machine learning and data mining approaches to predict the risk level of cervical and ovarian cancer in association with stress. RESULTS: To identify the contributing factors and subfactors, several machine learning models (Logistic Regression, Random Forest, AdaBoost, Naïve Bayes, Neural Network, kNN, CN2 Rule Inducer, Decision Tree, and Quadratic Classifier) were compared using standard metrics, e.g., F1, AUC, and classification accuracy (CA). For added certainty, information gain, gain ratio, and the Gini index were computed for both cervical and ovarian cancer. Attributes were ranked using different feature selection evaluators, and the analysis then focused on the most significant factors. Children, age at first intercourse, age of husband, Pap test, and age are the most significant factors for cervical cancer. On the other hand, genital area infection, pregnancy problems, use of drugs, abortion, and the number of children are important factors for ovarian cancer. CONCLUSION: The resulting factors were merged, categorized, and weighted according to their significance level. The categorized factors were indexed using a ranker algorithm, which assigns each of them a weight. An algorithm was then formulated that can be used to predict the risk level of cervical and ovarian cancer in relation to women's mental health. The research is expected to have a strong impact in low-income countries like Bangladesh, as most women in low-income nations are unaware of this risk. As these two can be described as among the most sensitive cancers for women, developing an application from the algorithm should also help to reduce women's mental stress. More data and parameters will be added in future research along these lines.
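The attribute-ranking step described in the abstract (information gain, gain ratio, Gini index) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy attribute names (`pap_test`, `noise`), and the four made-up records are assumptions chosen only to show how the three scores are computed for categorical attributes.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gini(values):
    """Gini impurity of a list of categorical values."""
    n = len(values)
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())


def score_feature(feature, labels):
    """Return information gain, gain ratio, and Gini decrease for one attribute."""
    n = len(labels)
    # Group the class labels by the value the attribute takes.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)

    weighted_entropy = sum(len(g) / n * entropy(g) for g in groups.values())
    weighted_gini = sum(len(g) / n * gini(g) for g in groups.values())

    info_gain = entropy(labels) - weighted_entropy
    split_info = entropy(feature)  # entropy of the attribute's own distribution
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    return {
        "info_gain": info_gain,
        "gain_ratio": gain_ratio,
        "gini_decrease": gini(labels) - weighted_gini,
    }


def rank_features(table, labels):
    """Rank attributes by information gain, highest first."""
    scores = {name: score_feature(col, labels) for name, col in table.items()}
    return sorted(scores.items(), key=lambda kv: kv[1]["info_gain"], reverse=True)


# Hypothetical toy data (not from the study): four records, two attributes.
labels = ["high", "high", "low", "low"]
table = {
    "noise": ["a", "b", "a", "b"],         # unrelated to the label
    "pap_test": ["no", "no", "yes", "yes"],  # separates the classes perfectly
}

ranking = rank_features(table, labels)
for name, s in ranking:
    print(name, {k: round(v, 3) for k, v in s.items()})
```

In this toy case the attribute that separates the two risk classes perfectly is ranked first and the unrelated attribute receives a score of zero on all three measures. Continuous attributes such as age would need to be discretized before applying these formulas.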
Keywords:
Data mining; Gynecological cancer; Machine learning; Significant risk factors; Smart prediction tool; Women psychology