Jia Rong1,2, Sandra Michalska3, Sudha Subramani1, Jiahua Du1, Hua Wang1. 1. Institute for Sustainable Industries & Liveable Cities, Victoria University, Ballarat Road, Melbourne, 3011, Australia. 2. Faculty of Information Technology, Monash University, Wellington Road, Melbourne, 3800, Australia. 3. Institute for Sustainable Industries & Liveable Cities, Victoria University, Ballarat Road, Melbourne, 3011, Australia. sandra.michalska@live.vu.edu.au.
Abstract
BACKGROUND: This paper introduces a deep learning-based approach for real-time detection of, and insight generation about, pollen allergy, one of the most prevalent chronic conditions in Australia. A popular social media platform, Twitter, is used for data collection as a cost-effective and unobtrusive alternative for public health monitoring that complements traditional survey-based approaches. METHODS: Data were extracted from Twitter using pre-defined keywords ('hayfever' OR 'hay fever') over a period of 6 months covering the high pollen season in Australia. The following deep learning architectures were used in the experiments: CNN, RNN, LSTM and GRU. Both default (GloVe) and domain-specific (HF) word embeddings were used to train the classifiers. Standard evaluation metrics (Accuracy, Precision and Recall) were calculated to validate the results. Finally, a visual correlation with weather variables was performed. RESULTS: The neural network-based approach correctly identified implicit mentions of symptoms and treatments, including previously unseen ones (accuracy up to 87.9% for GRU with 300-dimensional GloVe embeddings). CONCLUSIONS: The system addresses the shortcomings of conventional machine learning techniques that rely on manual feature engineering, which prove limiting when exposed to a wide range of non-standard expressions relating to medical concepts. The case study demonstrates the application of a 'black-box' approach to a real-world problem and exposes its internal workings, a step towards more transparent, interpretable and reproducible decision-making in the health informatics domain.
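The keyword filter and the evaluation metrics named in the METHODS section can be illustrated with a minimal sketch. This is not the authors' code: the regular expression, the function names `matches_keywords` and `evaluate`, and the binary 0/1 label encoding are assumptions made here for illustration only.

```python
import re

# Assumed pattern approximating the quoted keywords 'hayfever' OR 'hay fever'.
# The actual Twitter query semantics used in the study may differ.
HAYFEVER_PATTERN = re.compile(r"hay\s?fever", re.IGNORECASE)


def matches_keywords(text: str) -> bool:
    """Return True if the text mentions 'hayfever' or 'hay fever'."""
    return HAYFEVER_PATTERN.search(text) is not None


def evaluate(y_true, y_pred):
    """Compute accuracy, precision and recall for binary labels (1 = relevant).

    Assumes y_true and y_pred are equal-length sequences of 0/1 values.
    Ratios with a zero denominator are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

In practice, a library such as scikit-learn would typically supply these metrics; the hand-written version is shown only to make the definitions explicit.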
Keywords:
Deep learning; Hay fever; Pollen allergy; Twitter mining