Sarah Valentin, Elena Arsevska, Julien Rabatel, Sylvain Falala, Alizé Mercier, Renaud Lancelot, Mathieu Roche.
Abstract
PADI-web (Platform for Automated extraction of animal Disease Information from the web) is a biosurveillance system dedicated to monitoring online news sources for the detection of emerging animal infectious diseases. PADI-web has collected more than 380,000 news articles since 2016. Compared to other existing biosurveillance tools, PADI-web focuses specifically on animal health and has a fully automated pipeline based on machine-learning methods. This paper presents the new functionalities of PADI-web based on the integration of: (i) a new fine-grained classification system, (ii) automatic methods to extract terms and named entities with text-mining approaches, (iii) semantic resources for indexing keywords and (iv) a notification system for end-users. Compared to other biosurveillance tools, PADI-web, which is integrated into the French Platform for Animal Health Surveillance (ESA Platform), offers strong coverage of the animal sector, a multilingual approach, an automated information extraction module and a notification tool configurable according to end-user needs.
Keywords: Animal disease surveillance; Software; Text mining
Year: 2021 PMID: 34950760 PMCID: PMC8671119 DOI: 10.1016/j.onehlt.2021.100357
Source DB: PubMed Journal: One Health ISSN: 2352-7714
Fig. 1. Epidemic intelligence workflow (adapted from [6]).
Fig. 2. PADI-web 3.0 pipeline.
Fig. 3. Spatial entity ‘Montpellier’ recognized by automatic extraction (i.e. a location detected with spaCy) and associated with the GeoNames ID used for the geotagging task.
Fig. 4. Brat annotation integrated into PADI-web 3.0.
Fig. 5. Organisation of keyword lists in PADI-web 3.0.
Fig. 6. Extract of a notification received on 8 April 2021 related to avian influenza: list of new articles collected from French (FR) feeds (automatically translated into English) and English (EN) feeds.
Fig. 7. Extract of a notification received on 8 April 2021 related to avian influenza: information about (i) classification and (ii) extracted keywords. First, the news classification is given with two types of classification (fine-grained and relevance). Second, the sentence classification described in subsection 3.3 is reported (event and information types).
Performance of the MLP for event type classification.
| | Precision | Recall | F-measure |
|---|---|---|---|
| Current event | 0.74 | 0.98 | 0.81 |
| Risk event | 0.39 | 0.29 | 0.33 |
| Old event | 0.33 | 0.09 | 0.14 |
| General | 0.79 | 0.58 | 0.67 |
| Irrelevant | 0.69 | 0.41 | 0.52 |
| Weighted average | 0.72 (±0.02) | 0.70 (±0.02) | 0.69 (±0.02) |
Performance of the MLP for information type classification.
| | Precision | Recall | F-measure |
|---|---|---|---|
| Descriptive epidemiology | 0.70 | 0.78 | 0.73 |
| Distribution | 0.67 | 0.15 | 0.24 |
| Preventive and control measures | 0.57 | 0.75 | 0.65 |
| Concern and risk factors | 0.53 | 0.35 | 0.42 |
| Transmission pathway | 0.56 | 0.28 | 0.37 |
| Economic and political consequences | 0.68 | 0.26 | 0.38 |
| General epidemiology | 0.83 | 0.70 | 0.76 |
| Weighted average | 0.66 (±0.03) | 0.66 (±0.04) | 0.66 (±0.03) |
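As a reading aid for the two tables, the following minimal Python sketch shows how an F-measure is conventionally derived from precision and recall (their harmonic mean) and how a support-weighted average is formed. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `f_measure` and `weighted_average` are hypothetical, and the per-class sample counts needed for the weighted average are not given in this record. For several rows (Risk event, Old event, General) the harmonic mean of the reported precision and recall reproduces the reported F-measure; for others (for example Current event, where it gives 0.84 rather than 0.81) it does not, presumably because the reported figures are averaged over repeated runs, as the ± values suggest.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_average(scores, supports):
    """Mean of per-class scores weighted by class sample counts.

    The class counts (supports) are an assumption of this sketch:
    they are not reported in the tables above.
    """
    total = sum(supports)
    return sum(score * n for score, n in zip(scores, supports)) / total


# Precision and recall copied from three rows of the event type table.
rows = {
    "Risk event": (0.39, 0.29),
    "Old event": (0.33, 0.09),
    "General": (0.79, 0.58),
}

for label, (p, r) in rows.items():
    print(f"{label}: F = {f_measure(p, r):.2f}")
```

The low recall of classes such as Old event (0.09) pulls the harmonic mean well below the precision, which is why the F-measure is a stricter summary than a simple arithmetic mean of the two columns.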