Adam Sadilek1, Stephanie Caty2, Lauren DiPrete3, Raed Mansour4, Tom Schenk5, Mark Bergtholdt3, Ashish Jha2,6, Prem Ramaswami1, Evgeniy Gabrilovich1.
Abstract
Machine learning has become an increasingly powerful tool for solving complex problems, and its application in public health has been underutilized. The objective of this study is to test the efficacy of a machine-learned model of foodborne illness detection in a real-world setting. To this end, we built FINDER, a machine-learned model for real-time detection of foodborne illness using anonymous and aggregated web search and location data. We computed the fraction of people who visited a particular restaurant and later searched for terms indicative of food poisoning to identify potentially unsafe restaurants. We used this information to focus restaurant inspections in two cities and demonstrated that FINDER improves the accuracy of health inspections; restaurants identified by FINDER are 3.1 times as likely to be deemed unsafe during the inspection as restaurants identified by existing methods. Additionally, FINDER enables us to ascertain previously intractable epidemiological information, for example, in 38% of cases the restaurant potentially causing food poisoning was not the last one visited, which may explain the lower precision of complaint-based inspections. We found that FINDER is able to reliably identify restaurants that have an active lapse in food safety, allowing for implementation of corrective actions that would prevent the potential spread of foodborne illness.
Keywords: Data mining; Epidemiology; Machine learning
Year: 2018 PMID: 31304318 PMCID: PMC6550174 DOI: 10.1038/s41746-018-0045-1
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
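The abstract describes scoring each restaurant by the fraction of its visitors who later searched for terms indicative of food poisoning. The following is a minimal sketch of that fraction only, under loud assumptions: the record layout, the toy data, and the names `visits` and `illness_fraction` are hypothetical, and the real FINDER model (a machine-learned classifier over anonymized, aggregated data) is not reproduced here.

```python
from collections import defaultdict

# Hypothetical toy records, NOT data from the study:
# (user_id, restaurant, did the user later search for illness-related terms?)
visits = [
    ("u1", "A", True),
    ("u2", "A", False),
    ("u3", "A", True),
    ("u4", "B", False),
    ("u5", "B", False),
]

def illness_fraction(visits):
    """Return {restaurant: fraction of distinct visitors who later searched}."""
    visitors = defaultdict(set)
    flagged = defaultdict(set)
    for user, restaurant, searched in visits:
        visitors[restaurant].add(user)
        if searched:
            flagged[restaurant].add(user)
    return {r: len(flagged[r]) / len(users) for r, users in visitors.items()}

fractions = illness_fraction(visits)
# Restaurants with the highest fraction would be candidates for inspection.
ranked = sorted(fractions, key=fractions.get, reverse=True)
print(ranked, fractions)
```

Counting distinct users rather than raw visits keeps one frequent visitor from dominating a restaurant's score; that is a design choice of this sketch, not something the source specifies.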
Number of inspections conducted during the experimental time period
| | FINDER | BASELINE |
|---|---|---|
| Total | 132 | 10,786 |
| Las Vegas | 61 | 4977 |
| Chicago | 71 | 5809 |
| Complaint-driven | N/A | 1291 |
| Routine | N/A | 4518 |
| Risk levelᵃ | | |
| High (% of total) | 84 (63.6%) | 5702 (52.9%) |
| Medium (% of total) | 39 (29.6%) | 2325 (21.6%) |
| Low (% of total) | 9 (6.8%) | 2759 (25.6%) |
ᵃ p value for difference in risk distribution between FINDER and BASELINE <0.001, from χ² test
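The footnoted χ² comparison of risk-level distributions can be re-derived from the counts in the table. The following is a minimal sketch, not the authors' analysis code: it computes Pearson's statistic by hand and uses the closed-form upper tail exp(-x/2), which is valid only for 2 degrees of freedom. The function name `chi_square` is an illustrative choice.

```python
import math

# Risk-level counts copied from the table above, in the order High, Medium, Low.
finder = [84, 39, 9]
baseline = [5702, 2325, 2759]

def chi_square(rows):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof

stat, dof = chi_square([finder, baseline])
if dof != 2:
    raise ValueError("closed-form tail below assumes 2 degrees of freedom")
# With exactly 2 degrees of freedom, P(X > x) = exp(-x / 2).
p_value = math.exp(-stat / 2)
print(f"chi2 = {stat:.1f}, dof = {dof}, p = {p_value:.1e}")
```

The resulting p value falls well below 0.001, consistent with the footnote. For tables with other dimensions, a statistics library would be needed for the tail probability.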
Ability of FINDER to detect unsafe restaurants as compared to BASELINE rate and with subcategories of the baseline inspections, including complaint-based inspections that occurred in Chicago and routine inspections from both Chicago and Las Vegas
| | FINDER | BASELINE | Odds ratioᵃ [95% CI] | p value |
|---|---|---|---|---|
| Overall, number unsafe (%) | 69 (52.3%) | 2662 (24.7%) | 3.06 [2.14–4.35] | <0.001 |
| Risk level | | | | |
| High, number unsafe (%) | 42 (50.0%) | 1909 (33.5%) | 1.98 [1.28–3.05] | 0.002 |
| Medium, number unsafe (%) | 23 (59.0%) | 536 (23.1%) | 5.50 [2.83–10.72] | <0.001 |
| Low, number unsafe (%) | 4 (44.4%) | 217 (7.9%) | 7.35 [1.79–30.13] | 0.006 |
| Comparison of FINDER to complaint-based inspections | | | | |
| Overall, number unsafe (%) | 37 (52.1%) | 508 (39.4%) | 1.68 [1.04–2.71] | 0.03 |
| Risk level | | | | |
| High, number unsafe (%) | 27 (47.4%) | 374 (39.4%) | 1.38 [0.81–2.36] | 0.24 |
| Medium, number unsafe (%) | 9 (75.0%) | 115 (39.3%) | 4.64 [1.23–17.51] | 0.02 |
| Low, number unsafe (%) | 1 (50.0%) | 19 (38.8%) | 1.58 [0.09–26.78] | 0.75 |
| Comparison of FINDER to routine inspections | | | | |
| Overall, number unsafe (%) | 69 (52.3%) | 2154 (22.7%) | 3.16 [2.22–4.51] | <0.001 |
| Risk level | | | | |
| High, number unsafe (%) | 42 (50.0%) | 1531 (32.2%) | 2.07 [1.35–3.20] | 0.001 |
| Medium, number unsafe (%) | 23 (59.0%) | 424 (20.9%) | 5.52 [2.84–10.76] | <0.001 |
| Low, number unsafe (%) | 4 (44.4%) | 199 (7.3%) | 7.65 [1.90–30.89] | 0.004 |
ᵃ Odds ratios from binomial logistic regressions
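To make the odds-ratio column concrete, the sketch below computes a crude (unadjusted) odds ratio with a Wald 95% confidence interval from raw counts. This is an illustration, not the authors' method: the table's values come from binomial logistic regressions whose covariates are not stated here, so crude values need not match them. The function name `crude_odds_ratio` is hypothetical; group totals are taken from the inspection-count table.

```python
import math

def crude_odds_ratio(unsafe_a, total_a, unsafe_b, total_b, z=1.96):
    """Unadjusted odds ratio of group A vs. group B with a Wald 95% CI."""
    a, b = unsafe_a, total_a - unsafe_a   # group A: unsafe, not unsafe
    c, d = unsafe_b, total_b - unsafe_b   # group B: unsafe, not unsafe
    odds_ratio = (a / b) / (c / d)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# High-risk restaurants: FINDER 42 unsafe of 84, BASELINE 1909 unsafe of 5702.
high_or, high_lo, high_hi = crude_odds_ratio(42, 84, 1909, 5702)
# All restaurants: FINDER 69 unsafe of 132, BASELINE 2662 unsafe of 10,786.
overall_or, overall_lo, overall_hi = crude_odds_ratio(69, 132, 2662, 10786)

print(f"high risk: {high_or:.2f} [{high_lo:.2f}, {high_hi:.2f}]")
print(f"overall:   {overall_or:.2f} [{overall_lo:.2f}, {overall_hi:.2f}]")
```

For the high-risk row the crude estimate lands close to the reported 1.98, while the crude overall estimate comes out above the reported 3.06. One plausible reading is that the regressions adjust for other factors, but that is an assumption and should not be overinterpreted.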
Violation counts
| | FINDERᵃ | BASELINEᵃ | p value |
|---|---|---|---|
| Critical violations | 0.40 | 0.21 | 0.001 |
| Major violations | 0.74 | 0.56 | 0.04 |
ᵃ Adjusted mean violation count, accounting for city and risk level, calculated using linear regressions
Fig. 1 Frequency with which illness can be attributed to recently visited restaurants, among FINDER restaurants. N = 132