Abstract
Machine Learning (ML) has been a useful tool for scientific advancement during the COVID-19 pandemic. Contact tracing apps are just one area reaping the benefits, as ML can use location and health data from these apps to forecast virus spread, predict "hotspots," and identify vulnerable groups. However, to do so, it is first important to ensure that the dataset these apps yield is accurate, free of biases, and reliable, as any flaw can directly influence ML predictions. Given the lack of criteria to help ensure this, we present two requirements for those exploring the use of ML to follow. These requirements are designed to uphold the international data quality standards put forth for ML. We then identify where our requirements can be met, as countries differ in their contact tracing apps and levels of smartphone usage. Lastly, we discuss the advantages, limitations, and ethical considerations of our approach.
Keywords: AI; COVID-19; contact tracing; digital health; mobile applications
Year: 2021 PMID: 34977855 PMCID: PMC8715913 DOI: 10.3389/fdgth.2021.590194
Source DB: PubMed Journal: Front Digit Health ISSN: 2673-253X
Data quality dimensions.
| Dimension | Definition |
|---|---|
| Understandability | This attribute enables the users to interpret and express the information in appropriate languages and symbols for a specific context of use. |
| Fairness | The machine is trained on data that reflects the ratio of all races (e.g., Black, white, etc.). |
| Currentness | This attribute identifies the information that is up to date. |
| Efficiency | Capability of providing suitable performance according to the number of resources used. |
| Availability | The degree to which the extracted data can be retrieved by authorized users for that context of use. |
| Relevance | To retrieve the data based on the requirement of the end-user or targeted customers. |
| Context Coverage | The level to which the system can be re-trained with the data that matches the end user's requirements. |
| Reproducibility | The degree to which the data can reproduce the same results and allow others to continue to train new machine learning systems. |
| Traceability | The extent to which the source of information, including owner and/or author of the information, and any changes made to the information can be verified. |
| Satisfaction | The extent to which the end-user is satisfied with the trained data. |
| Effectiveness | The capability to produce the desired output from the extracted data. |
| Completeness | The ability of data to represent every meaningful state of the represented real-world system. |
| Accuracy | Data is accurate when data values stored in the database correspond to real-world values or the extent to which data is correct, reliable, and certified. |
| Interpretability | To extract the data with the right language, units, and symbols with better understandability. |
| Credibility | The extent to which the information is reputable, objective, and trustable. |
| Size | Depending on the type of input data, the maximum amount of data that varies is the size of the data. |
This table shows the dimensions taken from Rudraraju and Boyanapally (.
Applying ML to data from Contact-Tracing Apps.
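| Country | Smartphone penetration rate (%) | Server type | AI feasibly applied | Smartphone users required for 56% adoption (%) |
|---|---|---|---|---|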
| United Kingdom | 82.9 | Decentralized | No | 67.6 |
| Germany | 79.9 | Decentralized | No | 70.0 |
| United States | 79.1 | Decentralized | No | 70.8 |
| France | 77.5 | Centralized | Yes | 72.3 |
| Spain | 74.3 | Decentralized | No | 75.3 |
| South Korea | 70.4 | Centralized | Yes | 79.6 |
| Russia | 66.3 | Centralized | Yes | 84.5 |
| Italy | 60.8 | Decentralized | No | 92.1 |
| China | 59.9 | Centralized | Yes | 93.4 |
| Japan | 57.2 | Decentralized | No | 97.9 |
| Iran | 54.8 | Centralized | No | 102.1 |
| Turkey | 54.0 | Centralized | No | 103.8 |
| Mexico | 49.5 | Not available | No | 112.9 |
| Brazil | 45.6 | Decentralized | No | 122.7 |
| Vietnam | 44.9 | Decentralized | No | 124.8 |
| Philippines | 33.6 | Decentralized | No | 166.8 |
| Indonesia | 31.1 | Centralized | No | 179.9 |
| India | 36.7 | Centralized | No | 152.6 |
| Bangladesh | 18.5 | Centralized | No | 303.7 |
| Pakistan | 15.9 | Centralized | No | 352.5 |
The smartphone penetration rate of each country and the type of server it uses to store contact tracing data. Countries listed above the red line (United Kingdom through Japan) have a smartphone penetration rate of at least 56%. If such a country also has a centralized server, AI can feasibly be applied to the data (denoted by “yes” and a green box). The last column lists the percentage of smartphone users who would need to adopt the app for adoption to reach 56% of the population; values above 100% mean the target cannot be reached.
In the United States, it should be noted that while certain states have begun to design official contact tracing apps, there is no national consensus.
Information on Mexico could not be retrieved.
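The arithmetic behind the last two columns of the table can be sketched as follows. This is a minimal illustration, not the authors' code: the names `Country`, `required_user_share`, and `ai_feasible` are hypothetical, and the feasibility rule (centralized server and penetration of at least 56%) is inferred from the caption and the table rows. The computed shares agree with most listed values to within a few tenths of a percentage point; the remaining differences presumably come from rounding in the published penetration rates.

```python
from dataclasses import dataclass

# Adoption target as a percentage of the whole population (from the table caption).
ADOPTION_TARGET = 56.0


@dataclass
class Country:
    """Hypothetical record for one table row."""
    name: str
    penetration: float  # smartphone penetration, percent of population
    server: str         # "Centralized" or "Decentralized"


def required_user_share(penetration: float, target: float = ADOPTION_TARGET) -> float:
    """Percent of smartphone users who must adopt the app so that
    `target` percent of the population has adopted it."""
    return round(100.0 * target / penetration, 1)


def ai_feasible(country: Country, target: float = ADOPTION_TARGET) -> bool:
    """Assumed rule: data is stored centrally and the target is reachable
    with the existing smartphone base."""
    return country.server == "Centralized" and country.penetration >= target


rows = [
    Country("United Kingdom", 82.9, "Decentralized"),
    Country("France", 77.5, "Centralized"),
    Country("Japan", 57.2, "Decentralized"),
    Country("Iran", 54.8, "Centralized"),
]

for c in rows:
    share = required_user_share(c.penetration)
    print(f"{c.name}: {share}% of smartphone users, feasible={ai_feasible(c)}")
```

A required share above 100%, as for Iran here, means the 56% target cannot be met even if every smartphone user installs the app, which is why centralized storage alone does not yield a "Yes" in the table.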