Afra Nerpel, Liuhuaying Yang, Johannes Sorger, Annemarie Käsbohrer, Chris Walzer, Amélie Desvars-Larrive.
Abstract
The zoonotic origin of SARS-CoV-2, the etiological agent of COVID-19, is not yet fully resolved. Although natural infections in animals are reported in a wide range of species, large knowledge and data gaps remain regarding SARS-CoV-2 in animal hosts. We used two major health databases to extract unstructured data and generated a global dataset of SARS-CoV-2 events in animals. The dataset presents harmonized host names, integrates relevant epidemiological and clinical data on each event, and is readily usable for analytical purposes. We also share the code for technical and visual validation of the data and created a user-friendly dashboard for data exploration. Data on SARS-CoV-2 occurrence in animals is critical to adapting monitoring strategies, preventing the formation of animal reservoirs, and tailoring future human and animal vaccination programs. The FAIRness and analytical flexibility of the data will support research efforts on SARS-CoV-2 at the human-animal-environment interface. We intend to update this dataset weekly for at least one year and, through collaborations, to develop it further and expand its use.
Year: 2022 PMID: 35871228 PMCID: PMC9308035 DOI: 10.1038/s41597-022-01543-8
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Fig. 1 Schematic overview of the methodology: report integration and validation steps.
Description of the fields presented in the final dataset and format.
| Field | Description | Format |
|---|---|---|
| ID | Unique identifier for each unique event of SARS-CoV-2 infection/exposure in animal(s). | string |
| primary_source | Primary source of information to document the event. Possible pre-defined string values are: • • | string |
| archive_event_number | Unique identifier for the report, as provided by the primary source. Also corresponds to the name of the PDF file describing the event in the archives folder. | string |
| link_web | Link to the online primary source to document the event. | string |
| secondary_source | Secondary source of information to document the event. Possible pre-defined string values are: • • | string |
| secondary_source_ID | Unique identifier for the report, as provided by the secondary source. Also corresponds to the name of the PDF file describing the event in the archives folder. | string |
| secondary_source_web | Link to the online secondary source for the event. | string |
| host_com_orig | Most specific designation of the animal host provided by the source(s), in English. | string |
| host_sci_orig | Scientific name of the animal host as mentioned in the source(s) (scientific names are harmonized so that only the first letter of the genus is capitalized). | string |
| host_com_res | Common name of the animal host, harmonized against the National Center for Biotechnology Information (NCBI) taxonomic backbone. | string |
| host_sci_res | Scientific name of the animal host (resolved to species or subspecies level), harmonized against the National Center for Biotechnology Information (NCBI) taxonomic backbone. | string |
| host_colloq | The colloquial name of the host, i.e. the name commonly used to identify the animal in non-specialist language (e.g. “tiger” for “Sumatran tiger”). | string |
| host_sci_spec_res | The scientific name of the host resolved to the species level. | string |
| family | Taxonomic family of the animal host. | string |
| epidemiological_unit | The epidemiological unit considered to describe the event. Possible pre-defined string values are: • • • • | string |
| number_cases | Reported number of animal(s) that tested positive for SARS-CoV-2 in the event. | numeric |
| number_susceptible | Reported number of susceptible animal(s) of the same species in the event. | numeric |
| number_tested | Reported number of animal(s) of the same species tested in the event. | numeric |
| number_deaths | Reported number of direct and indirect death(s) related to the event. If death is not related to SARS-CoV-2 (see field | numeric |
| age | Age of the animal(s) when tested, in years. | numeric |
| sex | Sex of the animal(s). Possible pre-defined values are: • • | character |
| country_iso3 | Three-digit ISO country code for the country where the SARS-CoV-2 event was reported. | string |
| country_name | Name of the country where the SARS-CoV-2 event was reported. | string |
| subnational_administration | The subnational administrative region where the SARS-CoV-2 event was reported. | string |
| city | The city where the SARS-CoV-2 event was reported. | string |
| location_detail | Specification of the geographic location that distinguishes SARS-CoV-2 events occurring in the same species, at the same date and geolocation ( | string |
| date_confirmed | When the SARS-CoV-2 infection or exposure was laboratory confirmed. | date |
| date_reported | When the SARS-CoV-2 event was reported by the WAHIS. | date |
| date_published | When the primary source published the SARS-CoV-2 event ( | date |
| related_to_other_entries | Relationship with another record (see field Possible pre-defined string values are: • • • • • • | string |
| related_ID | Unique identifier of the related entry in the dataset. | string |
| test | First type of laboratory test performed to detect infection with (presence of the virus is evidenced) or exposure to (presence of antibodies is evidenced) SARS-CoV-2. | string |
| sampling_type | Type of sample collected to perform the test (reported in the field | string |
| test_2 | Second type of laboratory test performed to detect infection with (presence of the virus is evidenced) or exposure to (presence of antibodies is evidenced) SARS-CoV-2. | string |
| sampling_type_2 | Type of sample collected to perform the second test (reported in the field | string |
| test_3 | Third type of laboratory test performed to detect infection with (presence of the virus is evidenced) or exposure to (presence of antibodies is evidenced) SARS-CoV-2. | string |
| sampling_type_3 | Type of sample collected to perform the third test (reported in the field | string |
| negative_test | First type of laboratory test mentioned in the report whose outcome was negative. | string |
| negative_sampling_type | Type of sample collected to perform the first test (reported in the field | string |
| negative_test_2 | Second type of laboratory test mentioned in the report whose outcome was negative. | string |
| negative_sampling_type_2 | Type of sample collected to perform the second test reported in the field | string |
| reason_for_testing | Rationale for testing the animal(s). | string |
| symptoms | Reported clinical signs allegedly associated with SARS-CoV-2. | string |
| outcome | Outcome of the SARS-CoV-2 infection (or exposure). | string |
| living_conditions | How/where the animal(s) live(s). | string |
| source_of_infection | Most probable source of SARS-CoV-2 infection. | string |
| variant | SARS-CoV-2 genetic variant. | string |
| control_measures | Main intervention(s) implemented to mitigate further spread of the virus. | string |
| original_source | Information source cited by the primary source. | string |
| link_original_source | Link to the online source cited by the primary source (when applicable). | string |
* Several reports referred to “an undetected human case” as the most likely source of infection. Those were given the value NS in the dataset because the information was considered not specific enough. The source of infection was reported as human only when the report mentioned a contact with a confirmed human case as the most likely source of infection.
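
The declared formats can be checked mechanically when reading the CSV. The following is a minimal sketch, not the authors' validation code (which is written in R): it coerces the numeric fields and flags implausible counts. The missing-value markers (empty string, NA, NS), the helper names, and the plausibility rules are assumptions made for illustration, and the sample rows are invented.

```python
import csv
import io

# Numeric fields of the dataset; the set of missing-value markers is an assumption.
NUMERIC_FIELDS = ("number_cases", "number_susceptible", "number_tested",
                  "number_deaths", "age")
MISSING = {"", "NA", "NS"}


def parse_event(row):
    """Return a copy of one CSV row with numeric fields coerced to float or None."""
    event = dict(row)
    for field in NUMERIC_FIELDS:
        raw = (row.get(field) or "").strip()
        event[field] = None if raw in MISSING else float(raw)
    return event


def check_event(event):
    """Return a list of problems worth a manual review (hypothetical rules)."""
    problems = []
    if not event.get("ID"):
        problems.append("missing ID")
    for field in NUMERIC_FIELDS:
        value = event[field]
        if value is not None and value < 0:
            problems.append(f"{field} is negative")
    cases, tested = event["number_cases"], event["number_tested"]
    if cases is not None and tested is not None and cases > tested:
        problems.append("number_cases exceeds number_tested")
    return problems


# Invented rows for illustration only; not taken from sars_ani_data.csv.
SAMPLE = """ID,number_cases,number_susceptible,number_tested,number_deaths,age
ex-1,2,5,3,0,4
ex-2,4,,3,NA,
"""

events = [parse_event(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
report = {event["ID"]: check_event(event) for event in events}
```

Returning a list of problems per event, rather than raising on the first one, keeps the check usable as a review aid: a flagged row may still be a faithful transcription of an inconsistent source report.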
Details of the SARS-ANI files and products.
| SARS-ANI Files | Format | Description |
|---|---|---|
| sars_ani_data.csv | Comma-separated values file (.csv, UTF-8 encoded) | This file contains the raw data of the SARS-ANI Dataset, which presents structured information on SARS-CoV-2 events in animals. |
| README.md | Markdown file | This file contains information about the project and the other files stored in the repository. |
| Contributing.md | Markdown file | This file provides guidelines for contributing to the project: suggesting changes to the data or to the code, submitting new data, and contributing to the code. |
| sars_ani_validation.R | R file | This file contains the R code to validate and curate the dataset. This code enables the users to explore the structure of the dataset, check the different entries for each field, and search for the presence of duplicates. |
| sars_ani_visualization.Rmd | R Markdown file | This Markdown file contains the R code to explore, describe, and visualize the dataset. To see all the results, knit it to .pdf (default output; other outputs are also possible, e.g. .html or .docx). This code is used for the visual validation of the data. |
| sars_ani_excluded_rep.xlsx | Excel file | This file contains the list of ProMED-mail and WAHIS reports that were not included in the dataset and reasons for exclusion. |
| sars_ani_examples.pdf | PDF file | This document contains three examples illustrating the structure and coding scheme of the SARS-ANI Dataset. |
| sars_ani_PDF_archives | Folder containing PDF files | Contains all ProMED-mail and WAHIS reports (in PDF format) used to populate the dataset. |
| SARS-ANI VIS | Dashboard | Visual interactive displays of some selected data of the SARS-ANI Dataset enabling to monitor SARS-CoV-2 events in animals in near real-time ( |
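
The two checks attributed to sars_ani_validation.R above, listing the distinct entries of a field and searching for duplicates, can be approximated as follows. This is a hedged Python sketch rather than a port of the R script: the function names are hypothetical, the use of ID as the duplicate key is an assumption, and the rows are invented.

```python
from collections import Counter


def duplicated_values(rows, key="ID"):
    """Return the values of `key` that occur in more than one row, sorted."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def distinct_entries(rows, field):
    """Return the sorted set of values found in `field`, to spot stray codes."""
    return sorted({row[field] for row in rows})


# Invented rows for illustration only.
rows = [
    {"ID": "a", "sex": "F"},
    {"ID": "b", "sex": "M"},
    {"ID": "a", "sex": "F"},
]

duplicate_ids = duplicated_values(rows)     # IDs appearing more than once
sex_codes = distinct_entries(rows, "sex")   # every code used in the sex field
```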
Fig. 2 Geographic distribution of reported SARS-CoV-2 outbreaks (i.e. occurrence of one or more cases in an epidemiological unit) in animals per country. The number of outbreaks is lower than the number of events because distinct events (i) may belong to the same epidemiological unit (animals living together, e.g. on one farm or in one household) or (ii) may be follow-ups of the same outbreak. Note that an outbreak that was not published by ProMED-mail and/or WAHIS is not included in the dataset. Grey colour: no outbreak reported.
Number of globally reported SARS-CoV-2 cases (infections or exposures) per animal host (as of date of submission, 22 June 2022).
| Family | Common name | Lowest taxonomy | Number of cases |
|---|---|---|---|
| Mustelidae | American mink | | 787* |
| Cervidae | white-tailed deer | | 467* |
| Felidae | domestic cat | | 338 |
| Canidae | dog | | 208 |
| Felidae | lion | | 68 |
| Felidae | tiger | | 62 |
| Hominidae | western lowland gorilla | | 23 |
| Cricetidae | golden hamster | | 15 |
| Felidae | snow leopard | | 14 |
| Felidae | Malayan tiger | | 11 |
| Felidae | Asiatic lion | | 9 |
| Mustelidae | domestic ferret | | 9 |
| Hominidae | gorilla | | 8 |
| Mustelidae | Asian small-clawed otter | | 8 |
| Castoridae | Eurasian beaver | | 7 |
| Cricetidae | hamster (unspecified) | NS** | 3 |
| Felidae | puma | | 3 |
| Cervidae | mule deer | | 2 |
| Felidae | Sumatran tiger | | 2 |
| Hippopotamidae | hippopotamus | | 2 |
| Hyaenidae | spotted hyena | | 2 |
| Trichechidae | Caribbean manatee | | 2 |
| Cebidae | black-tailed marmoset | | 1 |
| Felidae | Canada lynx | | 1 |
| Felidae | Eurasian lynx | | 1 |
| Felidae | leopard | | 1 |
| Felidae | fishing cat | | 1 |
| Myrmecophagidae | giant anteater | | 1 |
| Procyonidae | ring-tailed coati | | 1 |
| Viverridae | binturong | | 1 |
This table includes only events for which the number of cases is documented. Host names are harmonized against the NCBI taxonomic backbone.
*Number of cases was reported inconsistently in mink (data on the number of cases in mink is missing most of the time). Therefore, the number of diagnosed cases is largely under-estimated in mink. This is also true for deer, but to a lesser extent.
**NS: Not specified in the reports. The hamster species was specified in neither the ProMED-mail nor the WAHIS report.
Fig. 3 SARS-CoV-2 case fatality rate (CFR) per animal host and country. The CFR for each animal host and country is obtained by dividing the total number of reported deaths in one animal host by the total number of reported cases for this host in the country. Animals culled as part of a control strategy are excluded (not all were diagnosed as infected). Similarly, mink are not included here because data on case and death numbers are partial. The CFR depends strongly on testing and does not give information on the infection fatality rate (IFR, number of deaths divided by the total number of infected individuals) or mortality rate (MR, number of deaths divided by the total at-risk population).
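
The CFR definition in the caption of Fig. 3 (total reported deaths divided by total reported cases, per host and country) can be sketched as below. This is an illustrative reading, not the authors' code. It assumes that hosts are grouped by host_com_res and countries by country_name, that events with a missing case or death count are skipped, and that culled animals have already been filtered out by the caller, since how culling is encoded is not stated here. The events shown are invented.

```python
from collections import defaultdict


def case_fatality_rates(events, excluded_hosts=("American mink",)):
    """Return {(host, country): total deaths / total cases} for parsed events."""
    totals = defaultdict(lambda: [0.0, 0.0])  # (host, country) -> [deaths, cases]
    for event in events:
        host = event["host_com_res"]
        if host in excluded_hosts:
            continue  # mink are left out because their counts are partial
        cases, deaths = event["number_cases"], event["number_deaths"]
        if cases is None or deaths is None:
            continue  # assumption: undocumented counts do not contribute
        key = (host, event["country_name"])
        totals[key][0] += deaths
        totals[key][1] += cases
    return {key: deaths / cases
            for key, (deaths, cases) in totals.items() if cases > 0}


# Invented events for illustration only.
example_events = [
    {"host_com_res": "lion", "country_name": "A", "number_cases": 4, "number_deaths": 1},
    {"host_com_res": "lion", "country_name": "A", "number_cases": 4, "number_deaths": 1},
    {"host_com_res": "dog", "country_name": "A", "number_cases": 2, "number_deaths": None},
    {"host_com_res": "American mink", "country_name": "B", "number_cases": 10, "number_deaths": 5},
]

cfr = case_fatality_rates(example_events)
```

Summing deaths and cases before dividing follows the caption's wording; averaging per-event ratios would instead weight a single-animal event as heavily as a large one.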
Fig. 4 Sankey diagram showing the SARS-CoV-2 variants identified in the different animal hosts. The figure describes the number of events (one event may include one or more cases).
Fig. 5 Rationales for testing animals for SARS-CoV-2 infection or exposure. Only positive animals are reported in the dataset; investigations that led to negative results are not (or rarely) reported to the authorities or media.
Fig. 6 Screenshot of the SARS-ANI dashboard.
| Measurement(s) | SARS-CoV-2 event in animal hosts |
| Technology Type(s) | Manual data collection from text sources |
| Factor Type(s) | ID • primary_source • archive_event_number • link_web • secondary_source • secondary_source_ID • secondary_source_web • host_com_orig • host_sci_orig • host_com_res • host_sci_res • host_colloq • host_sci_spec_res • family • epidemiological_unit • number_cases • number_susceptible • number_tested • number_deaths • age • sex • country_iso3 • country_name • subnational_administration • city • location_detail • date_confirmed • date_reported • date_published • related_to_other_entries • related_ID • test • sampling_type • test_2 • sampling_type_2 • test_3 • sampling_type_3 • negative_test • negative_sampling_type • negative_test_2 • negative_sampling_type_2 • reason_for_testing • symptoms • outcome • living_conditions • source_of_infection • variant • control_measures • original_source • link_original_source |
| Sample Characteristic - Organism | animal host(s) |
| Sample Characteristic - Environment | domestic • wild • captive • farmed |
| Sample Characteristic - Location | Global |