Morgane Claudel, Emilie Lerigoleur, Cécile Brun, Sylvie Guillerme.
Abstract
Background: The original dataset presented here is the result of the first near-exhaustive analysis of historical data on ten plant species introduced in and around Occitania (south-western France) since 1651. Research was carried out on the following species: Alnus incana, Buddleja davidii, Castanea sativa, Helianthus tuberosus, Impatiens glandulifera, Prunus cerasifera, Prunus laurocerasus, Reynoutria japonica, Robinia pseudoacacia and Spiraea japonica.

The data file contains 199 occurrence records based exclusively on historical observations and records made between 1651 and 2004, retrieved from 111 of the 640 literary sources consulted. All the records are associated with a year and 61% of them have associated spatial coordinates. Initially, the EI2P-VALEEBEE research project focused on the introduction of these species into Occitania (95 occurrences, 47.7%), but mentions of introductions beyond this territory, mainly in metropolitan France, are also reported.

The creation of this dataset involved five stages: (1) selection of species, (2) consultation of historical sources, (3) recording of occurrences in the dataset, (4) dataset standardisation/enrichment and Darwin Core mapping and (5) data publication. Quality controls were conducted at each step.

The dataset is available on the platform of the Global Biodiversity Information Facility (GBIF) at https://doi.org/10.15468/3kvaeh. It respects the internationally recognised FAIR Data Principles (Findable, Accessible, Interoperable and Reusable).

New information: The dataset will be progressively enriched with new data during the EI2P-VALEEBEE research project and future projects on invasive plant species conducted by the team.
Year: 2022 PMID: 35437401 PMCID: PMC8979938 DOI: 10.3897/BDJ.10.e76283
Source DB: PubMed Journal: Biodivers Data J ISSN: 1314-2828
FAIRness assessment criteria used for this dataset.
| Principle | Assessment criteria |
|---|---|
| FINDABLE | Use a DOI for the dataset attributed by GBIF. Use unique identifiers (UUID) for each observation occurrence. Make metadata and datasets persistent by depositing them on the GBIF platform. Use the internationally recognised Ecological Metadata Language (EML) standard to describe the database metadata and its associated projects, including standardised search keywords. Use a versioning system to allow future updates. |
| ACCESSIBLE | Data storage in GBIF in accordance with the guidelines for quality standards (e.g. use of EML). The GBIF repository provides efficient, rich services for various uses and users. |
| INTEROPERABLE | Standard vocabularies used as far as possible for some fields. Search keywords taken from international thesauruses. Exclusive use of Darwin Core terms. A Darwin Core Archive offers a stable, straightforward and flexible framework for compiling biodiversity data from varied and variable sources. |
| REUSABLE | The Darwin Core Archive facilitates the reusability of the dataset because it enables publication in the GBIF. This compact package (a ZIP file) contains interconnected text files and enables users to share their data using a common terminology. Use an open format for the dataset (OpenDocument, .ods) and open source software to reuse it. EML metadata includes provenance for raw and derived data. This data paper explains the data processing steps, curation protocol, quality assurance processes, methods and tools that permit long-term integrity and understandability of data. The spatial/temporal/taxonomic coverage is clearly mentioned in the EML metadata and in this data paper, as well as the CC-BY licence and rules for wide reuse. |
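
The table describes the Darwin Core Archive as a ZIP package of interconnected text files. As a minimal sketch of how such a package might be inspected with the Python standard library, the example below builds a tiny archive in memory and reads one of its files back. The demo records, the tab delimiter, the UTF-8 encoding and the helper names `read_core_rows` and `build_demo_archive` are assumptions made for illustration. Only the file name occurrence.txt comes from this data paper, and a real archive should be read according to its own descriptor file.

```python
import csv
import io
import uuid
import zipfile


def read_core_rows(archive_bytes, core_name="occurrence.txt", delimiter="\t"):
    """Return the rows of one delimited text file inside a ZIP archive as dicts.

    Assumption: the file starts with a header row and is UTF-8 encoded.
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        with archive.open(core_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text, delimiter=delimiter))


def build_demo_archive():
    """Build a tiny illustrative archive in memory (made-up rows, not the real dataset)."""
    demo_rows = [("Robinia pseudoacacia", "1900"), ("Buddleja davidii", "1950")]
    lines = ["id\tscientificName\tyear"]
    for name, year in demo_rows:
        # A random UUID stands in for the unique identifier of each occurrence.
        lines.append(f"{uuid.uuid4()}\t{name}\t{year}")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("occurrence.txt", "\n".join(lines) + "\n")
    return buffer.getvalue()


rows = read_core_rows(build_demo_archive())
for row in rows:
    print(row["id"], row["scientificName"], row["year"])
```

Reading the file through `csv.DictReader` keeps each value keyed by its column label, which matches the shared-terminology idea in the table. The years in the demo rows are placeholders and carry no historical meaning.
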
Figure 1. Spatial location of the 122 occurrences that have associated coordinates.
Figure 2. Focus on (a) the Oussouet Valley and (b) the Pique Valley.
Phylogenetic classification of the studied taxa ordered by order, family, genus and species, with their number of occurrences and their hyperlink to the subsample by species on gbif.org.
| Species | Vernacular name |
|---|---|
| Helianthus tuberosus | Jerusalem artichoke |
| Reynoutria japonica | Japanese knotweed |
| Impatiens glandulifera | Indian balsam |
| Robinia pseudoacacia | False-acacia |
| Alnus incana | Grey alder |
| Castanea sativa | Sweet chestnut |
| Buddleja davidii | Butterfly-bush |
| Prunus cerasifera | Cherry plum |
| Prunus laurocerasus | Cherry laurel |
| Spiraea japonica | Japanese spiraea |
Figure 3. Number of occurrences found and number of literary sources consulted per 25-year period.
| Column label | Column description |
|---|---|
| id | Same as occurrenceID: An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique. This is the primary key of this table. |
| type | The nature or genre of the resource. |
| modified | The most recent date-time on which the resource was changed. |
| language | A language of the resource. |
| license | A legal document giving official permission to do something with the resource. |
| rightsHolder | A person or organisation owning or managing rights over the resource. |
| institutionID | An identifier for the institution having custody of the object(s) or information referred to in the record. |
| institutionCode | The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record. |
| datasetName | The name identifying the dataset from which the record was derived. |
| ownerInstitutionCode | The name (or acronym) in use by the institution having ownership of the object(s) or information referred to in the record. |
| basisOfRecord | The specific nature of the data record. Recommended best practice is to use the standard label of one of the Darwin Core classes. |
| occurrenceID | An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique. |
| occurrenceRemarks | Comments or notes about the Occurrence. |
| recordedBy | A list (concatenated and separated) of names of people, groups or organisations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first. |
| occurrenceStatus | A statement about the presence or absence of a Taxon at a Location. Recommended best practice is to use a controlled vocabulary. |
| associatedReferences | A list (concatenated and separated) of identifiers (publication, bibliographic reference, global unique identifier, URI) of literature associated with the Occurrence. |
| eventDate | The date-time or interval when the event was recorded. Not suitable for a time in a geological context. Recommended best practice is to use an encoding scheme, such as ISO 8601:2004(E). |
| year | The four-digit year in which the Event occurred, according to the Common Era Calendar. |
| eventRemarks | Comments or notes about the Event. |
| locationID | An identifier for the set of location information (data associated with dcterms:Location). May be a global unique identifier or an identifier specific to the dataset. |
| continent | The name of the continent in which the Location occurs. Recommended best practice is to use a controlled vocabulary such as the Getty Thesaurus of Geographic Names. |
| countryCode | The standard code for the country in which the Location occurs. Recommended best practice is to use ISO 3166-1-alpha-2 country codes. |
| stateProvince | The name of the next smaller administrative region than country (state, province, canton, department, region etc.) in which the Location occurs. Recommended best practice is to use a controlled vocabulary such as the Getty Thesaurus of Geographic Names. |
| county | The full, unabbreviated name of the next smaller administrative region than stateProvince (county, shire, department etc.) in which the Location occurs. Recommended best practice is to use a controlled vocabulary such as the Getty Thesaurus of Geographic Names. |
| municipality | The full, unabbreviated name of the next smaller administrative region than county (city, municipality etc.) in which the Location occurs. Do not use this term for a nearby named place that does not contain the actual location. Recommended best practice is to use a controlled vocabulary such as the Getty Thesaurus of Geographic Names. |
| locality | The specific description of the place. Less specific geographic information can be provided in other geographic terms (higherGeography, continent, country, stateProvince, county, municipality, waterBody, island, islandGroup). This term may contain information modified from the original to correct perceived errors or to standardise the description. |
| verbatimLocality | The original textual description of the place. |
| minimumElevationInMeters | The lower limit of the range of elevation (altitude, usually above sea level), in metres. |
| maximumElevationInMeters | The upper limit of the range of elevation (altitude, usually above sea level), in metres. |
| locationAccordingTo | Information about the source of this Location information. Could be a publication (gazetteer), institution or team of individuals. |
| decimalLatitude | The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive. |
| decimalLongitude | The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive. |
| geodeticDatum | The ellipsoid, geodetic datum or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based. Recommended best practice is to use the EPSG code as a controlled vocabulary to provide an SRS, if known. Otherwise use a controlled vocabulary for the name or code of the geodetic datum, if known. Otherwise use a controlled vocabulary for the name or code of the ellipsoid, if known. If none of these is known, use the value "unknown". |
| georeferenceSources | A list (concatenated and separated) of maps, gazetteers or other resources used to georeference the Location, described specifically enough to allow anyone in the future to use the same resources. |
| georeferenceRemarks | Notes or comments about the spatial description determination, explaining assumptions made in addition or opposition to those formalised in the method referred to in georeferenceProtocol. |
| identifiedBy | A list (concatenated and separated) of names of people, groups or organisations who assigned the Taxon to the subject. Recommended best practice is to separate the values in a list with space vertical bar space (` \| `). |
| dateIdentified | The date on which the subject was determined as representing the Taxon. Recommended best practice is to use a date that conforms to ISO 8601-1:2019. |
| taxonID | An identifier for the set of taxon information (data associated with the Taxon class). May be a global unique identifier or an identifier specific to the dataset. |
| scientificNameID | An identifier for the nomenclatural (not taxonomic) details of a scientific name. |
| scientificName | The full scientific name, with authorship and date information, if known. When forming part of an Identification, this should be the name in the lowest level taxonomic rank that can be determined. This term should not contain identification qualifications, which should instead be supplied in the identificationQualifier term. |
| nameAccordingTo | The reference to the source in which the specific taxon concept circumscription is defined or implied - traditionally signified by the Latin "sensu" or "sec." (from secundum, meaning "according to"). For taxa that result from identifications, a reference to the keys, monographs, experts and other sources should be given. |
| kingdom | The full scientific name of the kingdom in which the taxon is classified. |
| phylum | The full scientific name of the phylum or division in which the taxon is classified. |
| class | The full scientific name of the class in which the taxon is classified. |
| order | The full scientific name of the order in which the taxon is classified. |
| family | The full scientific name of the family in which the taxon is classified. |
| genus | The full scientific name of the genus in which the taxon is classified. |
| taxonRank | The taxonomic rank of the most specific name in the scientificName. |
| vernacularName | A common or vernacular name. |
| taxonRemarks | Comments or notes about the taxon or name. |
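
Several of the column descriptions above state constraints that can be checked mechanically: a four-digit year, latitude between -90 and 90, longitude between -180 and 180, and list fields separated by a vertical bar. The sketch below shows one way such checks might look in Python. It is an illustration only and not the quality-control procedure used by the authors. The function names `check_occurrence` and `split_list_field` and the sample values are hypothetical.

```python
def check_occurrence(row):
    """Return a list of problems found in one occurrence row (a dict of strings)."""
    problems = []

    year = row.get("year", "")
    if not (len(year) == 4 and year.isdigit()):
        problems.append("year is not a four-digit number")

    # Coordinates are optional, but when present both must be numeric and in range.
    lat = row.get("decimalLatitude", "")
    lon = row.get("decimalLongitude", "")
    if lat or lon:
        try:
            lat_value, lon_value = float(lat), float(lon)
        except ValueError:
            problems.append("coordinates are not both numeric")
        else:
            if not -90 <= lat_value <= 90:
                problems.append("decimalLatitude outside [-90, 90]")
            if not -180 <= lon_value <= 180:
                problems.append("decimalLongitude outside [-180, 180]")
    return problems


def split_list_field(value):
    """Split a concatenated list field on the vertical bar separator."""
    return [item.strip() for item in value.split("|") if item.strip()]


# Hypothetical sample rows, not taken from the dataset.
samples = [
    {"year": "1900", "decimalLatitude": "43.6", "decimalLongitude": "1.44"},
    {"year": "19", "decimalLatitude": "95", "decimalLongitude": "1.44"},
    {"year": "1900"},
]
for sample in samples:
    print(sample, check_occurrence(sample))
print(split_list_field("First Observer | Second Observer"))
```

Returning a list of problems rather than raising on the first one lets a whole file be screened in a single pass, which suits records where coordinates are legitimately absent.
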
| Column label | Column description |
|---|---|
| id | An identifier for the Occurrence linked to the occurrence.txt file (same as occurrenceID). It can be repeated as a foreign key here. |
| identificationID | A unique identifier corresponding to the name spelling reported as found in the original text. This is the primary key of this table. |
| dateIdentified | The date on which the subject was determined as representing the Taxon. Recommended best practice is to use a date that conforms to ISO 8601-1:2019. The date format here is YYYY (e.g. 1694). |
| scientificName | The full scientific name, with authorship and date information, if known. When forming part of an Identification, this should be the name in the lowest level taxonomic rank that can be determined. This term should not contain identification qualifications, which should instead be supplied in the identificationQualifier term. |
| nameAccordingTo | The reference to the source in which the specific taxon concept circumscription is defined or implied - traditionally signified by the Latin "sensu" or "sec." (from secundum, meaning "according to"). For taxa that result from identifications, a reference to the keys, monographs, experts and other sources should be given. |
| vernacularName | A common or vernacular name. |
| taxonRemarks | Comments or notes about the taxon or name. |
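
In this second table, `id` is a foreign key pointing to the occurrence table, while `identificationID` is the table's own primary key, so one occurrence may carry several identification rows. The following Python sketch illustrates how the two tables might be linked once loaded as lists of dicts. The function name `group_identifications` and all sample identifiers and values are hypothetical.

```python
from collections import defaultdict


def group_identifications(occurrences, identifications):
    """Group identification rows by the occurrence id they reference.

    Returns (grouped, orphans): grouped maps each known occurrence id to its
    identification rows; orphans lists identificationID values whose id does
    not match any occurrence.
    """
    known_ids = {row["id"] for row in occurrences}
    grouped = defaultdict(list)
    orphans = []
    for ident in identifications:
        if ident["id"] in known_ids:
            grouped[ident["id"]].append(ident)
        else:
            orphans.append(ident["identificationID"])
    return dict(grouped), orphans


# Hypothetical sample rows, not taken from the dataset.
occurrences = [{"id": "occ-1"}, {"id": "occ-2"}]
identifications = [
    {"id": "occ-1", "identificationID": "ident-1", "scientificName": "name as spelled in source A"},
    {"id": "occ-1", "identificationID": "ident-2", "scientificName": "name as spelled in source B"},
    {"id": "occ-9", "identificationID": "ident-3", "scientificName": "name with no matching occurrence"},
]

grouped, orphans = group_identifications(occurrences, identifications)
print({key: len(rows) for key, rows in grouped.items()})
print(orphans)
```

Reporting unmatched rows separately, instead of silently dropping them, makes broken links between the two files visible.
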