Pofatu, a curated and open-access database for geochemical sourcing of archaeological materials.

Aymeric Hermann, Robert Forkel, Andrew McAlister, Arden Cruickshank, Mark Golitko, Brendan Kneebone, Mark McCoy, Christian Reepmeyer, Peter Sheppard, John Sinton, Marshall Weisler.

Abstract

Compositional analyses have long been used to determine the geological sources of artefacts. Geochemical "fingerprinting" of artefacts and sources is the most effective way to reconstruct strategies of raw material and artefact procurement, exchange or interaction systems, and mobility patterns during prehistory. The efficacy and popularity of geochemical sourcing have led to many projects using various analytical techniques to produce independent datasets. To facilitate access to this growing body of data and to promote comparability and reproducibility in provenance studies, we designed Pofatu, the first online and open-access database to present geochemical compositions and contextual information for archaeological sources and artefacts in a form that can be readily accessed by the scientific community. This relational database currently contains 7759 individual samples from archaeological sites and geological sources across the Pacific Islands. Each sample is comprehensively documented and includes elemental and isotopic compositions, detailed archaeological provenance, and supporting analytical metadata, such as sampling processes, analytical procedures, and quality control.

Year: 2020   PMID: 32393758   PMCID: PMC7214434   DOI: 10.1038/s41597-020-0485-8

Source DB: PubMed   Journal: Sci Data   ISSN: 2052-4463   Impact factor: 6.444


Background & Summary

Extracting, transforming, and distributing natural resources and finished goods between individuals and groups has always been an important aspect of technological, economic, and social behaviors in human societies[1-4]. Such material aspects of cultures can be inferred with the help of provenance studies, by reconstructing the movements of materials and artefacts across space. For this purpose, archaeologists have regularly used petrographic and geochemical analyses for more than 40 years to characterise the geological provenance of raw materials and stone artefacts and to reconstruct patterns of exchange based on hard evidence[5-7]. Geochemical techniques have proven to be the most efficient and reliable way to fingerprint raw material sources and artefacts, thereby providing reproducible and comparable results[8-10]. Furthermore, geochemical data are quantitative and can therefore be examined with statistical methods[11,12] or by using, for example, well-known principles of petrogenesis and mantle source evolution. Due to the improvement of analytical techniques and the increasing use of geochemical sourcing, the production and publication of archaeological compositional data have grown exponentially. It is now recognized that using large source data compilations can lead to more efficient and cost-effective research planning[7,10,13]. Sharing source data compilations facilitates assigning unambiguous provenance to artefacts because it enables a better understanding of the geochemical variability of sources throughout a given study region and also shows potential geochemical differences between sources[14], especially for artefacts found in either very homogeneous or complex petrogenetic contexts[15-17]. Furthermore, access to large geochemical datasets of archaeological artefacts will lead to more robust and large-scope modelling of prehistoric exchange systems[18-20].
However, the current lack of an appropriate global data management platform makes it difficult to access and reference relevant archaeological datasets and often leads to duplication of individual efforts. In this data descriptor, we introduce the Pofatu Database, a curated and open-access database of geochemical data on archaeological materials and sources, supported by comprehensive contextual information about individual samples and artefacts, including their archaeological provenance, and by a thorough description of analytical procedures. The goals of the database are (i) to provide easy access to published compositional data of archaeological sources and artefacts, (ii) to assemble contextual archaeological information for each individual sample, (iii) to facilitate reuse of existing data and encourage the appropriate crediting of original data sources, and (iv) to ensure reproducibility and comparability by documenting instrumental details, analytical procedures and reference materials used for calibration purposes or quality control. We provide compositional data as well as contextual metadata for 7759 individual samples, with a current focus on archaeological sites across the Pacific Islands (Fig. 1). Our vision is an inclusive and collaborative data resource that establishes an operational framework for data sharing in archaeometry, progressively includes more datasets, and initiates a more global project similar to the online repositories for geological materials already available through a wide geoinformatics network[21-24]. Furthermore, by using a common non-proprietary file format (CSV) and an open source system for storage and version control (Git, with a GitHub repository), the Pofatu Database provides an analysis-friendly environment that enables transparency and built-in reproducibility of analytical tasks[25].
Fig. 1. Locations of samples already released in the Pofatu Database.

Methods

The data can be accessed and downloaded from the Zenodo archive (10.5281/zenodo.3670127) and browsed in the Pofatu web application (https://pofatu.clld.org/). The database was designed to contain geochemical compositional data and extensive contextual metadata (sample identification, archaeological provenance, analytical methods, and related bibliographical references), which we compiled to ensure further reuse and reinterpretation of previous provenance analyses (Fig. 2).
Fig. 2. Structure of the Pofatu Database. The compositional data contains all analytical values for major oxide and trace element compositions, radiogenic and stable isotope ratios, and geochronology. Sample metadata involves the creation of unique identifiers and a description of sample condition and preparation. Archaeological metadata provides information on the geographical, cultural and stratigraphic context of the parent artefacts (name, category and attributes), the collection origin (collector, date and nature of field research, storage location), and a description of the site and stratigraphic context (name, code, context, stratigraphic position). The reference metadata lists all bibliographical sources of the data and metadata information[26-173]. Methodological metadata ensure control over data quality and include information about the preparation of samples, the analytical procedure (technique, laboratory, analyst), and the accuracy and reproducibility of published analyses (errors, precision, standard values, correction procedures).

Data acquisition

All data and metadata in the Pofatu Database and included in this data descriptor release are linked with published resources. Geochemical datasets are extracted from peer-reviewed material, while contextual metadata include information gathered from peer-reviewed articles, monographs, book chapters, and publicly available institutional reports. Original sources are coded in the repository and available as a BibTeX database file, suitable for importing into reference management software. Geochemical datasets are associated with a method identifier, which is unique and defined based on the set of available methodological metadata for a specific set of values. The process of data acquisition includes:

Data submission: Data and metadata are gathered and stored in normalized tables linked by foreign keys. These interrelated tables each contain sets of information on (i) data source, (ii) sample and archaeological provenance, (iii) compositional data, and (iv) primary analytical and method-specific metadata. The Pofatu Database is curated and updated on a regular basis. New datasets and complementary information on previously documented datasets can be submitted using the Data Submission Template and Guidelines available online (https://pofatu.clld.org/about).

Data validation: The content of each table is handled manually, but several fields are constrained by ontologies, which are implemented as built-in form validation in the submission template. Data is also validated using functionality implemented in the Python package pypofatu, which imposes suitable constraints on data such as geographic coordinates.

Data output: The manually curated “raw” data undergoes an automated processing workflow (implemented in the Python package pypofatu) to create output formats ready for distribution. For long-term accessibility, the data is converted to a set of interrelated CSV files, described by metadata encoded as JSON-LD (cf. https://www.w3.org/TR/json-ld/, accessed January 30, 2020), following the World Wide Web Consortium (W3C) recommendations[174,175]. Because the compiled data consists exclusively of line-based text files (in CSV format), it is well suited for long-term access, since it places the lowest requirements on processing software, and it provides for a transparent history of changes with the version control software Git (cf. https://git-scm.com/, accessed January 30, 2020).
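As an illustration of the kind of constraint mentioned under data validation, the following minimal sketch checks the coordinate fields of sample rows read from CSV. It is not the pypofatu implementation: the function name `coordinate_problems`, the choice to tolerate empty coordinates, and the sample rows are assumptions made for this example. Only the field names `location_latitude` and `location_longitude` come from the database description.

```python
import csv
import io


def coordinate_problems(row):
    """Return a list of problems found in one row (hypothetical helper)."""
    problems = []
    # Latitude must lie in [-90, 90], longitude in [-180, 180] (decimal degrees).
    for field, limit in (("location_latitude", 90.0), ("location_longitude", 180.0)):
        raw = (row.get(field) or "").strip()
        if raw == "":
            continue  # assumption: missing coordinates are tolerated in this sketch
        try:
            value = float(raw)
        except ValueError:
            problems.append(f"{field}: {raw!r} is not a number")
            continue
        if not -limit <= value <= limit:
            problems.append(f"{field}: {value} outside [-{limit}, {limit}]")
    return problems


# Made-up rows for illustration only; they are not records from the database.
SAMPLE_CSV = """ID,location_latitude,location_longitude
demo-1,-17.53,-149.83
demo-2,-95.0,170.2
demo-3,abc,181
"""

report = {
    row["ID"]: coordinate_problems(row)
    for row in csv.DictReader(io.StringIO(SAMPLE_CSV))
}

for sample_id, problems in report.items():
    print(sample_id, "OK" if not problems else problems)
```

Returning a list of problems per row, rather than raising on the first error, lets a curator see every issue in a submission in a single pass.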

Data Records

A release of the Pofatu Database is available from the Zenodo archive[176]. Details of the parameters and measurements reported in the database are summarized in Online-only Table 1. Unique identifiers for samples, artefacts and analytical methods were created for each data record, and used as primary and foreign keys to define relationships between tables.
Online-only Table 1

Overview of fields and content of the Pofatu database.

Table | Field name | Field content (a) | Format (a,b,c,d)
sources | ID | Unique identifier of the bibliographic reference | ID(1)
sources | Entry_Type, annotation, author, bookauthor, booktitle, date, doi, edition, editor, editora, editoratype, institution, isbn, issn, issue, journaltitle, keywords, langid, location, note, number, pages, pagetotal, pmcid, pmid, publisher, rights, series, shortjournal, shorttitle, title, type, url, urldate, volume, volumes | Zotero item types and fields | Zotero item types and fields
samples | ID | Unique identifier of the analyzed sample | ID(1)
samples | sample_name | Published name of the sample | CHAR
samples | sample_category | Status of the analyzed sample: source or artefact | CHAR(CV)
samples | sample_comment | Supplementary information about the status of the sample, e.g. “Artefact used as a source” | CHAR
samples | petrography | Petrographic identification of the analyzed material | CHAR
samples | contribution_id | Unique identifier of the dataset | CHAR
samples | contribution_name | Descriptive title of the dataset | CHAR
samples | contribution_description | Brief description (abstract) of dataset | CHAR
samples | contribution_authors | Author(s) of the dataset (Last, First name) | CHAR
samples | contribution_affiliation | Institution(s) of the author(s) | CHAR
samples | contribution_contact_email | Contact email for the creator of the template | CHAR
samples | location_region | Name of the region where the parent artefact was collected | CHAR
samples | location_subregion | Name of the sub-region where the parent artefact was collected | CHAR
samples | location_locality | Name of the locality where the parent artefact was collected | CHAR
samples | location_comment | Supplementary information about the provenance of the parent artefact | CHAR
samples | location_latitude | Coordinates in decimal degrees (negative values indicate South) | NUM
samples | location_longitude | Coordinates in decimal degrees (negative values indicate West) | NUM
samples | location_elevation | Elevation in meters, with respect to sea level | NUM
samples | artefact_id | Unique identifier for the parent artefact of the analyzed sample | ID(1)
samples | artefact_name | Name of the parent artefact in the original publication | CHAR
samples | artefact_category | General artefact classification | CHAR(CV)
samples | artefact_attributes | General attributes (fragmentation, raw material, etc.) | CHAR(CV)
samples | artefact_comment | Supplementary information about the parent artefact | CHAR
samples | artefact_collector | Principal investigator of fieldwork collection | CHAR
samples | artefact_collection_type | Type of field research: excavation or survey | CHAR(CV)
samples | artefact_fieldwork_date | Period/dates of the field research | CHAR
samples | artefact_collection_location | Name of the institution managing the collection | CHAR
samples | artefact_collection_comment | Supplementary information about the archaeological collection | CHAR
samples | site_name | Name given to the archaeological site in the literature | CHAR
samples | site_code | Code of the archaeological site in the literature | CHAR
samples | site_context | Assumed function of the archaeological site based on the literature | CHAR(CV)
samples | site_comment | Supplementary information about the archaeological site | CHAR
samples | site_stratigraphic_position | Original stratigraphic context of the parent artefact | CHAR
samples | site_stratigraphy_comment | Supplementary information about the site stratigraphy | CHAR
measurements | Sample_ID | Unique identifier of the analyzed sample | ID(2)
measurements | value_string | String value of: oxides (wt%), trace elements and REE (ppm), radiogenic isotopes, geochronology | NUM
measurements | Method_ID | Unique identifier of the analytical method | ID(2)
measurements | parameter | Geochemical parameter analyzed | NUM
measurements | value | Measured value of: oxides (wt%), trace elements and REE (ppm), radiogenic isotopes, geochronology | NUM
measurements | less | Value_string actually inferior to expressed value: yes/no | CHAR(CV)
measurements | value_sd | Standard deviation (error) value | NUM
measurements | sd_sigma | Standard deviation confidence interval: 1σ or 2σ | CHAR(CV)
methods | ID | Unique identifier of the analytical method for each parameter of a given dataset | ID(1)
methods | code | Identifier of the analytical method for a given dataset | ID(1)
methods | parameter | Geochemical parameter analyzed (fields of compositional data) | CHAR(CV)
methods | analyzed_material_1 | Description of the analyzed material: whole rock, volcanic glass, mineral | CHAR(CV)
methods | analyzed_material_2 | Description of the analyzed sample: core sample, fused disks, sample surface, powder | CHAR(CV)
methods | sample_preparation | Description of the sample preparation | CHAR
methods | chemical_treatment | Description of the sample chemical treatment | CHAR
methods | technique | Analytical technique | CHAR(CV)
methods | laboratory | Name of the institution of the analysis | CHAR
methods | analyst | Name of the analyst | CHAR
methods | number_of_replicates | Number of replicates | CHAR
methods | instrument | Instrument type and name | CHAR
methods | date | Period/dates of the analysis | CHAR
methods | comment | Supplementary information about the analysis, e.g. reference to the method description in the literature | CHAR
methods | detection_limit | Lower limit of detection | NUM
methods | detection_limit_unit | Detection limit unit | CHAR
methods | total_procedural_blank_value | Blank value | NUM
methods | total_procedural_unit | Blank value unit | CHAR
methods_reference_samples | Method_ID | Unique identifier of the analytical method | ID(2)
methods_reference_samples | Reference_sample_ID | Reference sample name (international standard) | ID(1)
reference_samples | ID | Reference sample name (international standard) | ID(2)
reference_samples | sample_name | Name of international standard in literature | CHAR
reference_samples | sample_measured_value | Reference measurement value | NUM
reference_samples | uncertainty | Reference standard deviation | NUM
reference_samples | uncertainty_unit | Reference uncertainty unit: %, ppm, 1σ or 2σ | CHAR
reference_samples | number_of_measurements | Number of measurements | NUM
methods_normalizations | Method_ID | Unique identifier of the analytical method | ID(2)
methods_normalizations | Normalization_ID | Reference sample name used for normalization | ID(1)
normalizations | ID | Reference sample name used for normalization | ID(2)
normalizations | reference_sample_name | Reference sample name used for normalization | CHAR
normalizations | reference_sample_accepted_value | Reference measurement accepted value used for normalization | NUM
normalizations | citation | Reference of reference sample in the literature | CHAR
references | Source_ID | Unique identifier of the bibliographic reference | ID(2)
references | Sample_ID | Unique identifier of the analyzed sample | ID(2)
references | scope | Scope of sample information documented by reference: sample, artefact or site | CHAR(CV)

(a) See Zotero item types and fields at: https://www.zotero.org/support/kb/item_types_and_fields.

(b) CHAR, character; NUM, number.

(c) CV, controlled vocabulary.

(d) ID(1), primary key; ID(2), foreign key.
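The primary and foreign key relationships listed in the table can be sketched with Python's built-in `sqlite3` module. This is a reduced, hypothetical schema for illustration, not the schema shipped with the release: it keeps only a few columns of the samples, methods and measurements tables, and all inserted values (identifiers, sample name, technique) are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Reduced schema: ID(1) columns become primary keys, ID(2) columns foreign keys.
conn.executescript("""
CREATE TABLE samples (
    ID TEXT PRIMARY KEY,
    sample_name TEXT,
    sample_category TEXT
);
CREATE TABLE methods (
    ID TEXT PRIMARY KEY,
    parameter TEXT,
    technique TEXT
);
CREATE TABLE measurements (
    Sample_ID TEXT NOT NULL REFERENCES samples(ID),
    Method_ID TEXT NOT NULL REFERENCES methods(ID),
    parameter TEXT,
    value REAL
);
""")

# Placeholder rows, invented for this example.
conn.execute("INSERT INTO samples VALUES ('demo-sample-1', 'Adze flake A', 'artefact')")
conn.execute("INSERT INTO methods VALUES ('demo-method-1', 'SiO2', 'XRF')")
conn.execute("INSERT INTO measurements VALUES ('demo-sample-1', 'demo-method-1', 'SiO2', 47.2)")

# A measurement pointing at a sample that does not exist is rejected.
try:
    conn.execute("INSERT INTO measurements VALUES ('missing-sample', 'demo-method-1', 'SiO2', 50.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Following the keys joins each value to its sample and analytical method.
rows = conn.execute("""
    SELECT s.sample_name, m.parameter, m.value, me.technique
    FROM measurements AS m
    JOIN samples AS s ON s.ID = m.Sample_ID
    JOIN methods AS me ON me.ID = m.Method_ID
""").fetchall()
print(orphan_rejected, rows)
```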

Technical Validation

Quality control of data and editorial procedures include:

Data review: Database contributors who submit a new dataset are asked to act as the editor of that dataset and to review it for potentially missing or inaccurate data. The content of new datasets is systematically cross-checked with the content of original sources and with potentially related content. Authors are contacted when information is missing or when clarifications are needed.

Duplicate detection: Since Pofatu assigns semantic, unique identifiers to the objects in the database, and links data from additional tables using these keys (following the recommendations by Wilson and colleagues[177]), data consistency can be checked automatically, e.g. by detecting multiple conflicting measurements of the same parameter in the same analysis, or conflicting sample metadata.

User feedback: Data and metadata issues can be reported to pofatu@shh.mpg.de. Editors will be contacted if an issue with one of their datasets is reported.
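The automatic consistency check described under duplicate detection could look roughly like the sketch below. It assumes that "the same analysis" can be identified by the pair of `Sample_ID` and `Method_ID`; that grouping key, the function name `find_conflicts`, and the example rows are assumptions for illustration and do not reproduce the checks implemented in pypofatu.

```python
from collections import defaultdict


def find_conflicts(measurements):
    """Return keys that carry more than one distinct value (hypothetical helper).

    Rows are grouped by (Sample_ID, Method_ID, parameter). Exact repeats of the
    same value are duplicates but not conflicts, so they are not reported.
    """
    seen = defaultdict(set)
    for row in measurements:
        key = (row["Sample_ID"], row["Method_ID"], row["parameter"])
        seen[key].add(row["value"])
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}


# Invented rows for illustration only.
rows = [
    {"Sample_ID": "demo-1", "Method_ID": "demo-m1", "parameter": "SiO2", "value": 47.2},
    {"Sample_ID": "demo-1", "Method_ID": "demo-m1", "parameter": "SiO2", "value": 48.9},
    {"Sample_ID": "demo-1", "Method_ID": "demo-m1", "parameter": "MgO", "value": 6.1},
    {"Sample_ID": "demo-2", "Method_ID": "demo-m1", "parameter": "SiO2", "value": 47.2},
    {"Sample_ID": "demo-2", "Method_ID": "demo-m1", "parameter": "SiO2", "value": 47.2},
]

conflicts = find_conflicts(rows)
print(conflicts)
```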

Usage Notes

The Pofatu Database provides an analysis-friendly environment[178] that enables transparency and built-in reproducibility of analytical tasks, which can be carried out with freely available software or web browsers[25]. Since the metadata provided with the CSV-formatted data files has information about data types as well as relations between the tables making up the dataset, the dataset is automatically loaded into an SQLite database (cf. https://sqlite.org/appfileformat.html, accessed January 29, 2020) for the convenience of users. This SQLite database is contained in a single file that can be queried with a high-level query language, has accessible content, is cross-platform and performant, and can be used with multiple programming languages. The Python package pypofatu used for curating the dataset also provides functionality (built-in SQLite driver) that enables access and queries of the data with Python programs or the pypofatu API, and facilitates running SQL queries against the SQLite database. Complex queries can be created in various ways and with different computing environments:

- using the SQL command line
- using SQL browsers such as SQLite manager or SQLite reader
- using R, with SQL code in a notebook or packages such as sqldf or dplyr[179,180]
- using the Datasette tool[181]

Data usage instructions are provided in the GitHub repository where the dataset is curated (cf. https://github.com/pofatu/pofatu-data, accessed February 6, 2020). A “cookbook” collects shareable pieces of code and how-to instructions to query the relational database (cf. https://github.com/pofatu/pofatu-data/blob/master/doc/cookbook.md, accessed February 6, 2020), and users are invited to contribute the “recipes” they used for “cooking” with Pofatu.
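To give a flavour of such queries, the sketch below loads a few CSV rows into an in-memory SQLite database using only the Python standard library and pivots the long measurements table into one row per sample. The rows are invented, the table is reduced to three columns, and the sketch does not use the pypofatu API. With the released SQLite file, the same query style would be run after opening that file with `sqlite3.connect`, whose path is not assumed here.

```python
import csv
import io
import sqlite3

# Invented long-format rows: one line per sample and parameter.
MEASUREMENTS_CSV = """Sample_ID,parameter,value
demo-1,SiO2,47.2
demo-1,MgO,6.1
demo-2,SiO2,49.8
demo-2,MgO,5.4
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (Sample_ID TEXT, parameter TEXT, value REAL)")

# csv.DictReader yields one dict per row, which matches the named placeholders.
reader = csv.DictReader(io.StringIO(MEASUREMENTS_CSV))
conn.executemany(
    "INSERT INTO measurements VALUES (:Sample_ID, :parameter, :value)",
    reader,
)

# Pivot: one row per sample, one column per geochemical parameter.
wide = conn.execute("""
    SELECT Sample_ID,
           MAX(CASE WHEN parameter = 'SiO2' THEN value END) AS SiO2,
           MAX(CASE WHEN parameter = 'MgO' THEN value END) AS MgO
    FROM measurements
    GROUP BY Sample_ID
    ORDER BY Sample_ID
""").fetchall()

for row in wide:
    print(row)
```

A wide layout of this kind is often the starting point for statistical comparison of artefacts and sources, since each sample becomes a single vector of element concentrations.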
Measurement(s): isotopic composition • chemical composition • contextual information for archaeological sources • contextual information for stone artefacts
Technology Type(s): digital curation • computational modeling technique
Factor Type(s): archaeological provenance • artefact attribute
Sample Characteristic - Location: Region (Papua New Guinea) • Vanuatu Islands • Solomon Islands • Fiji islands • Tonga Archipelago • Samoa • American Samoa • Wallis and Futuna Islands • Tuvalu Islands • Tokelau Islands • Rotuma Island Group • Cook Islands • Society Islands • Marquesas Islands • Archipel des Tuamotu • Gambier Islands • Austral Islands • Pitcairn, Henderson, Ducie and Oeno Islands • Easter Island • State of Hawaii • Micronesia • New Zealand