AusTraits, a curated plant trait database for the Australian flora.

Daniel Falster1, Rachael Gallagher2,3, Elizabeth H Wenk4, Ian J Wright2, Dony Indiarto4, Samuel C Andrew5, Caitlan Baxter4, James Lawson6, Stuart Allen2, Anne Fuchs7, Anna Monro7, Fonti Kar4, Mark A Adams8, Collin W Ahrens3, Matthew Alfonzetti2, Tara Angevin9, Deborah M G Apgaua10, Stefan Arndt11, Owen K Atkin12, Joe Atkinson4, Tony Auld13, Andrew Baker14, Maria von Balthazar15, Anthony Bean16, Chris J Blackman17, Keith Bloomfield18, David M J S Bowman17, Jason Bragg19, Timothy J Brodribb17, Genevieve Buckton20, Geoff Burrows21, Elizabeth Caldwell22, James Camac23, Raymond Carpenter24, Jane A Catford25, Gregory R Cawthray26, Lucas A Cernusak27, Gregory Chandler28, Alex R Chapman29, David Cheal30, Alexander W Cheesman20, Si-Chong Chen31, Brendan Choat3, Brook Clinton7, Peta L Clode26, Helen Coleman29, William K Cornwell4, Meredith Cosgrove12, Michael Crisp12, Erika Cross21, Kristine Y Crous3, Saul Cunningham32, Timothy Curran33, Ellen Curtis34, Matthew I Daws35, Jane L DeGabriel36, Matthew D Denton37, Ning Dong2, Pengzhen Du38, Honglang Duan39, David H Duncan11, Richard P Duncan40, Marco Duretto41, John M Dwyer42, Cheryl Edwards43, Manuel Esperon-Rodriguez3, John R Evans12, Susan E Everingham4, Claire Farrell11, Jennifer Firn44, Carlos Roberto Fonseca45, Ben J French17, Doug Frood46, Jennifer L Funk47, Sonya R Geange12, Oula Ghannoum3, Sean M Gleason48, Carl R Gosper49, Emma Gray2, Philip K Groom50, Saskia Grootemaat4, Caroline Gross51, Greg Guerin52, Lydia Guja7, Amy K Hahs53, Matthew Tom Harrison54, Patrick E Hayes26, Martin Henery55, Dieter Hochuli56, Jocelyn Howell57, Guomin Huang58, Lesley Hughes2, John Huisman59, Jugoslav Ilic11, Ashika Jagdish4, Daniel Jin56, Gregory Jordan17, Enrique Jurado60, John Kanowski61, Sabine Kasel11, Jürgen Kellermann62, Belinda Kenny63, Michele Kohout64, Robert M Kooyman2, Martyna M Kotowska65, Hao Ran Lai66, Etienne Laliberté67, Hans Lambers26, Byron B Lamont50, Robert Lanfear68, Frank van Langevelde69, Daniel C Laughlin70, 
Bree-Anne Laugier-Kitchener2, Susan Laurance20, Caroline E R Lehmann71, Andrea Leigh34, Michelle R Leishman2, Tanja Lenz2, Brendan Lepschi7, James D Lewis72, Felix Lim73, Udayangani Liu31, Janice Lord74, Christopher H Lusk75, Cate Macinnis-Ng76, Hannah McPherson41, Susana Magallón77, Anthony Manea2, Andrea López-Martinez77, Margaret Mayfield42, James K McCarthy78, Trevor Meers79, Marlien van der Merwe19, Daniel J Metcalfe5, Per Milberg80, Karel Mokany5, Angela T Moles4, Ben D Moore3, Nicholas Moore9, John W Morgan9, William Morris11, Annette Muir64, Samantha Munroe52, Áine Nicholson17, Dean Nicolle81, Adrienne B Nicotra12, Ülo Niinemets82, Tom North7, Andrew O'Reilly-Nugent40, Odhran S O'Sullivan83, Brad Oberle84, Yusuke Onoda85, Mark K J Ooi86, Colin P Osborne87, Grazyna Paczkowska29, Burak Pekin88, Caio Guilherme Pereira89, Catherine Pickering90, Melinda Pickup91, Laura J Pollock92, Pieter Poot27, Jeff R Powell3, Sally A Power3, Iain Colin Prentice18, Lynda Prior17, Suzanne M Prober5, Jennifer Read22, Victoria Reynolds42, Anna E Richards5, Ben Richardson93, Michael L Roderick12, Julieta A Rosell77, Maurizio Rossetto41, Barbara Rye93, Paul D Rymer3, Michael A Sams42, Gordon Sanson22, Hervé Sauquet41, Susanne Schmidt94, Jürg Schönenberger15, Ernst-Detlef Schulze95, Kerrie Sendall96, Steve Sinclair65, Benjamin Smith3, Renee Smith3, Fiona Soper97, Ben Sparrow52, Rachel J Standish98, Timothy L Staples42, Ruby Stephens2, Christopher Szota11, Guy Taseski4, Elizabeth Tasker13, Freya Thomas11, David T Tissue3, Mark G Tjoelker3, David Yue Phin Tng10, Félix de Tombeur99, Kyle Tomlinson100, Neil C Turner26, Erik J Veneklaas26, Susanna Venn101, Peter Vesk11, Carolyn Vlasveld22, Maria S Vorontsova31, Charles A Warren56, Nigel Warwick51, Lasantha K Weerasinghe102, Jessie Wells42, Mark Westoby2, Matthew White64, Nicholas S G Williams11, Jarrah Wills56, Peter G Wilson103, Colin Yates49, Amy E Zanne104,105, Graham Zemunik26, Kasia Ziemińska73.   

Abstract

We introduce the AusTraits database - a compilation of values of plant traits for taxa in the Australian flora (hereafter AusTraits). AusTraits synthesises data on 448 traits across 28,640 taxa from field campaigns, published literature, taxonomic monographs, and individual taxon descriptions. Traits vary in scope from physiological measures of performance (e.g. photosynthetic gas exchange, water-use efficiency) to morphological attributes (e.g. leaf area, seed mass, plant height) which link to aspects of ecological variation. AusTraits contains curated and harmonised individual- and species-level measurements coupled to, where available, contextual information on site properties and experimental conditions. This article provides information on version 3.0.2 of AusTraits which contains data for 997,808 trait-by-taxon combinations. We envision AusTraits as an ongoing collaborative initiative for easily archiving and sharing trait data, which also provides a template for other national or regional initiatives globally to fill persistent gaps in trait knowledge.
© 2021. The Author(s).

Year:  2021        PMID: 34593819      PMCID: PMC8484355          DOI: 10.1038/s41597-021-01006-6

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

Species traits are essential for comparing ecological strategies among plants, both within any given vegetation and across environmental space or evolutionary lineages[1-4]. Broadly, a trait is any measurable property of a plant capturing aspects of its structure or function[5-8]. Traits thereby provide useful indicators of species’ behaviours in communities and ecosystems, regardless of their taxonomy[8-10]. Through global initiatives the volume of available trait information for plants has grown rapidly in the last two decades[11,12]. However, the geographic coverage of trait measurements across the globe is patchy, limiting detailed analyses of trait variation and diversity in some regions, and, more generally, development of theory accounting for the diversity of plant strategies. One such region where trait data is sparsely documented is Australia, a continent with a flora of c. 28,900 native vascular plant taxa[13] (including species, subspecies, varietas and forma). While significant investment has been made in curating and digitising herbarium collections and observation records in Australia over the last two decades (e.g. the Australian Virtual Herbarium houses ~7 million specimen occurrence records; https://avh.ala.org.au), no complementary resource yet exists for consolidating information on plant traits. Moreover, relatively few Australian species are represented in the leading global databases. For example, the international TRY database[12] has measurements for only 3,830 Australian species across all collated traits. This level of species coverage limits our ability to use traits to understand and ultimately manage Australian vegetation[14].
While initiatives such as TRY[12] and the Open Traits Network[15] are working towards global synthesis of trait data, a stronger representation of Australian plant taxa in these efforts is essential, especially given the high richness and endemicity of this continental flora, and the unique contribution this makes to global floral diversity[16,17]. Here we introduce the AusTraits database (hereafter AusTraits), a compilation of plant traits for the Australian flora. Currently, AusTraits draws together 283 distinct sources and contains 997,808 measurements spread across 448 different traits for 28,640 taxa. To assemble AusTraits from diverse primary sources and make data available for reuse, we needed to overcome three main types of challenges (Fig. 1): (1) Accessing data from diverse original sources, including field studies, online databases, scientific articles, and published taxonomic floras; (2) Harmonising these diverse sources into a federated resource, with common taxon names, units, trait names, and data formats; and (3) Distributing versions of the data under a suitable license. To meet these challenges, we developed a workflow which draws on emerging community standards and our collective experience building trait databases.
Fig. 1

The data curation pathway used to assemble the AusTraits database. Trait measurements are accessed from original data sources, including published floras and field campaigns. Features such as variable names, units and taxonomy are harmonised to a common standard. Versioned releases are distributed to users, allowing the dataset to be used and re-used in a reproducible way.

By providing a harmonised and curated dataset on 448 plant traits, AusTraits contributes substantially to filling the gap in Australian and global biodiversity resources. Prior to the development of AusTraits, data on Australian plant traits existed largely as a series of disconnected datasets collected by individual laboratories or initiatives.
AusTraits has been developed as a standalone database, rather than as part of the existing global database TRY[12], for three reasons. First, we sought to establish an engaged and localised community, actively collaborating to enhance coverage of plant trait data within Australia. We envisioned that a community would form more readily to fill gaps in national knowledge of traits with local ownership of the resource. While we will never have a counterfactual, a vibrant community excited to be part of this initiative has indeed been established, and coverage for Australian species is much higher than has been achieved in TRY since its inception. Local ownership also aligns well with funding opportunities and national research priorities, and enables database coordinators to progress at their own speed. Second, we wanted to apply an entirely open-source approach to the aggregation workflow. All the code and raw files used to create the compiled database are available, and the database is freely available via a third-party data repository (Zenodo), which is itself built for long-term data archiving, with an established API. Finally, we targeted primary data sources, where possible, whereas TRY accepts aggregated datasets. The hope was that this would increase data quality by removing intermediaries and making duplicates easier to identify.
While independent, the overall structure of AusTraits is similar to that of TRY, ensuring the two databases will be interoperable. Both databases are founded on similar principles and terminology[18,19]. Increasingly, researchers and biodiversity portals are seeking to connect diverse datasets[15], which is possible if they share a common foundation.
We envision AusTraits as an ongoing collaborative initiative for easily archiving and sharing trait data about the Australian flora. Open access to a comprehensive resource like this will generate significant new knowledge about the Australian flora across multiple scales of interest, as well as reduce duplication of effort in the compilation of plant trait data, particularly for research students and government agencies seeking to access information on traits. In coming years, AusTraits will continue to be expanded, with integrations into other biodiversity platforms and expansion of coverage into historically neglected plant lineages in trait science, such as pteridophytes (lycophytes and ferns). Further, through international initiatives, such as the Open Traits Network, linkages are being forged between plant datasets and a variety of other organismal databases[15].

Methods

Primary sources

AusTraits version 3.0.2 was assembled from 283 distinct sources, including published papers, field measurements, glasshouse and field experiments, botanical collections, and taxonomic treatments. Initially we identified a list of candidate traits of interest, then identified primary sources containing measurements for these traits, before contacting authors for access. As the compilation grew, we expanded the list of traits considered to include any measurable quantity that had been quantified for at least a moderate number of taxa (n > 20). For a small subset of sources from herbaria, providing a text description of taxa, we used regular expressions in R to extract measurements of traits from the text. A variety of expressions were developed to extract height, leaf/seed dimensions and growth form. Error checking was completed on approximately 60% of mined measurements by visually inspecting the extracted values relative to the textual descriptions.
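The text-mining step described above can be illustrated with a minimal sketch. The original extraction was done with regular expressions in R, and the actual expressions are not given here, so the pattern below is a hypothetical one written in Python purely for illustration. It assumes heights are phrased like "2-5 m high" or "to 30 m tall"; real flora descriptions will need many more variants.

```python
import re

# Hypothetical pattern (not the authors' expression): an optional lower bound,
# an upper bound, a unit, then the word "high" or "tall".
HEIGHT_PATTERN = re.compile(
    r"(?:(?P<min>\d+(?:\.\d+)?)\s*[-–]\s*)?"
    r"(?P<max>\d+(?:\.\d+)?)\s*(?P<unit>cm|m)\s+(?:high|tall)"
)

# Conversion factors to metres for the units the pattern accepts.
UNIT_TO_M = {"m": 1.0, "cm": 0.01}


def extract_height(description):
    """Return (min_m, max_m) parsed from a text description, or None if no match."""
    match = HEIGHT_PATTERN.search(description)
    if match is None:
        return None
    factor = UNIT_TO_M[match.group("unit")]
    max_m = float(match.group("max")) * factor
    min_m = float(match.group("min")) * factor if match.group("min") else None
    return (min_m, max_m)


print(extract_height("Shrub 2-5 m high, leaves 3-6 cm long."))
print(extract_height("Tree to 30 m tall with smooth bark."))
print(extract_height("Leaves 3-6 cm long."))
```

Because patterns like this miss unusual phrasings and can match the wrong number, a manual check of extracted values against the source text, as reported above, remains an important safeguard.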

Trait definitions

A full list of traits and their sources appears in Supplementary Table 1[20-354]. The list of sources in AusTraits was developed gradually as new datasets were incorporated, drawing from original source publications and a published thesaurus of plant characteristics[19]. We categorised traits based on the tissue where they are measured (bark, leaf, reproductive, root, stem, whole plant) and the type of measurement (allocation, life history, morphology, nutrient, physiological). Version 3.0.2 of AusTraits includes 358 numeric and 90 categorical traits.

Database structure

The schema of AusTraits broadly follows the principles of the established Observation and Measurement Ontology[18] in that, where available, trait data are connected to contextual information about the collection (e.g. location coordinates, light levels, whether data were collected in the field or lab) and information about the methods used to derive measurements (e.g. number of replicates, equipment used). The database contains 11 elements, as described in Table 1. This format was developed to include information about the trait measurements, taxon, methods, sites, contextual information, people involved, and citation sources.
Table 1

Main elements of the harmonised AusTraits database. See Tables 2–8 for details on each component.

Element | Contents
traits | A table containing measurements of plant traits.
sites | A table containing observations of site characteristics associated with information in ‘traits’. Cross referencing between the two dataframes is possible using combinations of the variables ‘dataset_id’, ‘site_name’.
contexts | A table containing observations of contextual characteristics associated with information in ‘traits’. Cross referencing between the two dataframes is possible using combinations of the variables ‘dataset_id’, ‘context_name’.
methods | A table containing details on methods with which data were collected, including time frame and source.
excluded_data | A table of data that did not pass quality tests and so were excluded from the master dataset.
taxa | A table containing details on taxa associated with information in ‘traits’. This information has been sourced from the APC (Australian Plant Census) and APNI (Australian Plant Name Index) and is released under a CC-BY3 license.
definitions | A copy of the definitions for all tables and terms. Information included here was used to process data and generate any documentation for the study.
sources | Bibtex entries for all primary and secondary sources in the compilation.
contributors | A table of people contributing to each study.
taxonomic_updates | A table of all taxonomic changes implemented in the construction of AusTraits. Changes are determined by comparing against the APC (Australian Plant Census) and APNI (Australian Plant Name Index).
build_info | A description of the computing environment used to create this version of the dataset, including version number, git commit and R session_info.
Table 2

Structure of the traits table, containing measurements of plant traits.

key | value
dataset_id | Primary identifier for each study contributed into AusTraits; most often these are scientific papers, books, or online resources. By default should be name of first author and year of publication, e.g. ‘Falster_2005’.
taxon_name | Currently accepted name of taxon in the Australian Plant Census or in the Australian Plant Name Index.
site_name | Name of site where individual was sampled. Cross-references to identical columns in ‘sites’ and ‘traits’.
context_name | Name of contextual scenario where individual was sampled. Cross-references to identical columns in ‘contexts’ and ‘traits’.
observation_id | A unique identifier for the observation, useful for joining traits coming from the same ‘observation_id’. These are assigned automatically, based on the ‘dataset_id’ and row number of the raw data.
trait_name | Name of trait sampled.
value | Measured value.
unit | Units of the sampled trait value after aligning with AusTraits standards.
date | Date sample was taken, in the format ‘yyyy-mm-dd’, but with days and months only when specified.
value_type | A categorical variable describing the type of trait value recorded.
replicates | Number of replicate measurements that comprise the data points for the trait for each measurement. A numeric value (or range) is ideal and appropriate if the value type is a ‘mean’, ‘median’, ‘min’ or ‘max’. For these value types, if replication is unknown the entry should be ‘unknown’. If the value type is ‘raw_value’ the replicate value should be 1. If the value type is ‘expert_mean’, ‘expert_min’, or ‘expert_max’ the replicate value should be ‘na’.
original_name | Name given to taxon in the original data supplied by the authors.
Table 8

Structure of the contributors table, of people contributing to each study.

key | value
dataset_id | Primary identifier for each study contributed into AusTraits; most often these are scientific papers, books, or online resources. By default should be name of first author and year of publication, e.g. ‘Falster_2005’.
name | Name of contributor
institution | Last known institution or affiliation
role | Their role in the study
For storage efficiency, the main table of traits contains relatively little information (Table 2), but can be cross linked against other tables (Tables 3–8) using identifiers for dataset, site, context, observation, and taxon (Table 1). The dataset_id is ordinarily the surname of the first author and year of publication associated with the source’s primary citation (e.g. Blackman_2014). Trait values were also recorded as being one of several possible value types (value_type) (Table 9), reflecting the type of measurement submitted by the contributor, as different sources provide different levels of detail. Possible values include raw_value, individual_mean, site_mean, multisite_mean, expert_mean, experiment_mean. Further details on the methods used for collecting each trait are provided in a methods table (Table 5).
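The cross-linking described above can be sketched as a join on the shared identifiers. The example below is a minimal illustration in plain Python rather than the austraits R package; the rows are invented placeholder values, and the helper names `site_lookup` and `attach_sites` are hypothetical. It assumes only the column names documented for the ‘traits’ and ‘sites’ tables.

```python
from collections import defaultdict

# Invented example rows using the documented column names.
traits = [
    {"dataset_id": "Falster_2005", "site_name": "site_A", "taxon_name": "Taxon one",
     "trait_name": "plant_height", "value": "2.5", "unit": "m"},
    {"dataset_id": "Falster_2005", "site_name": "site_B", "taxon_name": "Taxon two",
     "trait_name": "plant_height", "value": "0.4", "unit": "m"},
]
sites = [
    {"dataset_id": "Falster_2005", "site_name": "site_A",
     "site_property": "latitude (deg)", "value": "-33.7"},
    {"dataset_id": "Falster_2005", "site_name": "site_A",
     "site_property": "longitude (deg)", "value": "151.1"},
]


def site_lookup(site_rows):
    """Collapse the long-format sites table into one dict of properties per site."""
    lookup = defaultdict(dict)
    for row in site_rows:
        key = (row["dataset_id"], row["site_name"])
        lookup[key][row["site_property"]] = row["value"]
    return lookup


def attach_sites(trait_rows, site_rows):
    """Left-join site properties onto trait rows via (dataset_id, site_name)."""
    lookup = site_lookup(site_rows)
    joined = []
    for row in trait_rows:
        key = (row["dataset_id"], row["site_name"])
        # Trait rows without a matching site keep an empty property dict.
        joined.append({**row, "site": dict(lookup.get(key, {}))})
    return joined


for record in attach_sites(traits, sites):
    print(record["taxon_name"], record["site"])
```

Keying on the pair of identifiers, rather than on ‘site_name’ alone, matters because different studies may reuse the same site label.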
Table 3

Structure of the sites table, containing observations of site characteristics associated with information in traits.

key | value
dataset_id | Primary identifier for each study contributed into AusTraits; most often these are scientific papers, books, or online resources. By default should be name of first author and year of publication, e.g. ‘Falster_2005’.
site_name | Name of site where individual was sampled. Cross-references to identical columns in ‘sites’ and ‘traits’.
site_property | The site characteristic being recorded. Name should include units of measurement, e.g. ‘longitude (deg)’. Ideally we have at least these variables for each site - ‘longitude (deg)’, ‘latitude (deg)’, ‘description’.
value | Measured value.
Table 9

Possible value types of trait records.

key | value
raw_value | Value is a direct measurement
site_min | Value is the minimum of measurements on multiple individuals of the taxon at a single site
site_mean | Value is the mean or median of measurements on multiple individuals of the taxon at a single site
site_max | Value is the maximum of measurements on multiple individuals of the taxon at a single site
multisite_min | Value is the minimum of measurements on multiple individuals of the taxon across multiple sites
multisite_mean | Value is the mean or median of measurements on multiple individuals of the taxon across multiple sites
multisite_max | Value is the maximum of measurements on multiple individuals of the taxon across multiple sites
expert_min | Value is the minimum observed for a taxon across its range or in this particular dataset, as estimated by an expert based on their knowledge of the taxon. Data fitting this category include estimates from floras that represent a taxon’s entire range.
expert_mean | Value is the mean observed for a taxon across its range or in this particular dataset, as estimated by an expert based on their knowledge of the taxon. Data fitting this category include estimates from floras that represent a taxon’s entire range, and values for categorical variables obtained from a reference book, or identified by an expert.
expert_max | Value is the maximum observed for a taxon across its range or in this particular dataset, as estimated by an expert based on their knowledge of the taxon. Data fitting this category include estimates from floras that represent a taxon’s entire range.
experiment_min | Value is the minimum of measurements from an experimental study either in the field or a glasshouse
experiment_mean | Value is the mean or median of measurements from an experimental study either in the field or a glasshouse
experiment_max | Value is the maximum of measurements from an experimental study either in the field or a glasshouse
individual_mean | Value is a mean of replicate measurements on an individual (usually for experimental ecophysiology studies)
individual_max | Value is a maximum of replicate measurements on an individual (usually for experimental ecophysiology studies)
literature_source | Value is a site or multi-site mean that has been sourced from an unknown literature source
unknown | Value type is not currently known
Table 5

Structure of the methods table, containing details on methods with which data were collected, including time frame and source.

key | value
dataset_id | Primary identifier for each study contributed into AusTraits; most often these are scientific papers, books, or online resources. By default should be name of first author and year of publication, e.g. ‘Falster_2005’.
trait_name | Name of trait sampled. Allowable values specified in the table ‘traits’.
methods | A textual description of the methods used to collect the trait data. Whenever available, methods are taken near-verbatim from referenced source. Methods can include descriptions such as ‘measured on botanical collections’, ‘data from the literature’, or a detailed description of the field or lab methods used to collect the data.
year_collected_start | The year data collection commenced.
year_collected_end | The year data collection was completed.
description | A 1–2 sentence description of the purpose of the study.
collection_type | A field to indicate where the majority of plants on which traits were measured were collected - in the ‘field’, ‘lab’, ‘glasshouse’, ‘botanical collection’, or ‘literature’. The latter should only be used when the data were sourced from the literature and the collection type is unknown.
sample_age_class | A field to indicate if the study was completed on ‘adult’ or ‘juvenile’ plants.
sampling_strategy | A written description of how study sites were selected and how study individuals were selected. When available, this information is copied verbatim from a published manuscript. For botanical collections, this field ideally indicates which records were ‘sampled’ to measure a specific trait.
source_primary_citation | Citation for primary source. This detail is generated from the primary source in the metadata.
source_primary_key | Citation key for primary source in ‘sources’. The key is typically of format ‘Surname_year’.
source_secondary_citation | Citations for secondary source. This detail is generated from the secondary source in the metadata.
source_secondary_key | Citation key for secondary source in ‘sources’. The key is typically of format ‘Surname_year’.

Harmonisation

To harmonise each source into the common AusTraits format we applied a reproducible and transparent workflow (Fig. 1), written in R[355], using custom code and the packages tidyverse[356], yaml[357], remake[358], knitr[359], and rmarkdown[360]. In this workflow, we performed a series of operations, including reformatting data into a standardised format, generating observation ids for each set of linked measurements, transforming variable names into common terms, transforming data into common units, standardising terms (trait values) for categorical variables, encoding suitable metadata, and flagging data that did not pass quality checks.
Details from each primary source were saved with minimal modification into two plain text files. The first file, data.csv, contains the actual trait data in comma-separated values format. The second file, metadata.yml, contains relevant metadata for the study, as well as options for mapping trait names and units onto standard types, and any substitutions applied to the data in processing. These two files provide all the information needed to compile each study into a standardised AusTraits format. Successive versions of AusTraits iterate through the steps in Fig. 1, to incorporate new data and correct identified errors, leading to a high-quality, harmonised dataset.
After importing a study, we generated a detailed report which summarised the study’s metadata and compared the study’s data values to those collected by other studies for the same traits. Data for continuous and categorical variables are presented in scatter plots and tables respectively. These reports allow first the AusTraits data curator, and then the data contributor, to rapidly scan the metadata to confirm it has been entered correctly, and to check that the trait data have been assigned the correct units and that categorical trait values are properly aligned with AusTraits trait values.
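A minimal sketch of the harmonisation steps may help make the roles of the two files concrete. It is written in Python for illustration only, whereas the actual workflow is in R. The metadata layout (keys such as `var_in` and `unit_in`), the example dataset `Example_2020`, the conversion table and the observation id format are all assumptions made for this sketch, not the real AusTraits schema; a plain dict stands in for the parsed metadata.yml.

```python
import csv
import io

# Stand-in for a study's data.csv (invented values).
RAW_CSV = """species,height_cm,growth
Taxon one,250,T
Taxon two,40,S
"""

# Stand-in for the parsed metadata.yml; the key names are hypothetical.
METADATA = {
    "dataset_id": "Example_2020",
    "taxon_column": "species",
    "traits": [
        {"var_in": "height_cm", "trait_name": "plant_height", "unit_in": "cm"},
        {"var_in": "growth", "trait_name": "plant_growth_form", "unit_in": None},
    ],
    "substitutions": {"plant_growth_form": {"T": "tree", "S": "shrub"}},
}

# Assumed standard unit per numeric trait, with factors from each input unit.
TO_STANDARD = {"plant_height": {"unit": "m", "factors": {"cm": 0.01, "m": 1.0}}}


def harmonise(raw_csv, metadata):
    """Reshape one study into long-format records with common names and units."""
    records = []
    reader = csv.DictReader(io.StringIO(raw_csv))
    for row_number, row in enumerate(reader, start=1):
        # One observation id per raw row links traits measured together.
        observation_id = f"{metadata['dataset_id']}_{row_number:02d}"
        for trait in metadata["traits"]:
            name = trait["trait_name"]
            value = row[trait["var_in"]]
            unit = trait["unit_in"]
            if name in TO_STANDARD:
                # Numeric trait: convert to the standard unit.
                standard = TO_STANDARD[name]
                value = float(value) * standard["factors"][unit]
                unit = standard["unit"]
            else:
                # Categorical trait: map source codes onto standard terms.
                value = metadata["substitutions"].get(name, {}).get(value, value)
            records.append({
                "dataset_id": metadata["dataset_id"],
                "original_name": row[metadata["taxon_column"]],
                "observation_id": observation_id,
                "trait_name": name,
                "value": value,
                "unit": unit,
            })
    return records


for record in harmonise(RAW_CSV, METADATA):
    print(record)
```

Keeping the raw data untouched and expressing every transformation in the metadata is what lets a study be recompiled from scratch whenever an error is found.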

Taxonomy

We developed a custom workflow to clean and standardise taxonomic names using the latest and most comprehensive taxonomic resources for the Australian flora: the Australian Plant Census (APC)[13] and the Australian Plant Name Index (APNI)[361]. These resources document all known taxonomic names for Australian plants, including currently accepted names and synonyms. While several automated tools exist for updating taxonomy, such as taxize[362], these do not currently include up-to-date information for Australian taxa. Updates were completed in two steps. In the first step, we used direct and then fuzzy matching (with up to 2 characters difference) to search for an alignment between reported names and those in three name sets: 1) All accepted taxa in the APC, 2) All known names in the APC, 3) All names in the APNI. Names were aligned without name authorities, as we found this information was rarely reported in the raw datasets provided to us. In the second step, we used the aligned name to update any outdated names to their current accepted name, using the information provided in the APC. If a name was recorded as being both an accepted name and an alternative (e.g. synonym) we preferred the accepted name, but also noted the alternative records. For phrase names, when a suitable match could not be found, we manually reviewed near matches via web portals such as the Atlas of Living Australia to find a suitable match. The final resource reports both the original and the updated taxon name alongside each trait record (Table 2), as well as an additional table summarising all taxonomic name changes (Table 6) and further information from the APC and APNI on all taxa included (Table 7). Any changes in taxonomy are exposed within the compiled dataset, enabling researchers to review these as needed.
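The first alignment step can be sketched as follows. This is an illustrative Python version, not the authors' R code. It assumes that "up to 2 characters difference" can be approximated by a Levenshtein edit distance of at most 2, and that direct matches are tried across all three name sets before any fuzzy match; the text does not specify either detail. The name sets below are tiny invented stand-ins for the APC and APNI lists.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, by dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def align_name(reported, name_sets, max_distance=2):
    """Return (matched_name, set_label, match_type), or None if nothing aligns.

    name_sets is a list of (label, set_of_names) in priority order.
    """
    for label, names in name_sets:
        if reported in names:
            return (reported, label, "direct")
    for label, names in name_sets:
        # Sort for a deterministic choice when several names tie.
        candidates = sorted(names, key=lambda n: (edit_distance(reported, n), n))
        if candidates and edit_distance(reported, candidates[0]) <= max_distance:
            return (candidates[0], label, "fuzzy")
    return None


# Invented, minimal stand-ins for the three name sets.
NAME_SETS = [
    ("APC accepted", {"Eucalyptus saligna", "Acacia longifolia"}),
    ("APC known", {"Eucalyptus saligna", "Acacia longifolia", "Mimosa longifolia"}),
    ("APNI", {"Eucalyptus saligna", "Acacia longifolia", "Mimosa longifolia"}),
]

print(align_name("Acacia longifolia", NAME_SETS))
print(align_name("Eucalyptus salinga", NAME_SETS))
print(align_name("Banksia unknownensis", NAME_SETS))
```

A small distance threshold catches common misspellings while limiting false alignments between genuinely different names, which is why unmatched names are better left for manual review than matched loosely.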
Table 6

Structure of the taxonomic_updates table, of all taxonomic changes implemented in the construction of AusTraits. Changes are determined by comparing against the APC (Australian Plant Census) and APNI (Australian Plant Name Index).

key | value
dataset_id | Primary identifier for each study contributed into AusTraits; most often these are scientific papers, books, or online resources. By default should be name of first author and year of publication, e.g. ‘Falster_2005’.
original_name | Name given to taxon in the original data supplied by the authors.
cleaned_name | Name of the taxon after implementing any changes encoded for this taxon in the metadata file for the corresponding ‘dataset_id’.
taxonIDClean | Where it could be identified, the ‘taxonID’ of the ‘cleaned_name’ for this taxon in the APC.
taxonomicStatusClean | Taxonomic status of the taxon identified by ‘taxonIDClean’ in the APC.
alternativeTaxonomicStatusClean | The status of alternative records with the name ‘cleaned_name’ in the APC.
acceptedNameUsageID | ID of the accepted name for taxon in the APC or APNI.
taxon_name | Currently accepted name of taxon in the APC or in the APNI.
Table 7

Structure of the taxa table, containing details on taxa associated with information in the traits table. This information has been sourced from the APC (Australian Plant Census) and APNI (Australian Plant Name Index) and is released under a CC-BY3 license.

key | value
taxon_name | Currently accepted name of taxon in the APC or in the APNI.
source | Source of taxonomic information, either APC or APNI.
acceptedNameUsageID | ID of the accepted name for taxon in the APC or APNI.
scientificNameAuthorship | Authority for taxon indicated under taxon_name.
taxonRank | Rank of the taxon.
taxonomicStatus | Taxonomic status of the taxon.
family | Family of the taxon.
genus | Genus of the taxon.
taxonDistribution | Known distribution of the taxon, by state.
ccAttributionIRI | Source of taxonomic information.

Data Records

Access

Static versions of AusTraits, including version 3.0.2 used in this descriptor, are available via Zenodo[363]. Data is released under a CC-BY license enabling reuse with attribution – being a citation of this descriptor and, where possible, original sources. Deposition within Zenodo helps make the dataset consistent with FAIR principles[364]. As an evolving data product, successive versions of AusTraits are being released, containing updates and corrections. Versions are labelled using semantic versioning to indicate the change between versions[365]. As validation (see Technical Validation, below) and data entry are ongoing, users are recommended to pull data from a static release, to ensure results in their downstream analyses remain consistent as the database is updated. The R package austraits (https://github.com/traitecoevo/austraits) provides easy access to data and examples on manipulating data (e.g. joining tables, subsetting) for those using this platform.
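Semantic version labels have the form MAJOR.MINOR.PATCH, so the size of a change between two releases can be read from the first component that differs. The sketch below, in Python, shows one way a user might check this before updating an analysis. The helper names are hypothetical, and every version string other than 3.0.2 is invented for illustration.

```python
def parse_version(text):
    """Split a 'MAJOR.MINOR.PATCH' label into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def change_level(old, new):
    """Name the first version component that differs between two releases."""
    labels = ("major", "minor", "patch")
    for label, a, b in zip(labels, parse_version(old), parse_version(new)):
        if a != b:
            return label
    return "none"


# 3.0.2 is the release described here; the other labels are hypothetical.
print(change_level("3.0.2", "3.0.3"))
print(change_level("3.0.2", "4.0.0"))
# Tuples compare numerically, so 3.0.10 correctly sorts after 3.0.2.
print(parse_version("3.0.10") > parse_version("3.0.2"))
```

Comparing parsed tuples rather than raw strings avoids the common mistake of sorting "3.0.10" before "3.0.2".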

Data coverage

The number of accepted vascular plant taxa in the APC (as of May 2020) is around 28,981[13]. Version 3.0.2 of AusTraits includes at least one record for 26,852 taxa (~93% of known taxa). Five traits (leaf_length, leaf_width, plant_height, life_history, plant_growth_form) have records for more than 50% of known species (Fig. 2a). Across all traits, the median number of taxa with records is 62. Supplementary Table 1 shows the number of studies, taxa, and families with data in AusTraits, as well as the number of geo-referenced records, for each trait. Looking across traits and tissue categories, coverage declined gradually, with moderate coverage (>20%) for more than 50 traits (Fig. 2). Coverage for root, stem and bark traits declined much faster than coverage for traits of other plant tissues (Fig. 2b).
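
The coverage figures above can be checked with a few lines of arithmetic. The sketch below uses only the two counts quoted in the text (28,981 accepted taxa and 26,852 taxa with at least one record). The variable names are arbitrary, and applying the 50% and 20% thresholds to the APC taxon count is an assumption about how those percentages were computed.

```python
import math

# Counts quoted in the text.
accepted_taxa = 28_981      # accepted vascular plant taxa in the APC (May 2020)
taxa_with_records = 26_852  # taxa with at least one record in AusTraits 3.0.2

# Overall coverage as a percentage of accepted taxa.
coverage_pct = 100 * taxa_with_records / accepted_taxa
print(f"overall coverage: {coverage_pct:.1f}%")

# Smallest number of taxa that exceeds a given share of the accepted taxa
# (assumes thresholds are taken relative to the APC count above).
def taxa_needed(share):
    return math.floor(share * accepted_taxa) + 1

print("taxa needed for >50% coverage:", taxa_needed(0.50))
print("taxa needed for >20% coverage:", taxa_needed(0.20))
```

The result of roughly 92.7% rounds to the ~93% reported, and the median of 62 taxa per trait sits far below either threshold, which is consistent with a small number of traits accounting for most of the broad coverage.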
Fig. 2

Coverage of traits by taxa. (a) Matrix showing the coverage of taxa for each trait, with yellow indicating presence of data. The figure was generated with a subset of 500 randomly selected taxa. (b) Number of taxa with data for first 100 traits for all traits and separated by tissue.

The most common records are non-geo-referenced trait values from floras; these represent a continental or regional mean (or spread) and hence are not linked to a location. Nevertheless, geo-referenced records were available for several traits for more than 10% of the flora (Fig. 3a). Coverage is notably higher for geo-referenced measurements of some tissues and trait types, such as bark, stems and roots, relative to non-geo-referenced measurements (Fig. 3).
Fig. 3

Number of taxa with trait records by plant tissue and trait category, for data that are (a) Geo-referenced, and (b) Not geo-referenced. Many records without a geo-reference come from botanical collections, such as floras.

Trait records are spread across the climate space of Australia (Fig. 4a), as well as across geographic locations (Fig. 4b). As with most data in Australia, the density of records was somewhat concentrated around cities, or along roads in remote regions.
Fig. 4

Coverage of geo-referenced trait records across Australian climatic and geographic space for traits in different categories. (a) AusTraits’ sites (orange) within Australia’s precipitation-temperature space (dark-grey) superimposed upon Whittaker’s classification of major biomes by climate[370]. Climate data were extracted at 10" resolution from WorldClim[371]. (b) Locations of geo-referenced records for different plant tissues.

Overall trait coverage across an estimated phylogenetic tree of Australian plant species is relatively unbiased (Fig. 5), though there are some notable exceptions. One exception is root traits, for which taxa within Poaceae have large amounts of information available relative to other plant families. Another is a cluster of taxa within the family Myrtaceae, largely from Western Australia, that have little leaf information available.
Fig. 5

Phylogenetic distribution of trait data in AusTraits for a subset of 2000 randomly sampled taxa. The heatmap colour intensity denotes the number of traits measured within a family for each plant tissue. The most widespread family names (with more than ten taxa) are labelled on the edge of the tree.

Comparing coverage in AusTraits with that in the global database TRY, we found 76 overlapping traits. For these, AusTraits tended to contain records for more taxa, but not always; multiple traits had more than 10 times as many taxa represented in AusTraits as in TRY (Fig. 6). However, there were more records in TRY for 25 traits, in particular physiological leaf traits. Many traits did not overlap between the two databases (Fig. 6). We noted that AusTraits includes more seed and fruit nutrient data, possibly reflecting the interest in Australia in understanding how fruit and seeds are provisioned in nutrient-depauperate environments. AusTraits also includes more categorical values, especially variables documenting different components of species’ fire response strategies, reflecting the importance of fire in shaping Australian communities and the research effort to document the different strategies species have evolved to succeed in fire-prone environments.
Fig. 6

The number of taxa with trait records in AusTraits and global TRY database (accessed 28 May 2020). Each point shows a separate trait.


Technical Validation

We implemented three strategies to maintain data quality. First, each source was reviewed in detail by both an AusTraits curator (primarily Wenk) and, where possible, the original contributor, based on a bespoke report showing all data and metadata. Measurements for each trait were plotted against all other values for the trait in AusTraits, allowing quick identification of outliers. Corrections suggested by contributors were incorporated into AusTraits and made available with the next release. Version 3.0.2 of AusTraits, described here, is the sixth release. Second, we implemented automated tests for each dataset, to confirm that values for continuous traits fall within the accepted range for the trait, and that values for categorical traits are on a list of allowed values. Data that did not pass these tests were moved to a separate spreadsheet (“excluded_data”) that is also made available for use and review. Third, we provide a pathway for user feedback. AusTraits is an open-source community resource and we encourage engagement from users in maintaining the quality and usability of the dataset. As such, we welcome reports of possible errors, as well as additions and edits to the online documentation for AusTraits that make using the existing data, or adding new data, easier for the community. Feedback can be posted as an issue directly at the project’s GitHub page (http://traitecoevo.github.io/austraits.build).
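
The second strategy, automated range and allowed-value checks, can be sketched as follows. This is not the AusTraits build code: the rule table TRAIT_RULES, the bounds and category lists in it, the record layout, and the function names are all hypothetical, chosen only to show the idea of routing failing values to an "excluded_data" collection instead of discarding them.

```python
# Sketch only: range checks for continuous traits and allowed-value checks
# for categorical traits. Trait names come from the text; every bound and
# allowed value below is a hypothetical placeholder.

TRAIT_RULES = {
    "plant_height": {"kind": "continuous", "min": 0.0, "max": 150.0},
    "life_history": {"kind": "categorical", "allowed": {"annual", "biennial", "perennial"}},
}


def check_record(record, rules=TRAIT_RULES):
    """Return None if the record passes, otherwise a short reason string."""
    rule = rules.get(record["trait_name"])
    if rule is None:
        return "unknown trait"
    if rule["kind"] == "continuous":
        try:
            number = float(record["value"])
        except (TypeError, ValueError):
            return "value is not numeric"
        if not rule["min"] <= number <= rule["max"]:
            return "value outside accepted range"
        return None
    if record["value"] not in rule["allowed"]:
        return "value not in allowed list"
    return None


def split_records(records, rules=TRAIT_RULES):
    """Separate passing records from failures, keeping failures with a reason."""
    kept, excluded_data = [], []
    for record in records:
        problem = check_record(record, rules)
        if problem is None:
            kept.append(record)
        else:
            excluded_data.append({**record, "error": problem})
    return kept, excluded_data


sample = [
    {"trait_name": "plant_height", "value": "12.5"},
    {"trait_name": "plant_height", "value": "-3"},
    {"trait_name": "life_history", "value": "perennial"},
    {"trait_name": "life_history", "value": "sometimes"},
]
kept, excluded_data = split_records(sample)
print(len(kept), "kept;", len(excluded_data), "excluded")
```

Keeping the failing rows together with the reason they failed mirrors the stated practice of publishing excluded data for review, so that a contributor can correct a unit or typo rather than lose the observation.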

Usage Notes

Each data release is available in two formats: first, as a compressed folder containing text files for each of the main components; second, as a compressed R object, enabling easy loading into R for those using that platform. Using the taxon names aligned with the APC, data can be queried against location data from the Atlas of Living Australia. To create the phylogenetic tree in Fig. 5, we pruned a master tree for all higher plants[366] using the package V.PhyloMaker[367] and visualised it via ggtree[368]. To create Fig. 4a, we used the package plotbiomes[369] to create the baseline plot of biomes.

Supplementary Table 1
Measurement(s) | plant trait
Technology Type(s) | digital curation
Sample Characteristic - Organism | Viridiplantae
Sample Characteristic - Location | Australia
Table 4

Structure of the contexts table, containing observations of contextual characteristics associated with information in traits.

key | value
dataset_id | Primary identifier for each study contributed into AusTraits; most often these are scientific papers, books, or online resources. By default should be name of first author and year of publication, e.g. ‘Falster_2005’.
context_name | Name of contextual scenario where individual was sampled. Cross-references to identical columns in ‘contexts’ and ‘traits’.
context_property | The contextual characteristic being recorded. Name should include units of measurement, e.g. ‘CO2 concentration (ppm)’.
value | Measured value.