Joscha Beninde, Erin M Toffelmier, Aarron Andreas, Celina Nishioka, Meryl Slay, Ashley Soto, Justin P Bueno, Germar Gonzalez, Hannah V Pham, Molly Posta, Jordan L Pace, H Bradley Shaffer.
Abstract
CaliPopGen is a database of population genetic data for native and naturalized eukaryotic species in California, USA. It summarizes the published literature (1985-2020) for 5,453 unique populations with genetic data from more than 187,394 individuals and 448 species (513 species plus subspecies) across molecular markers including allozymes, RFLPs, mtDNA, microsatellites, nDNA, and SNPs. Terrestrial habitats accounted for the majority (46.4%) of the genetic data. Taxonomic groups with the greatest representation were Magnoliophyta (20.31%), Insecta (13.4%), and Actinopterygii (12.85%). CaliPopGen also reports life-history data for most included species to enable analyses of the drivers of genetic diversity across the state. The large number of populations and wide taxonomic breadth will facilitate explorations of ecological patterns and processes across the varied geography of California. CaliPopGen covers all terrestrial and marine ecoregions of California and has a greater density of species and georeferenced populations than any previously published population genetic database. It is thus uniquely suited to inform conservation management at the regional and state levels across taxonomic groups.
Year: 2022 PMID: 35790740 PMCID: PMC9256587 DOI: 10.1038/s41597-022-01479-z
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Fig. 1. Taxonomic breakdown of species represented in the CaliPopGen database. Values in parentheses represent the total number of species as a percentage of the number of unique species in the database.
Fig. 2. The six predominant marker types included in the CaliPopGen database, demonstrating different publication trends through time. The grey bars in each panel are the total number of published studies across all marker types (and are the same in each panel).
Fig. 3. Maps of data contained in the CaliPopGen databases. (A) All unique sampling locations of both the population genetic (Dataset 1[21]) and pairwise comparison (Dataset 2[21]) data. The inset shows the location of California within the contiguous USA. (B) The number of unique populations in CaliPopGen per California ecoregion. Note the relative under-representation of inland desert regions (yellow) and over-representation of coastal ecoregions (purple-blue). (C) The number of unique populations of the population genetic Dataset 1[21] per 20 km raster cell. (D) The number of straight-line pairwise comparisons of Dataset 2[21] per 20 km raster cell.
Fig. 4. Flow chart of the data collection process that generated the CaliPopGen databases.
Description of the population genetic data in Dataset 1[21].
| Column ID | Description |
|---|---|
| CitationID | Unique ID assigned to each source article |
| EntryID | Unique ID assigned to each unique entry in the entire CaliPopGen database |
| CitationFull | Citation information |
| Kingdom | Kingdom classification for the species |
| Phylum | Phylum classification for the species |
| TaxonGroup | Broadly categorized taxonomic group |
| ScientificName | Currently accepted Latin binomial (GBIF) |
| SubspeciesName | Currently accepted subspecies epithet (GBIF) |
| CommonName | Currently reported English common name (GBIF) |
| MarkerType | General category of genetic marker |
| GeneTarget | Specific genes or markers used |
| NumMarkers | Number of markers used in the study |
| SampleSize | Number of samples used to calculate genetic parameters. Value may be a non-integer if a mean number of samples across a set of loci was reported. |
| YearStart | First year of sample collection |
| YearEnd | Last year of sample collection |
| PopName | Population or locality name |
| LatitudeDD | Latitude in decimal degrees |
| LongitudeDD | Longitude in decimal degrees |
| CoordError | Estimated radius of error in kilometers for coordinates georeferenced by us |
| AllelicRichness | Allelic richness |
| HetExp | Expected heterozygosity |
| HetObs | Observed heterozygosity |
| NucDiversity | Nucleotide diversity, pi |
| EffectivePopSize | Effective population size |
| AllelesPerLocus | Alleles per locus |
| PercentPolyLoci | Percent polymorphic loci |
| HaploDiv | Haplotype diversity |
| InbreedingCoefType | Type of inbreeding coefficient reported |
| InbreedingCoefValue | Value of inbreeding coefficient |
| SpeciesID | Unique ID assigned to this entry |
| HabitatType | Marine, Freshwater, Diadromous, Terrestrial, Amphibious. See text for descriptions |
| Columns 32–70 | Animal life history data (see Table |
| Columns 71–101 | Plant life history data (see Table |
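As a minimal sketch of how the columns described above might be used, the following Python snippet averages `HetExp` per `TaxonGroup`. The sample rows are invented for illustration only. The snippet also assumes that the data are available as CSV and that missing values appear as empty fields, neither of which is confirmed by the column descriptions.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows that reuse Dataset 1 column names.
# All values are invented for illustration; they are not taken from CaliPopGen.
SAMPLE = """EntryID,TaxonGroup,ScientificName,MarkerType,SampleSize,HetExp
1,Insecta,Species a,microsatellites,20,0.60
2,Insecta,Species a,microsatellites,18.5,0.70
3,Aves,Species b,SNPs,12,
4,Aves,Species b,SNPs,30,0.25
"""


def mean_het_exp_by_group(handle):
    """Return {TaxonGroup: mean HetExp}, skipping rows without a HetExp value."""
    values = defaultdict(list)
    for row in csv.DictReader(handle):
        raw = (row.get("HetExp") or "").strip()
        if raw:  # assumption: a missing value is stored as an empty field
            values[row["TaxonGroup"]].append(float(raw))
    return {group: sum(v) / len(v) for group, v in values.items()}


means = mean_het_exp_by_group(io.StringIO(SAMPLE))
for group, mean in sorted(means.items()):
    print(f"{group}: {mean:.3f}")
```

Note that `SampleSize` is kept as text here because, as stated in the table, it may be a non-integer when a mean across loci was reported.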
Description of the pairwise genetic distance data in Dataset 2[21].
| Column Name | Description |
|---|---|
| CitationID | Unique ID assigned to each source article. |
| EntryID | Unique ID assigned to each unique entry in the entire CaliPopGen database |
| CitationFull | Reference information |
| Kingdom | Kingdom classification for the species |
| Phylum | Phylum classification for the species |
| TaxonGroup | Broadly categorized taxonomic group |
| Pop1ScientificName | Currently accepted Latin binomial (GBIF) |
| Pop1SubspeciesName | Currently accepted subspecies epithet (GBIF) |
| Pop1CommonName | Currently reported English common name (GBIF) |
| Pop1Name | Population or locality name of first site in pairwise comparison |
| Pop1LatitudeDD | Latitude in decimal degrees of first site |
| Pop1LongitudeDD | Longitude in decimal degrees of first site |
| Pop2ScientificName | Currently accepted Latin binomial (GBIF) |
| Pop2SubspeciesName | Currently accepted subspecies epithet (GBIF) |
| Pop2CommonName | Currently reported English common name (GBIF) |
| Pop2Name | Population or locality name of second site in pairwise comparison |
| Pop2LatitudeDD | Latitude in decimal degrees of second site |
| Pop2LongitudeDD | Longitude in decimal degrees of second site |
| CoordError | Estimated radius of error in kilometers for coordinates georeferenced by us |
| GenDist | Genetic distance score ( |
| GenDistMetric | Type of pairwise genetic parameter reported ( |
| GenDistMetricMethod | Name/citation of specific method used to calculate GenDistMetric (if provided) |
| MarkerType | General category of genetic marker |
| GeneTarget | Specific genes or markers used |
| NumMarkers | Number of markers used |
| SepAnalyses | When multiple analyses were conducted, the level by which data were split is noted here (e.g. species or sampling year) |
| SpecialComparionsType | Identifies pairwise comparisons across timescales (“temporal”), at different temporal intervals (“spatio-temporal replicate”), of samples collected before 1920 (“historic”), between species (“interspecific”) or hybrid populations (“hybrid”) |
| Pop1ComparisonCharacteristic | Characteristic of special comparison |
| Pop2ComparisonCharacteristic | Characteristic of special comparison |
| Pop1YearStart | First year of sample collection |
| Pop1YearEnd | Last year of sample collection |
| Pop2YearStart | First year of sample collection |
| Pop2YearEnd | Last year of sample collection |
| SpeciesID | Unique ID assigned to this entry |
| HabitatType | Marine, Freshwater, Diadromous, Terrestrial, Amphibious. See text for descriptions |
| Columns 36–74 | Animal life history data (see Table |
| Columns 75–101 | Plant life history data (see Table |
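Because each pairwise record carries coordinates for both sites, a geographic distance can be derived alongside `GenDist`. The sketch below uses the haversine great-circle formula, which is our choice for illustration and not necessarily how the authors computed straight-line distances. The helper names and the example coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, a common approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def pair_distance_km(row):
    """Distance between the two sites of one Dataset 2 style record (a dict)."""
    return great_circle_km(
        float(row["Pop1LatitudeDD"]), float(row["Pop1LongitudeDD"]),
        float(row["Pop2LatitudeDD"]), float(row["Pop2LongitudeDD"]),
    )


# Hypothetical record; coordinates are illustrative, not from the database.
example = {
    "Pop1LatitudeDD": "34.05", "Pop1LongitudeDD": "-118.25",
    "Pop2LatitudeDD": "37.77", "Pop2LongitudeDD": "-122.42",
}
print(f"{pair_distance_km(example):.1f} km")
```

When interpreting such distances, the `CoordError` column gives the estimated radius of error for coordinates that were georeferenced after publication.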
Description of the animal life-history data in Dataset 3[21].
| Column Name | Description | Total entries |
|---|---|---|
| SpeciesID | Unique ID assigned to this entry | 432 |
| TaxonGroup | Broadly categorized taxonomic group | 432 |
| ScientificName | Currently accepted Latin species binomial (GBIF) | 432 |
| SubspeciesName | Currently accepted subspecies epithet (GBIF) | 88 |
| CommonName | Currently reported English common name (GBIF) | 372 |
| HabitatType | Marine, Freshwater, Diadromous, Terrestrial, Amphibious. See text for descriptions | 429 |
| LifespanMin | Minimum value for reported lifespan range | 90 |
| LifespanMax | Maximum value for reported lifespan range | 131 |
| LifespanOther | Value of lifespan if not reported as a range | 147 |
| LifespanOtherType | Value type of “LifespanOther” (average, minimum or maximum) | 147 |
| Fecundity | The number of offspring or eggs per reproductive event | 216 |
| LifetimeReprodOutput | Total lifetime reproductive output | 24 |
| AgeSexMatMin | The minimum age for an individual to reach sexual maturity, in years | 92 |
| AgeSexMatMax | The maximum age for an individual to reach sexual maturity, in years | 79 |
| AgeSexMatOther | Single values for age of sexual maturity in years if not reported as a range | 121 |
| AgeSexMatOtherType | Value type of “AgeSexMatOther” (average, minimum or maximum) | 121 |
| NumBreedingEvents | Number of breeding events per year | 146 |
| ReprodMode | Mode of reproduction (asexual, sexual, both) | 312 |
| BodyLength | Adult body length reported in centimeters (cm) | 333 |
| BodyLengthType | Adult body length measurement type: SL (standard length) or PCL (precaudal standard length), FL (fork length), TL (total length), WS (wingspan), SCL (straight-line carapace), SVL (snout-to-vent length) | 254 |
| BodyLengthSex | The sex of the individuals for which adult body length was reported | 248 |
| AdultMass | Adult mass, reported in kilograms (kg) | 178 |
| AdultMassSex | The sex of the individuals for which adult mass was reported | 124 |
| CANativeStatus | Native/non-native: whether the species is known to be native to California | 329 |
| CESAStatus | California Endangered Species Act listing status, if any | 39 |
| SSCStatus | California Species of Special Concern listing status, if any | 49 |
| ESAStatus | Federal Endangered Species Act (ESA) listing status, if any | 52 |
| TaxonDataLevel | The taxonomic level at which collected data was obtained, if not for the species or subspecies in question | 16 |
| SpeciesSynonyms | List of species synonyms used to acquire information (GBIF) | 15 |
| Columns 30–45 | Reference sources for trait data | |
Description of the plant life-history data in Dataset 4[21].
| Column Name | Description | Total entries |
|---|---|---|
| SpeciesID | Unique ID assigned to this entry | 177 |
| TaxonGroup | Broadly categorized taxonomic group | 177 |
| ScientificName | Currently accepted Latin binomial (GBIF) | 177 |
| SubspeciesName | Currently accepted subspecies epithet (GBIF) | 34 |
| CommonName | Currently reported English common name (GBIF) | 144 |
| HabitatType | Marine, Freshwater, Terrestrial. See text for descriptions | 116 |
| Lifespan | Reported only for perennial species. Maximum lifespan value reported or highest value of reported lifespan range (years) | 61 |
| LifeCycle | Annual, Biennial, Perennial, Perennial-Evergreen, Perennial-Deciduous. See text for descriptions | 152 |
| AdultHeight | Maximum height value reported or highest value of reported height range in meters (m) | 145 |
| SelfCompatibility | Indicates whether species is self-compatible | 98 |
| MonoeciousDioecious | Monoecious: individuals bear both male and female flowers; Dioecious: individuals bear either male or female flowers, but not both | 78 |
| Asexual | Indicates whether the primary mode of reproduction is asexual | 21 |
| PollinationMode | Primary pollination mode: wind, animal, water | 120 |
| SeedDispMode | Seed dispersal mode: wind, animal, gravity, water, human | 94 |
| MassPerSeed | Fecundity as measured by mass per seed in milligrams (mg) | 83 |
| CANativeStatus | Native/non-native: whether the species is known to be native to the state of California | 164 |
| CAEndemicStatus | Endemic, near-endemic or distributed only in California & Baja California | 62 |
| InvasiveRating | California Invasive Plant Council rating of invasiveness (non-native species only) | 27 |
| CESAStatus | California Endangered Species Act listing status, if any | 18 |
| CNDDBStatus | Heritage rank as defined by the California Natural Diversity Database. See Table | 139 |
| ESAStatus | Federal Endangered Species Act (ESA) listing status, if any | 19 |
| TaxonDataLevel | The taxonomic level at which collected data was obtained, if not for the species or subspecies in question | 63 |
| SpeciesSynonyms | List of species synonyms used to acquire information (GBIF) | 21 |
| Columns 24–37 | Reference sources for trait data | |
Description of the Conservation status (Heritage Rank) from California Natural Diversity Database[29].
| Global/State rank | Description |
|---|---|
| GX/SX | Presumed extirpated |
| GH/SH | Possibly extirpated; known only from historical occurrences but there is still some hope of rediscovery. |
| G1/S1 | Critically imperiled; at very high risk of extirpation in the jurisdiction due to very restricted range, very few populations or occurrences, very steep declines, severe threats, or other factors. |
| G2/S2 | Imperiled; at high risk of extirpation in the jurisdiction due to restricted range, few populations or occurrences, steep declines, severe threats, or other factors. |
| G3/S3 | Vulnerable; at moderate risk of extirpation in the jurisdiction due to a fairly restricted range, relatively few populations or occurrences, recent and widespread declines, threats, or other factors. |
| G4/S4 | Apparently secure; at a fairly low risk of extirpation in the jurisdiction due to an extensive range and/or many populations or occurrences, but with possible cause for some concern as a result of local recent declines, threats, or other factors. |
| G5/S5 | Secure; at very low or no risk of extirpation in the jurisdiction due to a very extensive range, abundant populations or occurrences, with little to no concern from declines or threats. |
The Global rank (G rank) is a reflection of the overall status of a species throughout its global range. The State rank (S rank) is assigned much the same way as the Global rank, but State ranks refer to the imperilment status only within California’s state boundaries.
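The rank codes in the table above can be decoded with a small lookup, sketched here in Python. The function name is hypothetical, and the sketch handles only the simple two-character codes listed in the table. Rank strings in the actual data may carry additional qualifiers or combined forms, which this version deliberately rejects instead of guessing.

```python
# Short labels taken from the first clause of each table description.
RANK_LEVELS = {
    "X": "Presumed extirpated",
    "H": "Possibly extirpated",
    "1": "Critically imperiled",
    "2": "Imperiled",
    "3": "Vulnerable",
    "4": "Apparently secure",
    "5": "Secure",
}
SCOPES = {"G": "Global", "S": "State"}


def describe_rank(code):
    """Return (scope, label) for a simple rank code such as 'G3' or 'S1'."""
    cleaned = code.strip().upper()
    if (len(cleaned) != 2
            or cleaned[0] not in SCOPES
            or cleaned[1] not in RANK_LEVELS):
        raise ValueError(f"unsupported rank code: {code!r}")
    return SCOPES[cleaned[0]], RANK_LEVELS[cleaned[1]]


for code in ("G3", "S1", "SH"):
    scope, label = describe_rank(code)
    print(f"{code}: {scope} rank, {label}")
```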
Summary of total numbers of populations and species per California ecoregion, separately for population genetic and pairwise datasets.
| Ecoregion | type | area (km²) | N species PopGen | N populations PopGen | N species Pairwise | N populations Pairwise |
|---|---|---|---|---|---|---|
| Oregon, Washington, Vancouver Coast and Shelf | marine | — | 23 | 34 | 11 | 69 |
| Northern California | marine | — | 93 | 247 | 43 | 273 |
| Southern California Bight | marine | — | 79 | 248 | 28 | 223 |
| Central California Coast | terrestrial | 13,726 | 161 | 401 | 74 | 905 |
| Central Valley Coast Ranges | terrestrial | 24,852 | 37 | 75 | 17 | 211 |
| Colorado Desert | terrestrial | 11,852 | 18 | 46 | 15 | 165 |
| Great Valley | terrestrial | 49,176 | 67 | 348 | 43 | 567 |
| Klamath Mountains | terrestrial | 22,568 | 37 | 96 | 16 | 261 |
| Modoc Plateau | terrestrial | 14,309 | 19 | 31 | 6 | 40 |
| Mojave Desert | terrestrial | 66,832 | 23 | 63 | 12 | 164 |
| Mono | terrestrial | 7,984 | 15 | 45 | 8 | 129 |
| Northern California Coast | terrestrial | 17,135 | 83 | 419 | 51 | 758 |
| Northern California Coast Ranges | terrestrial | 15,524 | 41 | 121 | 22 | 390 |
| Northern California Interior Coast Ranges | terrestrial | 7,494 | 15 | 16 | 10 | 145 |
| North-western Basin and Range | terrestrial | 5,224 | 7 | 9 | 0 | 0 |
| Sierra Nevada | terrestrial | 51,593 | 67 | 358 | 23 | 511 |
| Sierra Nevada Foothills | terrestrial | 18,191 | 28 | 87 | 17 | 336 |
| Sonoran Desert | terrestrial | 12,878 | 4 | 6 | 3 | 22 |
| South-eastern Great Basin | terrestrial | 11,038 | 7 | 15 | 2 | 42 |
| Southern California Coast | terrestrial | 14,473 | 177 | 645 | 77 | 920 |
| Southern California Mountains and Valleys | terrestrial | 27,551 | 78 | 340 | 43 | 619 |
| Southern Cascades | terrestrial | 17,025 | 28 | 81 | 13 | 212 |
The first three are marine, followed by the 19 USDA-defined ecoregions.
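To make the uneven coverage across ecoregions concrete, the population counts can be normalized by area. The following Python sketch does this for four terrestrial rows copied from the table above (area in km² and N populations PopGen). The per-1,000 km² scaling is our choice for readability, and marine ecoregions are left out because the table reports no area for them.

```python
# (area_km2, n_populations_popgen) copied from four rows of the table above.
ECOREGIONS = {
    "Southern California Coast": (14473, 645),
    "Central California Coast": (13726, 401),
    "Mojave Desert": (66832, 63),
    "Sonoran Desert": (12878, 6),
}


def populations_per_1000_km2(area_km2, n_populations):
    """Sampled populations per 1,000 square kilometers."""
    return 1000.0 * n_populations / area_km2


densities = {
    name: populations_per_1000_km2(area, n)
    for name, (area, n) in ECOREGIONS.items()
}

# Print from most to least densely sampled.
for name, density in sorted(densities.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {density:.2f} populations per 1,000 km²")
```

In this subset, the two coastal ecoregions are sampled far more densely than the two desert ecoregions, in line with the pattern noted for Fig. 3B.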
| Field | Value |
|---|---|
| Measurement(s) | genetic variation |
| Technology Type(s) | DNA sequencing |
| Factor Type(s) | Kingdom • Phylum • TaxonGroup • MarkerType • SampleSize • GeneTarget • NumMarkers • YearStart • YearEnd • PopName • LongitudeDD • LatitudeDD • CoordError • HabitatType • Lifespan • Fecundity • LifetimeReprodOutput • AgeSexMat • NumBreedingEvents • ReprodMode • BodyLength • AdultMass • CANativeStatus • CESAStatus • SSCStatus • ESAStatus • LifeCycle • AdultHeight • SelfCompatibility • MonoeciousDioecious • Asexual • PollinationMode • SeedDispMode • MassPerSeed • CAEndemicStatus |
| Sample Characteristic - Organism | eukaryota |
| Sample Characteristic - Location | California |