Literature DB >> 31217580

Location-level processes drive the establishment of alien bird populations worldwide.

David W Redding¹, Alex L Pigot¹, Ellie E Dyer¹, Çağan H Şekercioğlu^2,3, Salit Kark⁴, Tim M Blackburn^5,6.

Abstract

Human-mediated translocation of species to areas beyond their natural distribution (which results in 'alien' populations1) is a key signature of the Anthropocene2, and is a primary global driver of biodiversity loss and environmental change3. Stemming the tide of invasions requires understanding why some species fail to establish alien populations, and others succeed. To achieve this, we need to integrate the effects of features of the introduction site, the species introduced and the specific introduction event. Determining which, if any, location-level factors affect the success of establishment has proven difficult, owing to the multiple spatial, temporal and phylogenetic axes along which environmental variation may influence population survival. Here we apply Bayesian hierarchical regression analysis to a global spatially and temporally explicit database of introduction events of alien birds4 to show that environmental conditions at the introduction location, notably climatic suitability and the presence of other groups of alien species, are the primary determinants of successful establishment. Species-level traits and the size of the founding population (propagule pressure) exert secondary, but important, effects on success. Thus, current trajectories of anthropogenic environmental change will most probably facilitate future incursions by alien species, but predicting future invasions will require the integration of multiple location-, species- and event-level characteristics.

Entities: Chemical

Mesh：

Year: 2019 PMID： 31217580 PMCID： PMC6611725 DOI： 10.1038/s41586-019-1292-2

Source DB: PubMed Journal: Nature ISSN： 0028-0836 Impact factor: 49.962

Globally, alien species are accumulating at ever-increasing rates5, mainly driven by growing trade and transport connectivity6. Once an alien species is established (i.e. self-sustaining) in a new location, the economic and environmental costs of eradicating it or controlling its spread are often prohibitive3. Understanding the processes that facilitate or inhibit the initial establishment of alien species is therefore a critical step in limiting the future threat of biological invasions. Most early attempts to predict alien species establishment focussed on the characteristics of the introduced species or the introduction location7, but with limited success8, and did not consider the key role of idiosyncratic “event-level” factors, notably propagule pressure9. Some species-level traits (life history10, behavioural11 and ecological12) have subsequently been shown to explain variation in alien establishment success. However, determining which, if any, location-level factors affect success generally at a global level and across large taxonomic groups has proved challenging, for several reasons. First, many different biotic (e.g. recipient assemblage composition13) and abiotic (e.g. climate14, disturbance15) factors may be important. Second, these factors vary across both space and time, and drive differences in susceptibility at a range of levels of biological organisation – population (e.g. stochastic weather events), species (e.g. climatic affinity), community (e.g. native species richness), and landscape (e.g. habitat composition). Third, how a new environment interacts with a species is dependent on the evolutionary and adaptive history of the species introduced16: a harsh environment for a house sparrow (Passer domesticus) may or may not be harsh for the closely related Eurasian tree sparrow (P. montanus), and vice versa. Fourth, alien introductions happen in synergy with other major anthropogenic environmental changes such as increasing human population density, agricultural land conversion, and the presence of other alien species17. Yet, despite this apparent complexity, many previous analyses have treated location-level variables in a relatively simplistic way, considering either only coarse features of locations (e.g. latitude18, island versus continent19) or gross differences between native and alien environments20, and typically ignore spatial autocorrelation21. Therefore, we still await an integrated analysis of variation in alien establishment. Here, we undertake a global analysis to identify both the absolute and relative contributions of location, species, and event-level processes in predicting alien establishment. Using birds as a model system, we interrogate data on the success or failure of 4,346 individual introduction events spanning 708 species and, crucially, include information on propagule pressure, the key event-level driver of establishment9. To assess the specific influence of location, we consider a wide array of abiotic, biotic and anthropogenic factors. These account for both the mean and temporal variability in the abiotic environment, the suitability of the environment in terms of its similarity to conditions experienced by a species in its native range (‘environmental match’), metrics of human disturbance, and the characteristics of recipient biological communities, including both their diversity and their phylogenetic similarity to each introduced species. Finally, we incorporate aspects of species’ life history, behaviour and ecology that have previously been hypothesised to explain establishment success in alien birds. Features of introduction events are not random with regard to the identity, relatedness and characteristics of the species introduced16, their spatial location of origin and introduction4, nor to propagule pressure22, and so we undertake this analysis using Bayesian hierarchical regressions, inferred using Integrated Nested Laplace Approximation (INLA)23. This method provides efficient and accurate parameter estimation for complex inferences incorporating both random and fixed effects, allowing us to control for spatial and temporal non-independence in the abiotic and biotic features of locations, and for taxonomic non-independence in species traits. At a global scale, combinations of location-, species- and event-level variables are selected as important terms across all fitted models, including the best fitting model of avian establishment success (Fig. 1, Extended Data Table 1, n = 1530, wAIC = 892.96, AUC = 0.75 – for definitions see methods). This result was robust to the precise way in which introduction events were defined (Extended Data Fig. 1) and highlights that alien establishment cannot be adequately explained by characteristics of the environment, the species, or the specific introduction event in isolation. The most strongly supported individual determinant of establishment is the environment of the recipient location (Fig. 2a). Within this category, anthropogenic features, followed by climatic suitability, have the greatest influence on establishment success (Fig. 2b).

Fig. 1

Posterior distributions for fixed effects parameter estimates for the best fitting model of alien bird establishment success.

Boxplots summarise the posterior marginal distributions for all fixed-effects parameters (β) from a Bayesian regression of the most conservative data subset (n = 1530 introductions). Box widths show the interquartile range, the mean is represented as a bold vertical line within each box, and whiskers the 2.5th and 97.5th percentiles (i.e. the 95% credibility interval) of the distribution. Colours indicate the fixed effect category, and bold y-axis labels indicate that there is evidence for a non-zero slope for the described data variable. Further details are in Extended Data Table 1.

Extended Data Table 1

All covariates in the best fitting model, from a Bayesian regression of global avian establishment success (n = 1530 introductions).

Each variable (column: Variable name) is assigned a hierarchical category (Category level 1-3), the mean for the posterior distribution for β estimates (Mean β), the standard deviation for the mean value (SD β), and the 25th, 50th and 75th quantiles (25% quant, 50% quant, 75% quant) of the posterior distribution of each β estimate.

		Category level
Variable name	1	2	3	Mean β	SD β	25% quant	50% quant	75% quant
Intercept				0.234	0.112	0.013	0.234	0.454
Propagule Pressure	Event	Propagule pressure	Propagule pressure	0.123	0.032	0.061	0.123	0.185
Max Wind Anomaly	Location	Abiotic	Max wind anomaly	0.04	0.027	-0.013	0.04	0.092
Max Wind Anomaly^2	Location	Abiotic	Max wind anomaly	-0.013	0.006	-0.025	-0.013	-0.001
Mean Temperature	Location	Abiotic	Mean temperature	0.06	0.055	-0.048	0.059	0.169
Mean Temperature^2	Location	Abiotic	Mean temperature	-0.012	0.013	-0.037	-0.012	0.012
Crop Coverage	Location	Anthropogenic	Crop coverage	-0.022	0.011	-0.043	-0.022	-0.001
Human Popn	Location	Anthropogenic	Human population size	-0.002	0.029	-0.059	-0.003	0.055
Introduced Species Groups	Location	Anthropogenic	Introduced species	0.269	0.06	0.152	0.269	0.387
Intro. Group Success	Location	Anthropogenic	Introduced species	-0.297	0.053	-0.401	-0.297	-0.193
Intro. Group Success^2	Location	Anthropogenic	Introduced species	0.065	0.012	0.04	0.065	0.089
Ten Yr Crop Change	Location	Anthropogenic	Ten Year Crop change	-0.083	0.04	-0.161	-0.083	-0.005
Confamilial Rich.	Location	Biotic	Confamilial richness	0.077	0.027	0.024	0.077	0.129
Confamilial Rich.^2	Location	Biotic	Confamilial richness	-0.016	0.006	-0.027	-0.016	-0.005
Species Rich.	Location	Biotic	Native bird richness	-0.014	0.014	-0.041	-0.014	0.013
Different to Native	Location	Match	Environmental niche difference	-0.105	0.026	-0.156	-0.105	-0.054
Foraging Generalism Cat.	Species	Ecology	Foraging generalism	-0.011	0.026	-0.063	-0.011	0.04
Foraging Generalism	Species	Ecology	Foraging generalism	-0.041	0.04	-0.118	-0.041	0.037
Foraging Generalism^2	Species	Ecology	Foraging generalism	0.037	0.017	0.004	0.037	0.07
Habitat Generalism	Species	Ecology	Habitat generalism	-0.061	0.026	-0.113	-0.061	-0.01
Habitat Generalism^2	Species	Ecology	Habitat generalism	0.011	0.004	0.002	0.011	0.02
Brood size	Species	Life History	Brood size	0.129	0.036	0.059	0.128	0.201
Maximum Age	Species	Life History	Max lifespan	0.056	0.021	0.015	0.056	0.097
Maximum Age^2	Species	Life History	Max lifespan	-0.009	0.004	-0.018	-0.009	-0.001

Extended Data Figure 1

Sensitivity analysis of slope (β) estimates for the linear terms of a subset of variables across all versions of the input data.

Dot size is the size of the beta value with colour representing direction with red positive and blue negative. Each row label represents the name of the fixed effect. The column headings represent the data subset used: ‘all-records’ = all data (n = 4346), ‘intro-and-unk.’ = all data but one record per species-location event (n = 3762), ‘intro-only’ = detailed introductions only (n = 1784), ‘intro-last-only’ = detailed introductions but one record per species-location event (n = 1530). The number at end of each column heading indicates relative size of buffer used to impute establishment status (see methods).

Fig. 2

Relative effect size of different categories of predictors in the best fitting model of alien bird establishment success.

Each wedge represents the sum of the change in wAIC for the fixed effects in each category when added to a Bayesian regression of establishment success versus failure (n = 1530 introductions). The left-hand panel (a) presents variables classified into location-, species- and event-level categories, while the right-hand panel (b) presents the sub-categories within those broad levels (n = 1530 introduction events).

A strong anthropogenic determinant of establishment success is the number of alien taxonomic groups already established at a location at the time of introduction. The positive effect of the number of alien groups introduced is broadly consistent with the invasion meltdown hypothesis17, whereby ecological disruptions caused by, or enabling, earlier invasions facilitate further successful introductions. This result is not simply indexing anthropogenic environmental disturbance; crop coverage and human population density, while included in the best fitting model, did not have a strong and consistent global signal for alien establishment success (Fig. 1, Extended Data Fig. 2). This may be due to historical patterns of introductions being mainly restricted to already disturbed areas12. In fact, our analysis shows that less disturbed areas have higher establishment success rates, with rapid agricultural land-conversion not only causing native species declines2, but also negatively impacting alien species, at least in the early stages of the invasion process.

Extended Data Figure 2

Approximate shape of fixed effects over the range of observed values.

Each panel represents the prediction using β slope estimates from the lowest wAIC model over the known range of values for that given fixed effect (identified by strip title) from the raw data. Only fixed effects whose values were unlikely to include zero are included. All panels from a single Bayesian regression of global avian establishment success (n = 1530 introductions).

Previous evidence has suggested that species are more likely to establish when they are pre-adapted to local climatic conditions16 and our analysis confirms this hypothesis. We found that alien establishment success is highest in locations where environmental conditions are more similar to those in the species’ native range (‘environmental match’, Fig. 1-2), albeit with the proviso that average conditions across the range are relatively crude measures of climatic preferences. Our analysis also suggests a hump-shaped effect of mean annual temperature on establishment (Fig. 1). This relationship implies a “Goldilocks effect”, such that locations with intermediate conditions are more amenable to establishment than those that are too hot or too cold, regardless of the conditions naturally experienced by each species. Environmental extremes are also important24, with establishment success reduced by the occurrence of historical storm events in the period immediately following introduction. Anecdotal evidence had previously suggested extreme weather as a cause of specific establishment failures (e.g., the house crow (Corvus splendens) on Mauritius25), and our spatiotemporal analysis identifies this as a general effect in the global record of avian introductions. The extent to which communities differ in their biotic resistance to introduced species has remained controversial, with studies variously reporting positive, negative or no effects of local species richness on patterns of establishment26. Overall, we found that the biotic environment had a relatively weak effect on establishment compared to the other location-, species- and event-level factors. Nevertheless, accounting for these other factors revealed a potential negative effect of native bird species richness on alien establishment success, with this switching to a hump-shaped relationship (Fig. 1) when considering only the most closely related and presumably ecologically similar species. These results help clarify previous contradictory findings, by showing that while overall native biodiversity may inhibit invasions, (i) this effect is relatively weak compared to other extrinsic and intrinsic factors, and (ii) it may be partially masked by the tendency for locations with many closely related species to be more environmentally suitable, and thus be more susceptible to establishment (i.e. biotic acceptance hypothesis17). In addition to environmental factors, features of the species’ life history and ecology are strongly supported as determinants of establishment success. In particular, larger brood sizes promote establishment, while lifespan showed a hump-shaped relationship with invasion success (Fig. 1, Extended Data Fig. 2), confirming previous evidence of a trade-off between the benefits of fast and slow life histories10. While species with fast life histories can gain a quick ‘foothold’ at a new location through rapid population growth, slower life histories give resilience against demographic and environmental variation, allowing alien populations to be better able to ride out extreme conditions27,28. In our model, there is also evidence that foraging specialism and habitat-use generalism may, taken together, increase establishment success. Life history variables are generally strongly phylogenetically conserved (e.g. brood size, λ = 0.96, Fig. 3), implying that related species could have similar rates of establishment success. However, globally, establishment success has a much weaker phylogenetic signal (λ = 0.4; Fig. 3), due to phylogenetically conserved traits being overwhelmed by the combined spatial effects of the local environment and propagule pressure, all of which tend to exhibit little phylogenetic signal. The inherently idiosyncratic nature of these effects with regard to the identity of the species introduced (Spearman ρ between predictions based on life history and the final model is 0.64) explains why it has proven difficult to identify consistent life history predictors of establishment in isolation29.

Fig. 3

Phylogenetic patterns of invasion probability across alien birds.

Shows 358 species with the highest quality information on introduction events. Blue-green-yellow outer bars show the mean establishment potential of a species across all 1-degree grid cells beyond its native range, with longer and yellower bars indicating that a species has greater potential to establish outside its native range. Phylogenetic branches are coloured according to brood size, with lighter colours indicating higher brood sizes, and darker colours lower brood sizes. Silhouettes (from http://phylopic.org/) show the approximate location of avian families.

Lastly, we confirm the strong general role of propagule pressure which, in line with previous work on alien birds30, is best represented by an asymptotic log-term (Fig. 1, Extended Data Fig. 2): small founding populations are likely to fail due to stochastic and Allee effects, while the success of larger populations30 depends instead on the species- and location-level effects we identify here. Our analysis highlights the key role of the presence of other alien species groups, suggesting that locations that are already hotspots for introductions are especially susceptible to accumulating alien species, but also show that alien species are more likely to establish when they are pre-adapted to local climatic conditions. Growth in global trade means that an ever-growing number of species are being introduced to novel locations4,31, and the environmental matches of ever more species are being tested against new environments. These trajectories will facilitate future incursions by alien species, exhibiting features of an invasion meltdown32, which, as we show, could be further exacerbated depending on precise combinations of species and sites where the introductions are occurring. Our analyses confirm the urgent need for enhanced management programs to prevent or mitigate the negative impact of these invasions.

Methods

Alien introduction events

We collated all the records from the GAVIA database of global bird introductions4,31 that contained geo-referenced introduction events at known specific (e.g. sub-national and below) point locations (i.e. the locations where the species was recorded as having escaped or been released) and at specific dates, excluding those records from GAVIA that related to spread after an introduction event (Supplement Information Data 1). This process initially resulted in 5834 records, with accompanying spatial polygon data created by drawing around the smallest geographical unit to which the introduction event could be reasonably attributed4. While some records were at a specific location a (e.g. a single building address, park or harbour), a small minority of event records could only be assigned at a coarser spatial scale (e.g. city or county). We note there is some geographical bias in the records with most of the introductions occurring in the Australasian region (19% of records), followed by the Paleartic (18%), Oceanic (17%) and the Nearartic (16%), with fewest records in the Afrotropics (12%), Neotropics (10%), IndoMalay (8%), and Antarctica (<1%). However, introduction events occur predominately within regions and high-profile historical routes between continents (e.g. Europe to Australia) are relatively rare given the huge increase in introductions in recent decades5 through, for example, accidental transport and wildlife trade31 (Extended Data Figure 3).

Extended Data Figure 3

Chord diagram showing directions of origin and introduction location of avian introduction events between regions of the world.

The chords near the edge represent introductions to a region, chords away from the edge show origins of introduction. Width of chord is the relative number of introduction events (n = 4346).

Using data from both the original, and external sources, we thoroughly checked all introduction records for potential errors and removed any records that were possibly dubious – usually due to misreported dates or locations across multiple references. This resulted in n = 4346 unique introduction event records (Supplementary Information Data 1). We then used text information, again from the original source, to categorise the introduction events as either known specific introduction events (“introductions”, n = 1784) first sightings (“sightings”, n = 584), or as having no clear designation (“unknown”, n = 1978). Finally, we noted if any records were part of a chronological sequence of introduction events involving a single species at a single location e.g. Eurasian skylark (Alauda arvensis) imported and released on six separate occasions in the Barrabool Hills, VIC, Australia from 1852-1880. From these we created four data subsets of decreasing size but increasing specificity: “All Records” contained all the records in the database (n = 4346), “Intro. & Unk.” contained records that are known introductions and records that have no detailed description (n = 3762), and “Introductions” contained all records that are specified as detailed introduction events (n = 1784) and “Last Introduction” (n= 1530). This final data subset contained known introduction events, but with events that were part of a chronological sequence of introduction events collapsed into a single record, summarised using the date of the last introduction events and the cumulative propagule pressure across events (n = 1530). Based on prior hypotheses of avian establishment success reported in the literature (Supplementary Information Data 2), we collated information on a wide-range of covariates that could reasonably impact upon establishment success. We categorised these covariates into the three categories of establishment determinants, as defined by Duncan et al.16 – location-, species- and event-level factors –further distinguishing between different types of species- and location-level factors as below:

Event-level Factors - Propagule pressure

We extracted from the original reference source, where available, a numerical estimate of propagule pressure (founding population size), measuring both the number of introduction events per record (propagule number9) and the number of individual birds that were introduced at each event (propagule size9).9). For a minority (n = 67 or ~0.01%) of records that only had descriptive text regarding the number of individuals introduced, we translated any common terms according to the following rules. When describing individuals released: “few”=3, “several”=5, “some”=10, “small numbers”=10, “many”=25, “flock”=25, “large numbers”=100, “shipment”=200, “mass”=250, “great numbers”=250. When describing propagule number: “repeated”=5, “several”=5, “releases”=2, “numerous”=10, “many”=10, “frequent”=10. We decided on these numbers by summarising, where available, records that contained both these descriptive qualifiers and a numerical figure for number introduced. To calculate propagule pressure, that is the relative size of the introduction effort9, we used the recorded number of individuals introduced. When this data type was missing we added in the median propagule size (5 individuals introduced), or, if the number of discrete introduction events were available, we used the median propagule size multiplied by propagule number.

Species-level Factors - Life History Traits

For each species, we assembled data from published sources on a number of life-history traits previously linked to establishment success in birds, including mean clutch size, number of clutches per year, age at first breeding (months) and maximum lifespan (years)10,33,34. We additionally included data on mean adult body mass (g)35. Species for which data could not be collected (clutch size [11%], number of clutches per year [48%], age a first breeding [66%], lifespan [52%]) were assigned the mean value of the lowest inclusive taxonomic rank (i.e. genus, family, order) for which data were available. This approach is justified because most of variance in avian traits, as calculated from our data, occurs at taxonomic levels above the genus (clutch size [91%], number of clutches per year [70%], age at first breeding [83%], lifespan [62%]). We also include a previously used measure called brood value, which is expressed as log10(1/[(broods per year) × (reproductive lifespan)])10 and represents investment in future over current reproduction.

Species-level Factors - Behavioural Traits

For each species, we assembled data on relative brain size, quantified as the residuals from a least squares regression of brain size on body size (both log-transformed)36. Relative brain size provides a metric of behavioural flexibility that has been shown to relate to establishment success in birds37. Species with missing data (72%) were assigned the mean value of the lowest inclusive taxonomic rank (i.e. genus, family, order) for which data were available. As for life history traits, most of the variance in brain size (93%) occurs at taxonomic levels above the genus.

Species-level Factors - Ecological Traits

Data on species-specific diets and foraging strategies came from Kissling et al.38: for both of these variables a total value of 100 was divided between categories to represent the percentage of time a species spends feeding on a particular food type or foraging in any particular location. Habitat use data for each species were extracted from the IUCN Red List database40. For each ecological variable (diet, foraging and habitat), we calculated two measures of generalism, using the total number of different categories used and Simpson’s diversity measure D39.

Location-level Factors - Abiotic environment

Global geophysical data (altitude above sea level and slope %) were downloaded in pre-processed geoTiff format at 1km grid scale41. A third variable, ‘altitude variance’, was computed with the R function aggregate (raster package42) using the variance of the altitude values of 3x3 grid cells, such that all 9 cells had the same final value. Bio-climatic data in the form of global averages from 1960-2000 were restricted to large terrestrial land masses, and were downloaded as four ‘ESRI’ format ascii data grids, at 30 arc-seconds (~1 km) resolution. They consisted of mean annual temperature, annual variation in temperature (‘temperature seasonality’43), mean annual rainfall and annual variation in temperature (‘precipitation seasonality’43). Abiotic data for islands came from an island-specific dataset44 and had the same variables as for large terrestrial land masses, but represented as a spatially-referenced spreadsheet containing data on climate and physical characteristics of the majority of the world’s islands. For islands that were not represented in the gridded bioclimatic data, we identified missing values for the above bioclimatic and altitude data, and by matching the island name in the GAVIA data with the island name in the bioclimatic dataset, we were able to extract the mean annual temperature, annual variation in temperature, mean annual rainfall, annual variation in temperature and altitude. For islands, we also included distance to continent (giving non-island records a value of zero) represented by the “dist” column from that data set. We also employed a measure of remoteness, again giving continents a value of zero, using the “SLMP” column of the data set. Historical climate data (1850-2007) were downloaded as 6x4km netCDF grids for six main variables: sea surface temperature (SST), air temperature (A), U-wind (Uwind), V-wind (Vwind), sea-level pressure (SLP) and cloudiness (CLDC) from the HADCRUT3 data set45. Historical spatio-temporal land-cover data (1700-2007) were downloaded as global ‘ESRI’ format ascii data grids at ~5 km resolution, consisting of proportion cover of primary and secondary habitats, from the Harmonised Land Use Dataset46. To reduce collinearity in a regression (i.e. the dummy variable trap) the ‘other’ category data was not included in the analysis, such that the addition of all the land-use categories in each grid cell did not sum to one. These environmental data were extracted at each record location by calculating the mean grid cell values intersecting the introduction event polygon (R function extract42). For very specific introduction events, this would be the single cell the event polygon was located in, but for less specific introduction event data there was sometimes more than one grid cell that overlapped the polygon. For the spatio-temporal data, extractions for each record were only undertaken on the temporally nearest data layers. For example, for the historical climate data, a maximum anomaly value at that location over the ten years (120 months) post-introduction was used, with records outside the dataset time period (<5% of the total records) being designated missing using “NA”. For the land-cover data, which was a yearly rather than monthly dataset, records were matched by introduction year to the specific global land-cover layer so that the contemporary (at the time of introduction) percentage cover for five land-use types (Primary, Secondary, Cropland, Pasture, Urban) could be calculated. Records earlier than the starting year of the land-cover dataset (<1% of total records) were designated missing in each land-use covariate using “NA”. We note that there is uneven sampling here, with most historical introductions occurring in human modified landscapes: for instance, in forested areas only 8% of the introduction polygon is designated as ‘Primary’ at the time of introduction. This means for some specific land-types we may not be able to resolve their specific impacts on invasion success. For each land-cover type noted above, an additional variable was constructed which was the gradient of change in land-cover in each grid cell over the 10 years prior to each introduction event. This was calculated by using a linear regression (R function lm) of land-cover proportion explained by year, and taking the slope (β) as the value of change per cell.

Location-level Factors - Environmental Match

Range maps for species’ native distributions were downloaded from Birdlife International and NatureServe (www.birdlife.org) and extracted onto an equal area grid (~110 x 110 km) in a Behrmann projection. These maps show species’ extent of occurrence, and so are relatively crude depictions of the area occupied by the species, but are nevertheless commonly used for analyses of this type. We quantified species’ environmental preferences using the mean and standard deviation of climate conditions across grid cells in their native distributions on the basis of four climatic indices from the WorldClim dataset (BIO1, mean annual temperature; BIO4, temperature seasonality; BIO12, annual precipitation; BIO15, precipitation seasonality)43. Using each of these input variables, to capture as a single variable the environmental match between the introduction site and those being experienced in the species’ native range, for each introduction event we calculated the distance, in measurement space, between the Euclidean distance from mean values taken from the grid cells at the introduction site (sources defined in ‘abiotic’ above) to the mean values from the native range of the introduced species. For each climatic axis, we divided the distance by the standard deviation of native climatic values, as some species have very large ranges with a corresponding wide range of acceptable native climatic values. We note that this measure is a relatively coarse way to measure native preferences, as fine scale habitat variation within the range may act to bias the mean value, but that finer scale data are not available for all introduced species.

Location-level Factors - Biotic Environment

To test whether interactions with native resident species may influence establishment success, for each combination of introduced alien species and grid cell, we calculated four metrics of community diversity and structure: i) the richness of all native resident species, ii) the number of native species in the same genus or, iii) family, and iv) the nearest taxon index, representing the phylogenetic branch length (millions of years) separating an introduced alien species from its closest relative in the recipient community. Recipient communities were designated as those species whose ranges overlapped with any of the introduction polygon, though we note that not all species in this sample would be interacting if they utilised very different habitats. Phylogenetic distances were calculated as the mean across 100 phylogenies sampled at random from the posterior distribution of trees from the Jetz et al. phylogeny with the Hackett backbone47. These variables thus quantify the overall species richness of the location of introduction (i), and the location’s richness relative to the phylogenetic position of the introduced species (ii – iv).

Location-level Factors - Anthropogenic Environment

To determine the role of human disturbance and urbanisation12 in facilitating the establishment of invasive species we captured the spatial variation and prior ten-year change in human population density, urban land, crop and pastoral land coverage, from the Harmonised Land Use Dataset46 using the same methods as for abiotic environment variables. We also tested whether establishment probability is related to the prior presence of other alien species. We determined if any alien species were present at a given site prior to each introduction event using published data5 on first records for a number of groups (algae, amphibians, arachnids, arthropods, bacteria and protozoans, birds, bryophytes, crustaceans, fishes, fungi, insects, invertebrates, mammals, molluscs, reptiles, plants, viruses), recorded at the level of country or major island group. For larger countries this value would be less accurate, but most of our data are for islands (66% of introduction records). Furthermore, these data represent the best spatio-temporal knowledge currently available and further work using site-level data would be able to examine these relationships in more detail. For each introduction event, we create a binary variable with a value of 1 for each group if any species from that group was present at least one year prior to the bird introduction event and 0 if no species from that group were present.

Statistical Modelling Outline

We modelled the establishment success or failure of bird introductions using a Bayesian hierarchical regression inferred using Integrated Nested Laplace Approximation (INLA) as implemented in the R package R-INLA48 (Supplementary Information Code 1). We used this method as it provides accurate parameter (e.g. β) estimates for complex regressions incorporating both spatial and non-spatial random and fixed effects with very low computational overheads23. We evaluated the model fit for covariate choice via the Watanabe–Akaike Information Criterion (wAIC)49 and the Conditional Predictive Ordinate (CPO). wAIC is a criterion for model comparison and is an extension of the Akaike Information Criterion (AIC), but is widely-applicable to Bayesian inference techniques and offers clearer interpretation than other options50. Similar to AIC, wAIC provides a method to penalise the ability of the model to fit the observed data by the number of parameters used to create the underlying model. This value is more suitable for a Bayesian framework as it integrates across the whole posterior distribution rather than relying on summary statistics, e.g. mean of posterior distribution. Similarly, the CPO approximates the ‘gold-standard’51 Leave-One-Out Cross Validation, and calculates the posterior probability of a model inferred without each data point. The sum of the log CPO scores, therefore, represents an estimator for the log marginal likelihood of the model. Given that wAIC tends to ∑ln(CPO) under ideal circumstances51 (in our case Spearman correlation ~0.99), we henceforth report only wAIC as a proxy for CPO. We model number of establishment successes across the dataset as a binomial random variable (success = 1, failure = 0) and use the normal approximation to the binomial, as expected under central limit theorem given our large numbers of trials. This was due to the computational efficiency of utilising the Gaussian distribution, allowing us to repeat the modelling procedure many times with no loss of predictive accuracy (mean hold-out cross-validated Area Under receiver operating Curve statistic (AUC) for Gaussian = 0.68±0.05, versus AUC = 0.67±0.06 for the binomial mean). To convert the ‘StatusCat’ column from the GAVIA dataset to the response variable, we recoded the categories of “established” and “breeding” to 1 and the known failure categories (“died out”, “failed”) to 0. For a large proportion of records, the success or failure of an avian introduction was unknown (n = 2234). In these cases, we used the introduction event polygon associated with each record to search for the alien species in sightings from eBird52 and other sources from within the Global Biodiversity Information Facility (GBIF)53. We downloaded all occurrences from GBIF, within 0.5 degrees of both latitude and longitude of the centre point of each introduction event. Then, using the associated ‘observation date’ with each GBIF record and the ‘mapping date’ associated with each GAVIA record, we counted the total number of individuals seen within the 0.5-degree buffer in the last ten years (2007-2017). We changed unknown values to 1 (succeeded) if there were more than 15 records of the introduced species within 0.5 degree (~ 95km at the equator) of the introduction site in the last ten years, and 0 (failed) if fewer. To ensure our results were robust to these thresholds, we changed the record density threshold used to identify whether a species is “established” by increasing the buffer size used to capture GBIF observations to 1 and then 1.5 degrees of both latitude and longitude around the introduction point. To account for known spatial autocorrelation in the input data31 we implemented a stochastic partial differential equations model (SPDE) with the hierarchical regression that builds a latent error surface, of user-defined complexity, to account for similarities in more closely located data points48. We inferred the regression models using an SPDE term and mesh with varying characteristics to find the range, standard deviation of the range, cut-off and maximum edge values using the wAIC score to determine the best version of the SPDE model. To account for the random effect of phylogenetic non-independence, we included “iid” random effects of species’ taxonomic family and order. To account for temporal differences in recording accuracy and in methods of introduction31 at the same location over time, we included a random-walk auto-correlated random effect48 for the year of introduction. To remove effects of very large or small values, for each covariate we capped low values at the 1% quantile and high values as the 99% quantile. Finally, for each model we assessed all recommended diagnostics to ensure the model was robustly fitted, including plotting and visualising the distribution and probability density of the out-of-sample CPO per data point score (using 10-fold, 10% hold-out, cross validations) (Extended Data Figure 4a,b) and spatially mapping the same values (Extended Data Figure 4c) to check for parts of data that were poorly predicted by the model. Then, using the mean of the posterior distribution of the linear predictor, we employed an AUC approach (Area Under receiver operating Curve score) to calculate the predictive accuracy of each model. This process works by measuring the numbers of correctly and incorrectly labelled predictions across all possible classification threshold values of the binomial response variable. An AUC value equal to or less than 0.5 indicates a predictive ability that is equal to the random expectation, while an AUC value close to 1 indicates a perfect predictive ability.

Extended Data Figure 4

Model diagnostics from the best fitting model.

Panel (a) shows a plot of out-of-sample Conditional Probability Ordinate (CPO) scores for all data points in rank order used in the model; panel (b) the probability density of the CPO scores; (c) map of CPO scores. CPO is the probability of generating each data point in the data set from a posterior fitted without this data point, with each panel allowing visualisation of where in the data the model not be fitting well. All plots from a single Bayesian regression of global avian establishment success (n = 1530 introductions).

Analysis protocol

We first used the most conservative dataset for analysis – that which contained known introductions events (n = 1530 records) (Supplementary Information Data 3). Then, to ensure our random effect terms were valid in their inclusion, we first fitted a model with just an intercept, then just an intercept and spatial term, and then added in other random effects one at a time (Extend Data Table 1). For each additional step of complexity we recorded the change (Δ) in the wAIC value49 and only included random effects that increased the fit of the model by more than 2 wAIC units. We used uninformative priors in all cases except for the spatial term where we set the priors to a set of reasonable estimates of the range and standard deviation of the range to understand how this specification affected parameter estimates. To increase the biological interpretability of our models (and due to the large number of covariates and high collinearity between them), we then added all explanatory variables (Supplementary Information Data 2) into a regression model in a step-wise manner and, after each step, assessed model fit using the wAIC value. In order that the effect sizes of the different covariates could be better compared, each explanatory variable was standardised to a mean of zero and standard deviation of one before it was added. At each model choice step we used the standard threshold of ΔwAIC>2 to select better models54. When offering steps either forward or backwards, we allowed the choice of either a linear representation of the covariate, the natural logarithm of the term, or a linear and squared term to allow for the situations where a curvilinear relationship fitted better than a linear slope. To examine whether the model selection process was robust to decisions relating to the database, we ran several additional versions of the stepwise regressions to see if the key variables identified in the main analysis above were still recovered. We first ran a stepwise regression using all the introduction records and then repeated the process with all the other data subsets. We also ran different versions of the stepwise selection with the different buffer sizes for the GBIF missing data interpolation to test the sensitivity of the imputation process. We then used the lowest wAIC model to predict establishment success over a 1-degree grid of points covering all land areas (n = 19561 cells) for each introduced alien species (n = 358 species). We only used already established alien species because new species are being added to the current pool of aliens at a relatively low rate and this current pool will likely make up the vast majority of future invasions5. When predicting, we set the random effects introduction year = 2015 to be as close to the present day as possible and propagule pressure = 150, the lowest value after the threshold beyond which the number introduced has limited impact (Extend Data Fig. 2). All predicted values where the confidence was low, such that the 95% confidence intervals for the grid cell estimate covered 0 and 1, we designated as NA. Using the prediction layers for each species we created two datasets: First, by using a 10% trimmed mean of the probability of success for every species value for each grid cell we were able to determine which areas of the world had the highest ‘establishment potential’ and, therefore, were at risk of this set of introduced species establishing there. Second, by using 10% trimmed means for all values in each of the 384 layers, we were able to create an index of establishment potential per species, which we then mapped on to a recent phylogeny47. We calculated the phylogenetic signal of these values using Pagel’s λ (R function phylosig) and used a permutation test to test the probability that these values deviated from 0, indicating a significant relationship between phylogenetic relatedness and species trait values.

Sensitivity analysis of slope (β) estimates for the linear terms of a subset of variables across all versions of the input data.

Approximate shape of fixed effects over the range of observed values.

Chord diagram showing directions of origin and introduction location of avian introduction events between regions of the world.

The chords near the edge represent introductions to a region, chords away from the edge show origins of introduction. Width of chord is the relative number of introduction events (n = 4346).

Model diagnostics from the best fitting model.

All covariates in the best fitting model, from a Bayesian regression of global avian establishment success (n = 1530 introductions).

Supplementary Material

Supplementary Information is available in the online version of the paper.

21 in total

Review 1. Predicting the geography of species' invasions via ecological niche modeling.

Authors: A Townsend Peterson
Journal: Q Rev Biol Date: 2003-12 Impact factor: 4.875

2. Natural population die-offs: causes and consequences for terrestrial mammals.

Authors: Eric I Ameca Y Juárez; Georgina M Mace; Guy Cowlishaw; Nathalie Pettorelli
Journal: Trends Ecol Evol Date: 2011-12-22 Impact factor: 17.712

3. Big brains, enhanced cognition, and response of birds to novel environments.

Authors: Daniel Sol; Richard P Duncan; Tim M Blackburn; Phillip Cassey; Louis Lefebvre
Journal: Proc Natl Acad Sci U S A Date: 2005-03-22 Impact factor: 11.205

4. Global patterns of introduction effort and establishment success in birds.

Authors: Phillip Cassey; Tim M Blackburn; Daniel Sol; Richard P Duncan; Julie L Lockwood
Journal: Proc Biol Sci Date: 2004-12-07 Impact factor: 5.349

5. Unraveling the life history of successful invaders.

Authors: Daniel Sol; Joan Maspons; Miquel Vall-Llosera; Ignasi Bartomeus; Gabriel E García-Peña; Josep Piñol; Robert P Freckleton
Journal: Science Date: 2012-08-03 Impact factor: 47.728

Review 6. The role of global trade and transport network topology in the human-mediated dispersal of alien species.

Authors: Natalie Clare Banks; Dean Ronald Paini; Kirsty Louise Bayliss; Michael Hodda
Journal: Ecol Lett Date: 2014-12-21 Impact factor: 9.492

7. Exploring or avoiding novel food resources? The novelty conflict in an invasive bird.

Authors: Daniel Sol; Andrea S Griffin; Ignasi Bartomeus; Hayley Boyce
Journal: PLoS One Date: 2011-05-18 Impact factor: 3.240

8. Environmental variation and the evolution of large brains in birds.

Authors: Ferran Sayol; Joan Maspons; Oriol Lapiedra; Andrew N Iwaniuk; Tamás Székely; Daniel Sol
Journal: Nat Commun Date: 2016-12-22 Impact factor: 14.919

9. The Global Distribution and Drivers of Alien Bird Species Richness.

Authors: Ellie E Dyer; Phillip Cassey; David W Redding; Ben Collen; Victoria Franks; Kevin J Gaston; Kate E Jones; Salit Kark; C David L Orme; Tim M Blackburn
Journal: PLoS Biol Date: 2017-01-12 Impact factor: 8.029

10. Human Ageing Genomic Resources: integrated databases and tools for the biology and genetics of ageing.

Authors: Robi Tacutu; Thomas Craig; Arie Budovsky; Daniel Wuttke; Gilad Lehmann; Dmitri Taranukha; Joana Costa; Vadim E Fraifeld; João Pedro de Magalhães
Journal: Nucleic Acids Res Date: 2012-11-27 Impact factor: 16.971

12 in total

1. Invasiveness is linked to greater commercial success in the global pet trade.

Authors: Jérôme M W Gippet; Cleo Bertelsmeier
Journal: Proc Natl Acad Sci U S A Date: 2021-04-06 Impact factor: 11.205

2. Environmental resistance predicts the spread of alien species.

Authors: Rebecca S L Lovell; Tim M Blackburn; Ellie E Dyer; Alex L Pigot
Journal: Nat Ecol Evol Date: 2021-01-25 Impact factor: 15.460

3. Propagule pressure does not consistently predict the outcomes of exotic bird introductions.

Authors: Michael P Moulton; Wendell P Cropper
Journal: PeerJ Date: 2019-09-11 Impact factor: 2.984

4. Scientists' warning on invasive alien species.

Authors: Petr Pyšek; Philip E Hulme; Dan Simberloff; Sven Bacher; Tim M Blackburn; James T Carlton; Wayne Dawson; Franz Essl; Llewellyn C Foxcroft; Piero Genovesi; Jonathan M Jeschke; Ingolf Kühn; Andrew M Liebhold; Nicholas E Mandrak; Laura A Meyerson; Aníbal Pauchard; Jan Pergl; Helen E Roy; Hanno Seebens; Mark van Kleunen; Montserrat Vilà; Michael J Wingfield; David M Richardson
Journal: Biol Rev Camb Philos Soc Date: 2020-06-25