Lindell Bromham, Xia Hua, Russell Dinnage, Hedvig Skirgård, Andrew Ritchie, Marcel Cardillo, Felicity Meakins, Simon Greenhill.
Abstract
Language diversity is under threat. While each language is subject to specific social, demographic and political pressures, there may also be common threatening processes. We use an analysis of 6,511 spoken languages with 51 predictor variables spanning aspects of population, documentation, legal recognition, education policy, socioeconomic indicators and environmental features to show that, counter to common perception, contact with other languages per se is not a driver of language loss. However, greater road density, which may encourage population movement, is associated with increased endangerment. Higher average years of schooling is also associated with greater endangerment, evidence that formal education can contribute to loss of language diversity. Without intervention, language loss could triple within 40 years, with at least one language lost per month. To avoid the loss of over 1,500 languages by the end of the century, urgent investment is needed in language documentation, bilingual education programmes and other community-based programmes.
Year: 2021 PMID: 34916621 PMCID: PMC8825282 DOI: 10.1038/s41559-021-01604-y
Source DB: PubMed Journal: Nat Ecol Evol ISSN: 2397-334X Impact factor: 15.460
List of variables analysed in this study (see also Supplementary Fig. 3), with the names given to the variables in the raw data available in Supplementary Data 1
| Variable | Variable name | Level | Tr. | Sources |
|---|---|---|---|---|
| Response variable | | | | |
| Endangerment level | EGIDS | Language | | |
| Independent variable | | | | |
| 0. Intercepts | | | | |
| Region | Region | Language | | |
| Predictors | | | | |
| 1. Language | | | | |
| L1 speaker population size | L1 pop | Language | L | WLMS e17, e16 |
| Area | Area | Language | L | WLMS e17, e16 |
| Island | Island | Language | | See Supplementary Methods |
| Official status | Official status | Language | | See Supplementary Methods |
| Level of language documentation | Documentation | Language | | Glottolog V4.2.1 |
| 2. Diversity | | | | |
| L1 speakers as proportion of number of people in the neighbourhood | L1 pop prop | Language | L | WLMS and Glottolog |
| Number of languages in contact | Bordering language richness | Language | L | WLMS e17, e16 |
| Number of languages in contact per km perimeter | Bordering language richness per km | Language | L | WLMS e17, e16 |
| Evenness of languages in contact | Bordering languages evenness | Language | SR | WLMS e17, e16 |
| Number of languages | Language richness | Neighbourhood | L | WLMS e17, e16 |
| Language evenness | Language evenness | Neighbourhood | SR | WLMS e17, e16 |
| Number of endangered languages | Endangered languages | Neighbourhood | L | Glottolog V4.2.1 |
| Proportion of languages that are endangered | Endangered prop languages | Neighbourhood | SR | Glottolog V4.2.1 |
| 3. Education | | | | See Supplementary Tables |
| Recognized language of education | Language of education | Language | | |
| Average years of schooling | Years of schooling | National | SR | Barro–Lee Educational Attainment database |
| Policy affirming minority language education | Minority education | National | | L’aménagement linguistique dans le monde |
| Education spending as % of GDP | Education spending | National | SR | World Bank 2019 |
| 4. Socioeconomic | | | | See Supplementary Table |
| Gross Domestic Product per capita | GDPpc | National | L | World Bank 2019 |
| GINI | GINI | National | S | Standardized World Income Inequality Database (SWIID) |
| Life Expectancy at age 60 | Life expectancy 60 | National | L | World Bank 2019 |
| 5. Land use | | | | |
| Population density | Pop density | Polygon | L | Gridded Population of the World (GPW) v4 |
| Cropland | Cropland | Polygon | SR | Venter et al. |
| Built environment | Built | Polygon | L | Venter et al. |
| Pasture | Pasture | Polygon | SR | Venter et al. |
| Human footprint | Human footprint | Polygon | SR | Venter et al. |
| 6. Environment | | | | |
| Mean growing season | Growing season | Polygon | | Global Agro-ecological Zones (GAEZ v3.0) |
| Mean annual temperature | Temperature | Polygon | C | Worldclim v2 |
| Temperature seasonality | Temperature seasonality | Polygon | L | Worldclim v2 |
| Precipitation seasonality | Rainfall seasonality | Polygon | SR | Worldclim v2 |
| 7. Biodiversity loss | | | | |
| Threatened species | Threatened species | Polygon | L | IUCN |
| Proportion of species that are threatened | Threatened prop species | Polygon | L | IUCN |
| 8. Connectivity | | | | |
| Road distance score | Roads | Neighbourhood | SR | Venter et al. |
| Navigable waterways distance score | Waterways | Neighbourhood | SR | Venter et al. |
| Landscape roughness | Roughness | Neighbourhood | S | SRTM30 elevation dataset |
| Altitudinal range | Altitude range | Neighbourhood | L | Worldclim v2 |
| 9. Shift | | | | |
| Increase in urbanization | Urban change | National | SSR | World Bank 2019 |
| Rate of change in population density | Pop density change | Polygon | SSR | GPW v4 |
| Change in human footprint (score per year) | Footprint change | Polygon | SSR | Venter et al. |
| Change in croplands (proportion of area per year) | Cropland change | Polygon | SSR | Venter et al. |
| Change in pasture (proportion of area per year) | Pasture change | Polygon | SSR | Venter et al. |
| Change in built environment (proportion of area per year) | Built change | Polygon | SSR | Venter et al. |
| 10. World language as official language | | | | Supplementary Tables |
| Any | World language | National | | |
| Arabic | Arabic | National | | |
| Malay (including Indonesian) | Malay | National | | |
| English | English | National | | |
| French | French | National | | |
| Hindustani (Hindi+Urdu) | Hindustani | National | | |
| Mandarin | Mandarin | National | | |
| Portuguese | Portuguese | National | | |
| Russian | Russian | National | | |
| Spanish | Spanish | National | | |
Level describes the unit of estimation: information available for each language (‘language’), an average over gridded data within the language polygon(s) (‘polygon’), an average over all gridded data for a 10,000 km² circle centred on the language polygon (‘neighbourhood’), or information available at the national level, taken as a weighted average for the territories or nation states overlapped by each language polygon (‘national’). Endangerment level is based on the EGIDS (Expanded Graded Intergenerational Disruption Scale) score from Glottolog V4.2.1[8], analysed as an ordered 7-level scale (see Supplementary Table 1). Languages were assigned to regions as described in the Supplementary Methods (section 2.1.3). Language polygons are derived from the World Language Mapping System (WLMS) as described in Supplementary Methods (section 2). Details of all variables are given in Supplementary Methods. The column ‘Tr.’ lists the transformation applied to each variable following the procedure described in Supplementary Information (section 4.1): log (L), square (S), square root (SR), signed square root (SSR) or cube (C).
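The transformation codes in the ‘Tr.’ column can be read as element-wise functions. The sketch below is illustrative only and is not the authors' code. It assumes that log means the natural logarithm of strictly positive values (the source does not state the base or any offset for zeros) and that the signed square root is sign(x)·√|x|. The names `TRANSFORMS`, `signed_sqrt` and `apply_transform` are hypothetical.

```python
import numpy as np

def signed_sqrt(x):
    # Assumed definition: keep the sign, take the root of the magnitude.
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.sqrt(np.abs(x))

# Hypothetical lookup from the table's 'Tr.' codes to functions.
TRANSFORMS = {
    "L": np.log,                     # assumption: natural log, inputs > 0
    "S": np.square,
    "SR": np.sqrt,
    "SSR": signed_sqrt,
    "C": lambda x: np.power(x, 3),
}

def apply_transform(values, code):
    """Apply the transformation named by `code`; an empty code means none."""
    x = np.asarray(values, dtype=float)
    if not code:
        return x
    return TRANSFORMS[code](x)
```

For example, `apply_transform(pop_density_change, "SSR")` would compress large positive and negative rates of change symmetrically, which is one reason a signed root suits the ‘Shift’ variables that can take either sign.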
Extended Data Fig. 1: Residual in the best model for language endangerment level.
Residuals of the model predicting the number of endangered languages (a) and Sleeping languages (b). For each hex grid, the residual is the predicted number of languages with distribution in the hex grid that have a predicted endangerment level above ‘Stable’ (corresponding to EGIDS 6b–10) (a) or are predicted to be no longer spoken (i.e. EGIDS 9–10) (b), minus the number of languages with distribution in the hex grid that have a reported EGIDS of 6b–10 (a) or 9–10 (b). The predicted number of languages in each category is calculated by using the best model to estimate the probability distribution of endangerment level for each language with distribution in the hex grid, sampling the endangerment level of each language from that distribution, repeating the sampling 1,000 times, and averaging the number of languages sampled as endangered or Sleeping over the 1,000 repeats. A negative value (blue) indicates that the model estimates fewer endangered or Sleeping languages than the reported EGIDS scores from Ethnologue (e17/e16). A positive value (red) indicates that the model estimates a greater number of endangered or Sleeping languages than observed. In some cases, this could indicate higher ‘latent risk’ for languages that have many of the predictors of high endangerment but are currently rated as stable or at a lower level of endangerment. Dark grey areas do not have data for all the independent variables in the best model for language endangerment level. Language distribution data from WLMS 16 (worldgeodatasets.com).
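The sampling procedure described in this caption can be sketched as a small Monte Carlo routine. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-language probability vectors would come from the fitted model (here they are simply an input), and the function names `expected_count` and `residual`, the seed and the array layout are all hypothetical.

```python
import numpy as np

def expected_count(probs, is_target, n_draws=1000, seed=0):
    """Average number of languages sampled into the target levels.

    probs     : (n_languages, n_levels) array, each row sums to 1
    is_target : boolean mask over levels (e.g. levels counted as endangered)
    """
    rng = np.random.default_rng(seed)
    probs = np.asarray(probs, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    n_levels = probs.shape[1]
    counts = np.empty(n_draws)
    for d in range(n_draws):
        # Draw one endangerment level per language from its own distribution.
        levels = np.array([rng.choice(n_levels, p=p) for p in probs])
        counts[d] = is_target[levels].sum()
    return counts.mean()

def residual(probs, is_target, observed_levels, **kwargs):
    """Predicted minus observed count for one hex grid."""
    mask = np.asarray(is_target, dtype=bool)
    observed = mask[np.asarray(observed_levels)].sum()
    return expected_count(probs, mask, **kwargs) - observed
```

With this sign convention a positive residual means the model places more languages in the target category than are currently reported there, matching the red shading described in the caption.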
Extended Data Fig. 2: Current and future predicted distribution of endangered languages.
The current patterns of language endangerment are plotted as the absolute number of languages with a reported EGIDS score of 6b–10 with distribution in each hex grid. a, The number of languages with observed EGIDS from 6b to 10 at present. b, The predicted number of languages with EGIDS from 6b to 10 in 40 years minus the predicted number of languages with EGIDS from 6b to 10 at present. c, The predicted number of languages with EGIDS from 6b to 10 in 80 years minus the predicted number of languages with EGIDS from 6b to 10 in 40 years. The predicted number of languages is calculated in the same way as Supplementary Fig. 7. Dark grey areas have no data for independent variables in the best model for language endangerment level. Language distribution data from WLMS 16 (worldgeodatasets.com).
Fig. 1: Current patterns of language endangerment expressed as the proportion of languages overlapping each hex grid that are currently rated threatened or above (EGIDS 6b–10; see Supplementary Information Table 1).
Each hexagon represents approximately 415,000 km². The coloured bars show the predictors of level of endangerment identified in the best model for a global language database of 6,511 languages, and for each of 12 regions any additional influences on patterns of language endangerment (see Supplementary Data 3). Dark grey areas on the map do not have data for all the independent variables in the best model for language endangerment level. Language distribution data are from WLMS 16 (worldgeodatasets.com).
Fig. 2: Model predictions for areas where languages are likely to become endangered (EGIDS ≥ 6b) in the next 40 years, given the best model.
a,b, The red shading represents the differences between the predicted values at present and the predicted values in 40 years, for the absolute number (a) and proportion of languages (b) per hex grid, based on generational shift and demographic transition in L1 speakers. c, Proportion of languages predicted to become Sleeping (EGIDS ≥ 9) in the next 40 years. See Supplementary Table 1 for information on endangerment scales. Language distribution data from WLMS 16 (worldgeodatasets.com).
Fig. 3: Estimated future loss of linguistic diversity.
a, Current and predicted proportion of languages that are endangered (EGIDS 6b–8b) or Sleeping (no living L1 speakers, EGIDS 9–10). b,c, Current and predicted number of endangered (6b–8b) (b) and Sleeping (9–10) (c) languages according to the current level of language documentation. Each violin gives the probability distribution of the number or proportion of languages that are predicted to be endangered or Sleeping, with the dot showing the mean and the whisker showing the standard deviation. Each dashed line shows the number or proportion of languages that are currently endangered or Sleeping. This figure projects current levels of documentation for each language, hence does not reflect future documentation efforts of threatened languages.
Extended Data Fig. 4: Current and future predicted number of languages no longer spoken.
a, The number of languages with observed EGIDS from 9 to 10 at present. b, The predicted number of languages with EGIDS from 9 to 10 in 40 years minus the predicted number of languages with EGIDS from 9 to 10 at present. c, The predicted number of languages with EGIDS from 9 to 10 in 80 years minus the predicted number of languages with EGIDS from 9 to 10 in 40 years. The predicted number of languages is calculated in the same way as Fig. 7. Dark grey areas have no data for independent variables in the best model for language endangerment level. Language distribution data from WLMS 16 (worldgeodatasets.com).
Extended Data Fig. 5: Current and future predicted proportion of languages no longer spoken.
The proportion of Sleeping languages with distribution in each hex grid. a, The proportion of languages with observed EGIDS from 9 to 10 at present. b, The predicted proportion of languages with EGIDS from 9 to 10 in 40 years minus the predicted proportion of languages with EGIDS from 9 to 10 at present. c, The predicted proportion of languages with EGIDS from 9 to 10 in 80 years minus the predicted proportion of languages with EGIDS from 9 to 10 in 40 years. The predicted proportion of languages is calculated as the predicted number of languages divided by the total number of languages with distribution in each hex grid, where the predicted number of languages is calculated in the same way as Fig. 7. Dark grey areas have no data for independent variables in the best model for language endangerment level. Language distribution data from WLMS 16 (worldgeodatasets.com).
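The quantity mapped in panels b and c is a difference of two proportions that share the same denominator. A minimal sketch of that arithmetic follows; the function name `proportion_change` and its argument names are hypothetical, and the predicted counts are assumed to be supplied by the model.

```python
def proportion_change(pred_earlier, pred_later, total_languages):
    """Change in predicted proportion of languages in a hex grid.

    pred_earlier, pred_later : predicted numbers of languages in the category
                               at the earlier and later time points
    total_languages          : all languages with distribution in the hex grid
    """
    if total_languages <= 0:
        raise ValueError("hex grid must contain at least one language")
    return pred_later / total_languages - pred_earlier / total_languages
```

For a hex grid with 10 languages, a predicted count rising from 2 to 5 corresponds to a change in proportion of about 0.3.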