Demographic shifts, inter-group contact, and environmental conditions drive language extinction and diversification


Humans collectively use thousands of languages today. The number of languages in a given region (i.e. language ‘richness’) varies widely. Understanding the processes of diversification and homogenization that produce these patterns has been a fundamental aim of linguistics and anthropology. Empirical research to date has identified various social, environmental, geographic, and demographic factors associated with language richness³. However, our understanding of causal mechanisms, and of how their effects vary over space, has been limited because prior analyses focused on correlation and assumed stationarity. Here we use process-based, spatially explicit stochastic models to simulate the emergence, expansion, contraction, fragmentation, and extinction of language ranges. We vary combinations of parameter settings in these computer-simulated experiments to evaluate the extent to which different processes reproduce observed patterns of pre-colonial language richness in North America. We find that the majority of spatial variation in language richness can be explained by models in which environmental and social constraints determine population density, random shocks alter population sizes more frequently at higher population densities, and population shocks are more frequently negative than positive. Language diversification occurs when populations split after reaching size limits, and when ranges fragment, either because populations contract following negative shocks or because they come into contact with other groups that are expanding following positive shocks. These findings support diverse theoretical perspectives arguing that language richness is shaped by environmental and social conditions, constraints on group sizes, outcomes of contact among groups, and shifting demographics driven by positive innovations, such as new subsistence strategies, or negative events, such as war or disease.
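To make the mechanisms named above concrete, the following is a minimal, non-spatial toy sketch in Python. It is not the authors' model: it omits geography, range expansion, and inter-group contact entirely. It illustrates only three of the stated ingredients: shocks that occur more often at higher population density, shocks that are more often negative than positive, and groups that split once they exceed a size limit. Every name and numeric value here (`Group`, `step`, `simulate`, `max_size`, `min_size`, `p_negative`, `shock_scale`, and a density expressed as a probability between 0 and 1) is an assumption made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Group:
    """One language-speaking group, tracked only by its population size."""
    population: float


def step(groups, density, rng, max_size=1000.0, min_size=10.0,
         p_negative=0.6, shock_scale=0.3):
    """Advance every group by one time step and return the new list of groups.

    All parameter values are illustrative assumptions, not fitted estimates.
    `density` is treated as the per-step probability of a shock, so denser
    settings are shocked more often.
    """
    shock_prob = max(0.0, min(1.0, density))
    survivors = []
    for group in groups:
        pop = group.population
        if rng.random() < shock_prob:
            # Shocks are more often negative than positive (p_negative > 0.5).
            if rng.random() < p_negative:
                pop *= 1.0 - shock_scale
            else:
                pop *= 1.0 + shock_scale
        if pop < min_size:
            continue  # the group, and its language, goes extinct
        if pop > max_size:
            # Size limit reached: split into two groups (diversification).
            survivors.append(Group(pop / 2))
            survivors.append(Group(pop / 2))
        else:
            survivors.append(Group(pop))
    return survivors


def simulate(density, steps=200, n_groups=20, start_pop=500.0, seed=0):
    """Return the number of groups (a stand-in for language richness) at the end."""
    rng = random.Random(seed)
    groups = [Group(start_pop) for _ in range(n_groups)]
    for _ in range(steps):
        groups = step(groups, density, rng)
    return len(groups)
```

Calling `simulate` at several density values, with several seeds each, gives a rough feel for how shock frequency alone changes the final number of groups in this toy setting. Reproducing the spatial patterns described above would require the spatially explicit ranges and contact dynamics that this sketch leaves out.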