Preprints
Tentatively tracing Trans‐New Guinea: A phylogenetic evaluation of potential deeper relationships.
Greenhill SJ. In Press. Tentatively tracing Trans‐New Guinea: A phylogenetic evaluation of potential deeper relationships. In Evans N & Fedden S (Eds). The Oxford Guide to the Papuan Languages. Oxford University Press: Oxford.
The Trans‐New Guinea language family is one of the world’s largest language families. Strikingly it is also one of the world’s least studied. There is ongoing debate about which of many languages should be included in Trans‐New Guinea and how these relate to each other. Resolving this debate is hard due to the complexities of studying New Guinea languages, and a lack of adequate data suitable for detailed historical linguistic work. These difficulties have led to suggestions that the only way forward is to wait for low‐level descriptive field‐work and detailed bottom‐up historical …
10.31235/osf.io/628cv
Demographic shifts, inter-group contact, and environmental conditions drive language extinction and diversification.
Pacheco Coelho MT, Haynie HJ, Bowern C, Coelho RK, Greenhill SJ, Kirby KR, Rangel TF, Gavin MC. Preprint. Demographic shifts, inter-group contact, and environmental conditions drive language extinction and diversification.
Humans currently collectively use thousands of languages. The number of languages in a given region (i.e. language 'richness') varies widely. Understanding the processes of diversification and homogenization that produce these patterns has been a fundamental aim of linguistics and anthropology. Empirical research to date has identified various social, environmental, geographic, and demographic factors associated with language richness. However, our understanding of causal mechanisms and variation in their effects over space has been limited by prior analyses focusing on correlation and …
10.31235/osf.io/xqr2u
2024
Methods in Malayo-Polynesian comparative-historical linguistics.
Ross M, & Greenhill SJ. 2024. Methods in Malayo-Polynesian comparative-historical linguistics. In Adelaar A & Schapper A (Eds) The Oxford Guide to the Malayo-Polynesian Languages of Southeast Asia. Oxford: Oxford University Press.
This chapter considers methodological issues in the classification and subgrouping of Malayo-Polynesian (MP) languages. It compares applications of the traditional comparative method and newer Bayesian phylogenetics in Austronesian historical linguistics, arguing that they present complementary rather than competitive approaches. The comparison can be used to illuminate contentious points in the MP tree. The chapter discusses the application of methods to MP by alluding to the higher-order phylogeny of Austronesian on which most Austronesianist historical linguists agree.
10.1093/oso/9780198807353.003.0003
Bayesian phylogenetic analysis of Philippine languages supports a rapid migration of Malayo Polynesian languages.
King B, Greenhill SJ, Reid LA, Ross M, Walworth M, & Gray R. 2024. Bayesian phylogenetic analysis of Philippine languages supports a rapid migration of Malayo Polynesian languages. Scientific Reports, 14, 14967.
The Philippines are central to understanding the expansion of the Austronesian language family from its homeland in Taiwan. It remains unknown to what extent the distribution of Malayo-Polynesian languages has been shaped by back migrations and language leveling events following the initial Out-of-Taiwan expansion. Other aspects of language history, including the effect of language switching from non-Austronesian languages, also remain poorly understood. Here we apply Bayesian phylogenetic methods to a core-vocabulary dataset of Philippine languages. Our analysis strongly supports a sister …
10.1038/s41598-024-65810-x
The evolutionary dynamics of how languages signal who does what to whom.
Shcherbakova O, Blasi DE, Gast V, Skirgård H, Gray RD, & Greenhill SJ. 2024. The evolutionary dynamics of how languages signal who does what to whom. Scientific Reports, 14, 7259.
Languages vary in how they signal “who does what to whom”. Three main strategies to indicate the participant roles of “who” and “whom” are case, verbal indexing, and rigid word order. Languages that disambiguate these roles with case tend to have either verb-final or flexible word order. Most previous studies that found these patterns used limited language samples and overlooked the causal mechanisms that could jointly explain the association between all three features. Here we analyze grammatical data from a Grambank sample of 1705 languages with phylogenetic causal graph methods. Our results …
10.1038/s41598-024-51542-5
2023
Variation in phoneme inventories: quantifying the problem and improving comparability.
Anderson C, Tresoldi T, Greenhill SJ, Forkel R, Gray RD & List JML. 2023. Variation in phoneme inventories: quantifying the problem and improving comparability. Journal of Language Evolution, 11, lzad011.
For over a century, the phoneme has played a central role in linguistic research. In recent years, collections of phoneme inventories, originally designed for cross-linguistic purposes, have increasingly been used in comparative studies involving neighbouring disciplines. Despite the extended application of this type of data, there has been no research into its comparability or tests of its reliability. In this study, we carry out a systematic comparison of nine popular phoneme inventory collections. We render them comparable by linking them to standardised formats for the handling of …
10.1093/jole/lzad011
Societies of strangers do not speak grammatically simpler languages.
Shcherbakova O, Michaelis SM, Haynie HJ, Passmore S, Gast V, Gray RD, Greenhill SJ, Blasi DE, & Skirgård H. 2023. Societies of strangers do not speak grammatically simpler languages. Science Advances, 9 (33), eadf7704.
Many recent proposals claim that languages adapt to their environments. The linguistic niche hypothesis claims that languages with numerous native speakers and substantial proportions of nonnative speakers (societies of strangers) tend to lose grammatical distinctions. In contrast, languages in small, isolated communities should maintain or expand their grammatical markers. Here, we test these claims using a global dataset of grammatical structures, Grambank. We model the impact of the number of native speakers, the proportion of nonnative speakers, the number of linguistic neighbors, and the …
10.1126/sciadv.adf7704
A shared foundation of language change.
Greenhill SJ. 2023. A shared foundation of language change. Science, 6656, 374-375.
As the world changes, humans encounter new things that need to be described using a finite set of words. A common strategy for labeling these novelties is to reuse existing words—i.e., word meaning extension. For example, “mouse” can refer to a computer control device. Children also creatively overextend word meanings as they learn their languages. The need to name novelties has been present during the evolution of language, often resulting in the use of one word to express two different meanings. For example, Russian labels (colexifies) both “tree” and “wood” with “derevo” (1); this is a …
10.1126/science.adj2154
Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages.
Heggarty PH et al. 2023. Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages. Science, 381, abg0818.
Almost half the world’s population speaks a language of the Indo-European language family. It remains unclear, however, where this family’s common ancestral language (Proto-Indo-European) was initially spoken and when and why it spread through Eurasia. The “Steppe” hypothesis posits an expansion out of the Pontic-Caspian Steppe, no earlier than 6500 years before present (yr B.P.), and mostly with horse-based pastoralism from ~5000 yr B.P. An alternative “Anatolian” or “farming” hypothesis posits that Indo-European dispersed with agriculture out of parts of the Fertile Crescent, beginning as …
10.1126/science.abg0818
Subgrouping in a 'dialect continuum': A Bayesian phylogenetic analysis of the Mixtecan language family.
Auderset S, Greenhill SJ, DiCanio CT & Campbell EW. 2023. Subgrouping in a 'dialect continuum': A Bayesian phylogenetic analysis of the Mixtecan language family. Journal of Language Evolution.
Subgrouping language varieties within dialect continua poses challenges for the application of the comparative method of historical linguistics, and similar claims have been made for the use of Bayesian phylogenetic methods. In this article, we present the first Bayesian phylogenetic analysis of the Mixtecan language family of southern Mexico and show that the method produces valuable results and new insights with respect to subgrouping beyond what the comparative method and dialect geography have provided. Our findings reveal potential new subgroups that should be further investigated. We …
10.1093/jole/lzad004
Kinbank: A global database of kinship terminology.
Passmore S, Barth W, Greenhill SJ, Quinn K, Sheard C, Argyriou P, Birchall J, Bowern C, Calladine J, Deb A, Diederen A, Metsäranta NP, Araujo LH, Schembri R, Hickey-Hall J, Honkola T, Mitchell A, Poole L, Rácz PM, Roberts SG, Ross RM, Thomas-Colquhoun E, Evans N, Jordan FM. 2023. Kinbank: A global database of kinship terminology. PLoS One, 18(5), e0283218.
For a single species, human kinship organization is both remarkably diverse and strikingly organized. Kinship terminology is the structured vocabulary used to classify, refer to, and address relatives and family. Diversity in kinship terminology has been analyzed by anthropologists for over 150 years, although recurrent patterning across cultures remains incompletely explained. Despite the wealth of kinship data in the anthropological record, comparative studies of kinship terminology are hindered by data accessibility. Here we present Kinbank, a new database of 210,903 kinterms from a global …
10.1371/journal.pone.0283218
Language Phylogenies: Modelling the evolution of language.
Greenhill SJ. 2023. Language Phylogenies. The Oxford Handbook of Cultural Evolution, C61P1-C61P248.
Recent years have seen Bayesian phylogenetic methods from evolutionary biology applied to questions about language evolution in two major contexts. First, language phylogenies are now routinely used to make inferences and test hypotheses about human prehistory. Second, language phylogenies provide a solid backbone to test hypotheses about how aspects of language and culture have evolved in three key ways: by revealing the evolutionary dynamics, by modelling the trait history, and testing coevolutionary hypotheses. In this chapter I will survey this literature, present some case studies that …
10.1093/oxfordhb/9780198869252.013.61
Grambank’s Typological Advances Support Computational Research on Diverse Languages.
Haynie H, Blasi DE, Skirgård H, Greenhill SJ, Atkinson QD, & Gray RD. 2023. Grambank’s Typological Advances Support Computational Research on Diverse Languages. In Beinborn L, Goswami K, Muradoğlu S, Sorokin A, Kumar R, Shcherbakov A, Ponti EM, Cotterell R & Vylomova E (Eds). Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP (SIGTYP). Association for Computational Linguistics: Dubrovnik, Croatia.
Of approximately 7,000 languages around the world, only a handful have abundant computational resources. Extending the reach of language technologies to diverse, less-resourced languages is important for tackling the challenges of digital equity and inclusion. Here we introduce the Grambank typological database as a resource to support such efforts. To date, work that uses typological data to extend computational research to less-resourced languages has relied on cross-linguistic morphosyntax datasets that are sparsely populated, use categorical coding that can be difficult to interpret, and …
Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss.
Skirgård H, …, Greenhill SJ, Atkinson QD, & Gray RD. 2023. Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss. Science Advances, 9, eadg6175.
While global patterns of human genetic diversity are increasingly well characterized, the diversity of human languages remains less systematically described. Here, we outline the Grambank database. With over 400,000 data points and 2400 languages, Grambank is the largest comparative grammatical database available. The comprehensiveness of Grambank allows us to quantify the relative effects of genealogical inheritance and geographic proximity on the structural diversity of the world’s languages, evaluate constraints on linguistic diversity, and identify the world’s most unusual languages. An …
10.1126/sciadv.adg6175
Modelling admixture across language levels to evaluate deep history claims.
Hübler N & Greenhill SJ. 2023. Modelling admixture across language levels to evaluate deep history claims. Journal of Language Evolution.
The so-called ‘Altaic’ languages have been subject of debate for over 200 years. An array of different data sets have been used to investigate the genealogical relationships between them, but the controversy persists. The new data with a high potential for such cases in historical linguistics are structural features, which are sometimes declared to be prone to borrowing and discarded from the very beginning and at other times considered to have an especially precise historical signal reaching further back in time than other types of linguistic data. We investigate the performance of …
10.1093/jole/lzad002
A recent northern origin for the Uto-Aztecan family.
Greenhill SJ, Haynie H, Ross R, Chira A, List J-M, Campbell L, Botero C, & Gray R. 2023. A recent northern origin for the Uto-Aztecan family. Language.
The Uto-Aztecan language family is one of the largest language families in the Americas. However, there has been considerable debate about its origin and how it spread. Here we use Bayesian phylogenetic methods to analyze lexical data from thirty-four Uto-Aztecan varieties and two Kiowa-Tanoan languages. We infer the age of Proto-Uto-Aztecan to be around 4,100 years (3,258–5,025 years) and identify the most likely homeland to be near what is now Southern California. We reconstruct the most probable subsistence strategy in the ancestral Uto-Aztecan society and infer no casual or intensive …
10.1353/lan.0.0276
2022
Untangling the evolution of body-part terminology in Pano: conservative versus innovative traits in body-part lexicalization.
Zariquiey R, Vera J, Greenhill SJ, Valenzuela P, Gray RD, & List J-M. 2022. Untangling the evolution of body-part terminology in Pano: conservative versus innovative traits in body-part lexicalization. Interface Focus, 13(1).
Although language-family specific traits which do not find direct counterparts outside a given language family are usually ignored in quantitative phylogenetic studies, scholars have made ample use of them in qualitative investigations, revealing their potential for identifying language relationships. An example of such a family specific trait are body-part expressions in Pano languages, which are often lexicalized forms, composed of bound roots (also called body-part prefixes in the literature) and non-productive derivative morphemes (called here body-part formatives). We use various …
10.1098/rsfs.2022.0053
A quantitative global test of the complexity trade-off hypothesis: the case of nominal and verbal grammatical marking.
Shcherbakova O, Gast V, Blasi DE, Skirgård H, Gray RD, & Greenhill SJ. 2022. A quantitative global test of the complexity trade-off hypothesis: the case of nominal and verbal grammatical marking. Linguistics Vanguard.
Nouns and verbs are known to differ in the types of grammatical information they encode. What is less well known is the relationship between verbal and nominal coding within and across languages. The equi-complexity hypothesis holds that all languages are equally complex overall, which entails trade-offs between coding in different domains. From a diachronic point of view, this hypothesis implies that the loss and gain of coding in different domains can be expected to balance each other out. In this study, we test to what extent such inverse coevolution can be observed in a sample of 244 …
10.1515/lingvan-2021-0011
Grammatical complexity is only weakly influenced by the sociolinguistic environment.
Shcherbakova O, Michaelis SM, Haynie HJ, Greenhill SJ, Blasi DE, Gray RD, Gast V, & Skirgård H. 2022. Grammatical complexity is only weakly influenced by the sociolinguistic environment. Pp. 669-671, In Ravignani A, Asano R, Valente D, Ferretti F, Hartmann S, Hayashi M, Jadoul Y, Martins M, Oseki Y, Rodrigues ED, Vasileva O, & Wacewicz S. (Eds). Proceedings of the Joint Conference on Language Evolution (JCoLE). Nijmegen: Joint Conference on Language Evolution (JCoLE).
Recent studies claim that the social environment influences the evolution of language structures. In particular, grammatical complexity has been proposed to be lower in communities with looser social networks, higher numbers of L1 speakers, and higher proportions of L2 speakers (among others, Kusters 2003, Trudgill 2011, Lupyan & Dale 2010, Sinnemäki & Di Garbo 2018). The explanation for these relationships relies on the assumption that larger communities are exposed to more contact than smaller ones. Specifically, due to substantial proportions of L2 speakers in large communities, the more …
10.17617/2.3398549
A global analysis of matches and mismatches between human genetic and linguistic histories.
Barbieri C, Blasi DE, Arango-Isaza E, Sotiropoulos AG, Hammarström H, Wichmann S, Greenhill SJ, Gray RD, Forkel R, Bickel B, & Shimizu KK. 2022. A global analysis of matches and mismatches between human genetic and linguistic histories. Proceedings of the National Academy of Sciences, 119(47).
Human history is written in both our genes and our languages. The extent to which our biological and linguistic histories are congruent has been the subject of considerable debate, with clear examples of both matches and mismatches. To disentangle the patterns of demographic and cultural transmission, we need a global systematic assessment of matches and mismatches. Here, we assemble a genomic database (GeLaTo, or Genes and Languages Together) specifically curated to investigate genetic and linguistic diversity worldwide. We find that most populations in GeLaTo that speak languages of the same …
10.1073/pnas.2122084119
Phylogeographic analysis of the Bantu language expansion supports a rainforest route.
Koile E, Greenhill SJ, Blasi DE, Bouckaert R, & Gray RD. 2022. Phylogeographic analysis of the Bantu language expansion supports a rainforest route. Proceedings of the National Academy of Sciences, 119(32) e2112853119.
The Bantu expansion transformed the linguistic, economic, and cultural composition of sub-Saharan Africa. However, the exact dates and routes taken by the ancestors of the speakers of the more than 500 current Bantu languages remain uncertain. Here, we use the recently developed “break-away” geographical diffusion model, specially designed for modeling migrations, with “augmented” geographic information, to reconstruct the Bantu language family expansion. This Bayesian phylogeographic approach with augmented geographical data provides a powerful way of linking linguistic, archaeological, and …
10.1073/pnas.2112853119
Lexibank, a public repository of standardized wordlists with computed phonological and lexical features.
List JM, Forkel R, Greenhill SJ, Rzymski C, Englisch J & Gray RD. 2022. Lexibank, a public repository of standardized wordlists with computed phonological and lexical features. Scientific Data, 9(1): 316.
The past decades have seen substantial growth in digital data on the world’s languages. At the same time, the demand for cross-linguistic datasets has been increasing, as witnessed by numerous studies devoted to diverse questions on human prehistory, cultural evolution, and human cognition. Unfortunately, most published datasets lack standardization which makes their comparison difficult. Here, we present a new approach to increase the comparability of cross-linguistic lexical data. We have designed workflows for the computer-assisted lifting of datasets to Cross-Linguistic Data Formats, a …
10.1038/s41597-022-01432-0
Managing Historical Linguistic Data for Computational Phylogenetics and Computer-Assisted Language Comparison.
Tresoldi T, Rzymski C, Forkel R, Greenhill SJ, List JM, & Gray R. 2022. Managing historical linguistic data for computational phylogenetics and computer-assisted language comparison. In Andrea L. Berez-Kroeker, Bradley McDonnell, Eve Koller, & Lauren B. Collister (Eds). Open Handbook of Linguistic Data Management.
Computational phylogenetics is a relatively recent branch of historical linguistics that uses quantitative techniques to investigate the history of related languages. As the classical comparative method is less explicit on the techniques for constructing phylogenies of language families (see discussion in Jacques & List 2019), such a new approach can complement traditional techniques for sub-grouping based on shared innovations (Ross & Durie 1996).
10.7551/mitpress/12200.001.0001
2021
Global predictors of language endangerment and the future of linguistic diversity.
Bromham L, Dinnage R, Skirgård H, Ritchie A, Cardillo M, Meakins F, Greenhill S & Hua X. 2021. Global predictors of language endangerment and the future of linguistic diversity. Nature Ecology & Evolution, 6: 163–173.
Language diversity is under threat. While each language is subject to specific social, demographic and political pressures, there may also be common threatening processes. We use an analysis of 6,511 spoken languages with 51 predictor variables spanning aspects of population, documentation, legal recognition, education policy, socioeconomic indicators and environmental features to show that, counter to common perception, contact with other languages per se is not a driver of language loss. However, greater road density, which may encourage population movement, is associated with increased …
10.1038/s41559-021-01604-y
Games and enculturation: A cross-cultural analysis of cooperative goal structures in Austronesian games.
Leisterer-Peoples SM, Ross CT, Greenhill SJ, Hardecker S & Haun DBM. 2021. Games and enculturation: A cross-cultural analysis of cooperative goal structures in Austronesian games. PLOS ONE 16(11): e0259746.
While most animals play, only humans play games. As animal play serves to teach offspring important life-skills in a safe scenario, human games might, in similar ways, teach important culturally relevant skills. Humans in all cultures play games; however, it is not clear whether variation in the characteristics of games across cultural groups is related to group-level attributes. Here we investigate specifically whether the cooperativeness of games covaries with socio-ecological differences across cultural groups. We hypothesize that cultural groups that engage in frequent inter-group …
10.1371/journal.pone.0259746
Do languages and genes share cultural evolutionary history?
Greenhill SJ. 2021. Do languages and genes share cultural evolutionary history? Science Advances, eabm2472.
Languages and genes tell stories about the past but statistical analysis reveals that these are not always the same.
10.1126/sciadv.abm2472
Bayesian phylogenetic analysis of linguistic data using BEAST.
Hoffmann K, Bouckaert R, Greenhill SJ, & Kühnert D. 2021. Bayesian phylogenetic analysis of linguistic data using BEAST. Journal of Language Evolution, 6: 119–135.
Bayesian phylogenetic methods provide a set of tools to efficiently evaluate large linguistic datasets by reconstructing phylogenies—family trees—that represent the history of language families. These methods provide a powerful way to test hypotheses about prehistory, regarding the subgrouping, origins, expansion, and timing of the languages and their speakers. Through phylogenetics, we gain insights into the process of language evolution in general and into how fast individual features change in particular. This article introduces Bayesian phylogenetics as applied to languages. We describe …
10.1093/jole/lzab005
Pathways to social inequality.
Haynie H, Kavanagh PH, Jordan FM, Ember CR, Gray RD, Greenhill SJ, Kirby KR, Kushnick G, Low BS, Tuff T, Vilela B, Botero C, & Gavin MC. 2021. Pathways to social inequality. Evolutionary Human Sciences, 3, E35.
Social inequality is ubiquitous in contemporary human societies, and has deleterious social and ecological impacts. However, the factors that shape the emergence and maintenance of inequality remain widely debated. Here we conduct a global analysis of pathways to inequality by comparing 408 non-industrial societies in the anthropological record (described largely between 1860 and 1960) that vary in degree of inequality. We apply structural equation modelling to open-access environmental and ethnographic data and explore two alternative models varying in the links among factors proposed by …
10.1017/ehs.2021.32
Kin Against Kin: Internal Co-selection and the Coherence of Kinship Typologies.
Passmore S, Barth W, Quinn K, Greenhill SJ, Evans N, & Jordan FM. 2021. Kin Against Kin: Internal Co-selection and the Coherence of Kinship Typologies. Biological Theory, 16(3), 176–193.
Across the world people in different societies structure their family relationships in many different ways. These relationships become encoded in their languages as kinship terminology, a word set that maps variably onto a vast genealogical grid of kinship categories, each of which could in principle vary independently. But the observed diversity of kinship terminology is considerably smaller than the enormous theoretical design space. For the past century anthropologists have captured this variation in typological schemes with only a small number of model system types. Whether those types …
10.1007/s13752-021-00379-6
Blowing in the wind: Using ‘North Wind and the Sun’ texts to sample phoneme inventories.
Baird L, Evans N, & Greenhill SJ. 2021. Blowing in the wind: Using 'North Wind and the Sun' texts to sample phoneme inventories. Journal of the International Phonetic Association, 1-42.
Language documentation faces a persistent and pervasive problem: How much material is enough to represent a language fully? How much text would we need to sample the full phoneme inventory of a language? In the phonetic/phonemic domain, what proportion of the phoneme inventory can we expect to sample in a text of a given length? Answering these questions in a quantifiable way is tricky, but asking them is necessary. The cumulative collection of Illustrative Texts published in the Illustration series in this journal over more than four decades (mostly renditions of the ‘North Wind and the Sun’) …
10.1017/S002510032000033X
Historical, archaeological and linguistic evidence test the phylogenetic inference of Viking-Age plant use.
Teixidor-Toneu I, Kool A, Greenhill SJ, Kjesrud K, Sandstedt JJ, Manzanilla V, Jordan FM. 2021. Historical, archaeological and linguistic evidence test the phylogenetic inference of Viking-Age plant use. Philosophical Transactions of the Royal Society B: Biological Sciences, 376, 20200086.
In this paper, past plant knowledge serves as a case study to highlight the promise and challenges of interdisciplinary data collection and interpretation in cultural evolution. Plants are central to human life and yet, apart from the role of major crops, people–plant relations have been marginal to the study of culture. Archaeological, linguistic, and historical evidence are often limited when it comes to studying the past role of plants. This is the case in the Nordic countries, where extensive collections of various plant use records are absent until the 1700s. Here, we test if relatively …
10.1098/rstb.2020.0086
The uses and abuses of tree thinking in cultural evolution.
Evans CL, Greenhill SJ, Watts J, List JM, Botero CA, Gray RD, & Kirby KR. 2021. The uses and abuses of tree thinking in cultural evolution. Philosophical Transactions of the Royal Society B: Biological Sciences, 376, 20200056.
Modern phylogenetic methods are increasingly being used to address questions about macro-level patterns in cultural evolution. These methods can illuminate the unobservable histories of cultural traits and identify the evolutionary drivers of trait change over time, but their application is not without pitfalls. Here, we outline the current scope of research in cultural tree thinking, highlighting a toolkit of best practices to navigate and avoid the pitfalls and 'abuses' associated with their application. We emphasize two principles that support the appropriate application of phylogenetic …
10.1098/rstb.2020.0056
The Austronesian Game Taxonomy: A cross-cultural dataset of historical games.
Leisterer-Peoples SM, Hardecker S, Watts J, Greenhill SJ, Ross CT & Haun DBM. 2021. The Austronesian Game Taxonomy: A cross-cultural dataset of historical games. Humanities & Social Sciences Communications, 8, 113.
Humans in most cultures around the world play rule-based games, yet research on the content and structure of these games is limited. Previous studies investigating rule-based games across cultures have either focused on a small handful of cultures, thus limiting the generalizability of findings, or used cross-cultural databases from which the raw data are not accessible, thus limiting the transparency, applicability, and replicability of research findings. Furthermore, games have long been defined as competitive interactions, thereby blinding researchers to the cross-cultural variation in the …
10.1057/s41599-021-00785-y
2020
Bayesian Phylolinguistics.
Greenhill SJ, Heggarty P, & Gray RD. 2020. Bayesian Phylolinguistics. In Janda RD, Joseph BD, & Vance BS (Eds) The Handbook of Historical Linguistics, Volume II, pp. 226–253. Wiley-Blackwell: New Jersey.
Change is coming to historical linguistics. Big, or at least “big‐ish” data (Gray and Watts 2017), are now becoming increasingly available in the form of large web‐accessible lexical, typological, and phonological databases (e.g., Greenhill et al. 2008, Bowern 2016, Moran et al. 2014, Dryer and Haspelmath 2013, Bickel et al. 2017) and the soon to be released Lexibank, Grambank, Parabank, and Numeralbank (http://www.shh.mpg.de/180672/glottobank). This deluge of data is way beyond the ability of any one person to process accurately in their head. The deluge will thus inevitably drive the …
10.1002/9781118732168.ch11
CHIELD: the causal hypotheses in evolutionary linguistics database.
Roberts SG, Killin A, Deb A, Sheard C, Greenhill SJ, Sinnemäki K, …, & Jordan F. 2020. CHIELD: the causal hypotheses in evolutionary linguistics database. The Journal of Language Evolution.
Language is one of the most complex of human traits. There are many hypotheses about how it originated, what factors shaped its diversity, and what ongoing processes drive how it changes. We present the Causal Hypotheses in Evolutionary Linguistics Database (CHIELD, https://chield.excd.org/), a tool for expressing, exploring, and evaluating hypotheses. It allows researchers to integrate multiple theories into a coherent narrative, helping to design future research. We present design goals, a formal specification, and an implementation for this database. Source code is freely available for …
Abstract PDF 10.1093/jole/lzaa001 Overview
Phylogenetic exploration of language complexity in Austronesian, Bantu, and Indo-European Language Families.
Shcherbakova O, Skirgård H, & Greenhill SJ. 2020. Phylogenetic exploration of language complexity in Austronesian, Bantu, and Indo-European Language Families. Pp. 411-413, In Ravignani A, Barbieri C, Flaherty M, Jadoul Y, Lattenkamp E, Little H, Martins M, Mudd K, & Verhoef T (Eds). The Evolution of Language: Proceedings of the 13th International Conference (Evolang13). Nijmegen: The Evolution of Language Conferences.
While language complexity has received attention from sociolinguistic, psycholinguistic, and computational perspectives, the processes of simplification and complexification over time remain challenging to examine and explain. One strand of research focuses on complexity ‘tradeoffs’ and ‘local complexity’ asking whether complexification in one grammatical domain necessitates simplification in another so that all languages are 'equi-complex' (Miestamo, 2009, Sinnemäki, 2008). The tradeoffs may or may not occur between different language systems, such as phonetics and morphology (Shosted, 2006), …
Abstract PDF
The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies.
Rzymski C, Tresoldi T, Greenhill SJ, Wu M-S, Schweikhard NE, Koptjevskaja-Tamm M, Gast V, et al. 2020. The Database of Cross-Linguistic Colexifications, Reproducible Analysis of Cross-Linguistic Polysemies. Scientific Data 7 (1): 1–12.
Advances in computer-assisted linguistic research have been greatly influential in reshaping linguistic research. With the increasing availability of interconnected datasets created and curated by researchers, more and more interwoven questions can now be investigated. Such advances, however, are bringing high requirements in terms of rigorousness for preparing and curating datasets. Here we present CLICS, a Database of Cross-Linguistic Colexifications (CLICS). CLICS tackles interconnected interdisciplinary research questions about the colexification of words across semantic categories in the …
Abstract PDF 10.1038/s41597-019-0341-x Overview
2019
Emotion semantics show both cultural variation and universal structure.
Jackson JC, Watts J, Henry TR, List JM, Forkel R, Mucha PJ, Greenhill SJ, Gray RD, & Lindquist KA. 2019. Emotion semantics show both cultural variation and universal structure. Science, 366, 1517-1522.
It is unclear whether emotion terms have the same meaning across cultures. Jackson et al. examined nearly 2500 languages to determine the degree of similarity in linguistic networks of 24 emotion terms across cultures (see the Perspective by Majid). There were low levels of similarity, and thus high variability, in the meaning of emotion terms across cultures. Similarity of emotion terms could be predicted on the basis of the geographic proximity of the languages they originate from, their hedonic valence, and the physiological arousal they evoke. Many human languages have words for emotions …
Abstract PDF 10.1126/science.aaw8160 Overview
Dated language phylogenies shed light on the ancestry of Sino-Tibetan.
Sagart L, Jacques G, Lai Y, Ryder RJ, Thouzeau V, Greenhill SJ, List J-M. 2019. Dated language phylogenies shed light on the ancestry of Sino-Tibetan. Proceedings of the National Academy of Sciences, 201817972.
Given its size and geographical extension, Sino-Tibetan is of the highest importance for understanding the prehistory of East Asia, and of neighboring language families. Based on a dataset of 50 Sino-Tibetan languages, we infer phylogenies that date the origin of the language family to around 7200 B.P., linking the origin of the language family with the late Cishan and the early Yangshao cultures. The Sino-Tibetan language family is one of the world’s largest and most prominent families, spoken by nearly 1.4 billion people. Despite the importance of the Sino-Tibetan languages, …
Abstract PDF 10.1073/pnas.1817972116 Overview
The ecological drivers of variation in global language diversity.
Hua X, Greenhill SJ, Cardillo M, Schneemann H & Bromham L. 2019. The ecological drivers of variation in global language diversity. Nature Communications, 10, 2047.
Language diversity is distributed unevenly over the globe. Intriguingly, patterns of language diversity resemble biodiversity patterns, leading to suggestions that similar mechanisms may underlie both linguistic and biological diversification. Here we present the first global analysis of language diversity that compares the relative importance of two key ecological mechanisms – isolation and ecological risk – after correcting for spatial autocorrelation and phylogenetic non-independence. We find significant effects of climate on language diversity, consistent with the ecological risk …
Abstract PDF 10.1038/s41467-019-09842-2
Drivers of geographical patterns of North American language diversity.
Pacheco Coelho MT, Barreto Pereira E, Haynie HJ, Rangel TF, Kavanagh P, Kirby KR, Greenhill SJ, Bowern C, Gray RD, Colwell RK, Evans N, & Gavin MC. 2019. Drivers of geographical patterns of North American language diversity. Proceedings of the Royal Society, B, Biological Sciences, 286: 20190242.
Although many hypotheses have been proposed to explain why humans speak so many languages and why languages are unevenly distributed across the globe, the factors that shape geographical patterns of cultural and linguistic diversity remain poorly understood. Prior research has tended to focus on identifying universal predictors of language diversity, without accounting for how local factors and multiple predictors interact. Here, we use a unique combination of path analysis, mechanistic simulation modelling, and geographically weighted regression to investigate the broadly described, but …
Abstract PDF 10.1098/rspb.2019.0242
2018
Treemaker.
Greenhill SJ. 2018. Treemaker: A Python library for creating a Newick formatted tree from a set of classification strings. Journal of Open Source Software, 3(31), 1040.
treemaker is a Python library to convert a text-based classification schema into a Newick file for use in phylogenetic and bioinformatic programs. Research in linguistics or cultural evolution often produces or uses tree taxonomies or classifications. However, these are usually not in a format readily available for use in programs that can understand and manipulate trees.
Abstract PDF 10.21105/joss.01040 Code Website
Parasites and politics: why cross-cultural studies must control for relatedness, proximity and covariation.
Bromham L, Hua X, Cardillo M, Schneemann H & Greenhill SJ. 2018. Parasites and politics: why cross-cultural studies must control for relatedness, proximity and covariation. Royal Society Open Science, 5, 181100.
A growing number of studies seek to identify predictors of broad-scale patterns in human cultural diversity, but three sources of non-independence in human cultural variables can bias the results of cross-cultural studies. First, related cultures tend to have many traits in common, regardless of whether those traits are functionally linked. Second, societies in geographical proximity will share many aspects of culture, environment and demography. Third, many cultural traits covary, leading to spurious relationships between traits. Here, we demonstrate tractable methods for dealing with all …
Abstract PDF 10.1098/rsos.181100
CLICS2: An Improved Database of Cross-Linguistic Colexifications Assembling Lexical Data with Help of Cross-Linguistic Data Formats.
List J-M, Greenhill SJ, Anderson C, Mayer T, Tresoldi T & Forkel R. 2018. CLICS2: An Improved Database of Cross-Linguistic Colexifications Assembling Lexical Data with Help of Cross-Linguistic Data Formats. Linguistic Typology, 22: 277-306.
The Database of Cross-Linguistic Colexifications (CLICS), has established a computer-assisted framework for the interactive representation of cross-linguistic colexification patterns. In its current form, it has proven to be a useful tool for various kinds of investigation into cross-linguistic semantic associations, ranging from studies on semantic change, patterns of conceptualization, and linguistic paleontology. But CLICS has also been criticized for obvious shortcomings, ranging from the underlying dataset, which still contains many errors, up to the limits of cross-linguistic …
Abstract PDF 10.1515/lingty-2018-0010 Website
Sequence Comparison in Computational Historical Linguistics: Phonetic Alignments and Cognate Detection with LingPy 2.6.
List J-M, Forkel R, Greenhill SJ, Tresoldi T & Walworth M. 2018. Sequence Comparison in Computational Historical Linguistics: Phonetic Alignments and Cognate Detection with LingPy 2.6. Journal of Language Evolution, 3(2): 130-144.
With increasing amounts of digitally available data from all over the world, manual annotation of cognates in multilingual word lists becomes more and more time-consuming in historical linguistics. Using available software packages to pre-process the data prior to manual analysis can drastically speed up the process of cognate detection. Furthermore, it allows us to get a quick overview on data which has not yet been intensively studied by experts. LingPy is a Python library which provides a large arsenal of routines for sequence comparison in historical linguistics. With LingPy, linguists can …
Abstract PDF 10.1093/jole/lzy006
Post-Marital Residence Patterns Show Lineage-Specific Evolution.
Moravec JC, Atkinson QD, Bowern C, Greenhill SJ, Jordan D, Ross RM, Gray RD, Marsland S & Cox MP. 2018. Post-Marital Residence Patterns Show Lineage-Specific Evolution. Evolution and Human Behavior, 39(6): 594-601.
Where a newly-married couple lives, termed post marital residence, varies cross-culturally and changes over time. While many factors have been proposed as drivers of this change, among them general features of human societies like warfare, migration and gendered division of subsistence labour, little is known about whether changes in residence patterns exhibit global regularities. Here, we study ethnographic observations of post-marital residence in societies from five large language families (Austronesian, Bantu, Indo-European, Pama-Nyungan and Uto-Aztecan), encompassing 371 ethnolinguistic …
Abstract PDF 10.1016/j.evolhumbehav.2018.06.002
Comment on Pigoli et al.
Greenhill SJ. 2018. Comment on Pigoli et al. Journal of the Royal Statistical Society, C, Applied Statistics, 67(5): 1135-1136.
I thank Pigoli et al. for their interesting contribution. One question: have they evaluated the fit of their data onto the family tree of these languages? All the languages analysed are Romance languages, but there are degrees of relatedness within this grouping (e.g. Italian is the most divergent, while Portuguese is closer to Spanish than French). Is this analysis able to correctly identify these nested patterns? Rather than averaging across words, are the authors able to align each word separately in discrete manner and use this to recover phylogeny?
Abstract Get Paper 10.1111/rssc.12258
Cross-Linguistic Data Formats, advancing data sharing and re-use in comparative linguistics.
Forkel R, List J-M, Greenhill SJ, Bank S, Rzymski C, Cysouw M, Hammarström H, Haspelmath M, Kaiping GA & Gray RD. 2018. Cross-linguistic Data Formats, advancing data sharing and reuse in comparative linguistics. Scientific Data, 5:180205.
The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.g. parallel texts, and dictionaries). The new specification for cross-linguistic data formats comes along with a software package for …
Abstract PDF 10.1038/sdata.2018.205 Website Overview
What smartphone apps may contribute to language evolution research.
Morin O, Winters J, Müller T, Morisseau T, Etter C & Greenhill SJ. 2018. What smartphone apps may contribute to language evolution research. Journal of Language Evolution, 3(2): 91-93.
Unlike a standard online experiment, a gaming app lets participants interact freely with a vast number of partners, as many times as they wish. The gain is not merely one of statistical power. Cultural evolutionists can use gaming apps to allow large numbers of participants to communicate synchronously; to build realistic transmission chains that avoid the losses of information that occur in linear chains; and to study the effects of partner choice as well as partner control in social interactions. We are releasing an app designed to take advantage of these opportunities and generate …
Abstract Get Paper 10.1093/jole/lzy005
Population Size and the Rate of Language Evolution: A Test Across Indo-European, Austronesian, and Bantu Languages.
Greenhill SJ, Hua X, Welsh CF, Schneemann H & Bromham L. 2018. Population Size and the Rate of Language Evolution: A Test Across Indo-European, Austronesian, and Bantu Languages. Frontiers in Psychology, 9:576.
What role does speaker population size play in shaping rates of language evolution? There has been little consensus on the expected relationship between rates and patterns of language change and speaker population size, with some predicting faster rates of change in smaller populations, and others expecting greater change in larger populations. The growth of comparative databases has allowed population size effects to be investigated across a wide range of language groups, with mixed results. One recent study of a group of Polynesian languages revealed greater rates of word gain in larger …
Abstract PDF 10.3389/fpsyg.2018.00576
A Bayesian phylogenetic study of the Dravidian language family.
Kolipakam V, Jordan FM, Dunn M, Greenhill SJ, Bouckaert R, Gray RD & Verkerk A. 2018. A Bayesian phylogenetic study of the Dravidian language family. Royal Society Open Science 5: 171504.
The Dravidian language family consists of about 80 varieties (Hammarström H. 2016 Glottolog 2.7) spoken by 220 million people across southern and central India and surrounding countries (Steever SB. 1998 In The Dravidian languages (ed. SB Steever), pp. 1–39: 1). Neither the geographical origin of the Dravidian language homeland nor its exact dispersal through time are known. The history of these languages is crucial for understanding prehistory in Eurasia, because despite their current restricted range, these languages played a significant role in influencing other language groups including …
Abstract PDF 10.1098/rsos.171504 Overview
2017
The evolutionary dynamics of language systems.
Greenhill SJ, Wu C-H, Hua X, Dunn M, Levinson S, & Gray RD. 2017. The evolutionary dynamics of language systems. Proceedings of the National Academy of Sciences: USA, 114(42):E8822-E8829.
Understanding how and why language subsystems differ in their evolutionary dynamics is a fundamental question for historical and comparative linguistics. One key dynamic is the rate of language change. While it is commonly thought that the rapid rate of change hampers the reconstruction of deep language relationships beyond 6,000–10,000 y, there are suggestions that grammatical structures might retain more signal over time than other subsystems, such as basic vocabulary. In this study, we use a Dirichlet process mixture model to infer the rates of change in lexical and grammatical data from 81 …
Abstract PDF 10.1073/pnas.1700388114 Overview
The Potential of Automatic Word Comparison for Historical Linguistics.
List J-M, Greenhill SJ, & Gray RD. 2017. The Potential of Automatic Word Comparison for Historical Linguistics. PLoS ONE 12(1): e0170046.
The amount of data from languages spoken all over the world is rapidly increasing. Traditional manual methods in historical linguistics need to face the challenges brought by this influx of data. Automatic approaches to word comparison could provide invaluable help to pre-analyze data which can be later enhanced by experts. In this way, computational approaches can take care of the repetitive and schematic tasks leaving experts to concentrate on answering interesting questions. Here we test the potential of automatic methods to detect etymologically related words (cognates) in cross-linguistic …
Abstract PDF 10.1371/journal.pone.0170046
2016
A Combined Comparative and Phylogenetic Analysis of the Chapacuran Language Family.
Birchall J, Dunn M & Greenhill SJ. 2016. A Combined Comparative and Phylogenetic Analysis of the Chapacuran Language Family. International Journal of American Linguistics 82(3). 255–284.
The Chapacuran language family, with three extant members and nine historically attested lects, has yet to be classified following modern standards in historical linguistics. This paper presents an internal classification of these languages by combining both the traditional comparative method (CM) and Bayesian phylogenetic inference (BPI). We identify multiple systematic sound correspondences and 285 cognate sets of basic vocabulary using the available documentation. These allow us to reconstruct a large portion of the Proto-Chapacuran phonemic inventory and identify tentative major …
Abstract Get Paper 10.1086/687383
D-PLACE: A Global Database of Cultural, Linguistic and Environmental Diversity.
Kirby KR, Gray RD, Greenhill SJ, Jordan FM, Gomes-Ng S, Bibiko H-J, Blasi D, Botero CA, Bowern C, Ember CR, Leehr D, Low BS, McCarter J, Divale W, Gavin MC. 2016. D-PLACE: A Global Database of Cultural, Linguistic and Environmental Diversity. PLoS ONE 11(7): e0158391.
From the foods we eat and the houses we construct, to our religious practices and political organization, to who we can marry and the types of games we teach our children, the diversity of cultural practices in the world is astounding. Yet, our ability to visualize and understand this diversity is limited by the ways it has been documented and shared: on a culture-by-culture basis, in locally-told stories or difficult-to-access repositories. In this paper we introduce D-PLACE, the Database of Places, Language, Culture, and Environment. This expandable and open-access database (accessible at …
Abstract PDF 10.1371/journal.pone.0158391 Code Website
Cultural and Environmental Predictors of Pre-European Deforestation on Pacific Islands.
Atkinson QD, Coomber T, Passmore S, Greenhill SJ, & Kushnick G. 2016. Cultural and Environmental Predictors of Pre-European Deforestation on Pacific Islands. PLoS ONE 11(5): e0156340.
The varied islands of the Pacific provide an ideal natural experiment for studying the factors shaping human impact on the environment. Previous research into pre-European deforestation across the Pacific indicated a major effect of environment but did not account for cultural variation or control for dependencies in the data due to shared cultural ancestry and geographic proximity. The relative importance of environment and culture on Pacific deforestation and forest replacement and the extent to which environmental impact is constrained by cultural ancestry therefore remain unexplored. Here …
Abstract PDF 10.1371/journal.pone.0156340
Overview: Debating the effect of environment on language.
Greenhill SJ. 2016. Overview: Debating the effect of environment on language. Journal of Language Evolution, 1, 30-32.
If languages do indeed evolve then they must show the three crucial aspects of an evolving system: variation of traits, inheritance of those traits, and the differential survival—that is selection—of those traits (Lewontin 1970). We know that languages vary (otherwise the fields of linguistic typology and sociolinguistics would be boring). We know that variation is passed through speech communities and inherited from parent language to its daughters (otherwise historical linguistics would likewise be boring). However, whether linguistic traits are selected for is much less clear (Ramsey and De …
Abstract PDF 10.1093/jole/lzv007
Phylogemetric: A Python library for calculating phylogenetic network metrics.
Greenhill SJ. 2016. Phylogemetric: A Python library for calculating phylogenetic network metrics. Journal of Open Source Software, 1(2), 28.
Phylogemetric is a Python library for calculating the δ-score (Holland et al. 2002) and Q-Residual (Gray, Bryant, and Greenhill 2010) for phylogenetic data. These methods are used in studies of linguistic and cultural evolution to quantify reticulation in data. This Python library provides a command-line script interface for use on Nexus-formatted data (D. R. Maddison, Swofford, and Maddison 1997), and an importable Python library for use on any binary matrix.
Abstract PDF 10.21105/joss.00028 Code Website
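For readers unfamiliar with the δ-score mentioned in the Phylogemetric abstract, the following is a minimal standalone sketch of the quartet-based calculation described by Holland et al. (2002), applied to a binary matrix. It does not use or reproduce the Phylogemetric library: the function names `hamming` and `delta_scores` are hypothetical, and the choice of a simple Hamming distance with no handling of missing data is an assumption made for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Proportion of sites at which two equal-length rows differ (assumed distance)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def delta_scores(matrix):
    """Return the mean quartet delta score per taxon for {taxon: binary row}."""
    taxa = sorted(matrix)
    if len(taxa) < 4:
        raise ValueError("delta scores need at least four taxa")
    # Pairwise distances, keyed by unordered taxon pair.
    dist = {
        frozenset(pair): hamming(matrix[pair[0]], matrix[pair[1]])
        for pair in combinations(taxa, 2)
    }
    totals = {t: 0.0 for t in taxa}
    counts = {t: 0 for t in taxa}
    for x, y, u, v in combinations(taxa, 4):
        # The three ways of splitting a quartet into two pairs.
        sums = sorted(
            [
                dist[frozenset((x, y))] + dist[frozenset((u, v))],
                dist[frozenset((x, u))] + dist[frozenset((y, v))],
                dist[frozenset((x, v))] + dist[frozenset((y, u))],
            ],
            reverse=True,
        )
        denom = sums[0] - sums[2]
        # 0 for perfectly tree-like quartets, up to 1 for maximal conflict.
        delta = (sums[0] - sums[1]) / denom if denom > 0 else 0.0
        for t in (x, y, u, v):
            totals[t] += delta
            counts[t] += 1
    return {t: totals[t] / counts[t] for t in taxa}


tree_like = {"A": [1, 1, 0, 0], "B": [1, 1, 0, 0], "C": [0, 0, 1, 1], "D": [0, 0, 1, 1]}
conflicting = {"A": [0, 0], "B": [1, 0], "C": [0, 1], "D": [1, 1]}
print(delta_scores(tree_like))
print(delta_scores(conflicting))
```

In this toy data the first matrix splits cleanly into two pairs and scores 0 for every taxon, while the second has two equally supported, incompatible splits and scores 1.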
2015
Links between language diversity and species richness can be confounded by spatial autocorrelation.
Cardillo M, Bromham L, Greenhill SJ. 2015. Links between language diversity and species richness can be confounded by spatial autocorrelation. Proceedings of the Royal Society of London B, 282: 20142986.
Turvey & Pettorelli [1] present a fascinating study exploring links between biological and linguistic diversity across New Guinea. With the world’s highest linguistic diversity (around 900 languages, an average of one language per 1000 km² [2]), as well as the high biodiversity characteristic of a large mountainous tropical island, New Guinea is an ideal test case for investigating patterns and drivers of biocultural diversity. Turvey & Pettorelli's finding that numbers of languages and mammal species are correlated across grid cells in New Guinea is consistent with studies in other parts of …
Abstract PDF 10.1098/rspb.2014.2986
Broad supernatural punishment but not moralizing high gods precede the evolution of political complexity.
Watts J, Greenhill SJ, Atkinson QD, Currie TE, Bulbulia J & Gray RD. 2015. Broad supernatural punishment but not moralizing high gods precede the evolution of political complexity in Austronesia. Proceedings of the Royal Society B, 20142556.
Supernatural belief presents an explanatory challenge to evolutionary theorists -- it is both costly and prevalent. One influential functional explanation claims that the imagined threat of supernatural punishment can suppress selfishness and enhance cooperation. Specifically, morally concerned supreme deities or 'moralising high gods' have been argued to reduce free-riding in large social groups, enabling believers to build the kind of complex societies that define modern humanity. Previous cross-cultural studies claiming to support the moralising high god hypothesis rely on correlational …
Abstract PDF 10.1098/rspb.2014.2556 Overview
TransNewGuinea.org: An Online Database of New Guinea Languages.
Greenhill SJ. 2015. TransNewGuinea.org: An Online Database of New Guinea Languages. PLoS ONE 10(10): e0141563.
The island of New Guinea has the world’s highest linguistic diversity, with more than 900 languages divided into at least 23 distinct language families. This diversity includes the world’s third largest language family: Trans-New Guinea. However, the region is one of the world’s least well studied, and primary data is scattered across a wide range of publications and more often than not hidden in unpublished “gray” literature. The lack of primary research data on the New Guinea languages has been a major impediment to our understanding of these languages, and the history of the peoples in New …
Abstract PDF 10.1371/journal.pone.0141563 Website Overview
Pulotu: Database of Austronesian Supernatural Beliefs and Practices.
Watts J, Sheehan O, Greenhill SJ, Gomes-Ng S, Atkinson QD, Bulbulia J & Gray RD. 2015. Pulotu: Database of Austronesian Supernatural Beliefs and Practices. PLoS ONE 10(9): e0136783.
Scholars have debated naturalistic theories of religion for thousands of years, but only recently have scientists begun to test predictions empirically. Existing databases contain few variables on religion, and are subject to Galton's Problem because they do not sufficiently account for the non-independence of cultures or systematically differentiate the traditional states of cultures from their contemporary states. Here we present Pulotu: the first quantitative cross-cultural database purpose-built to test evolutionary hypotheses of supernatural beliefs and practices. The Pulotu database …
Abstract PDF 10.1371/journal.pone.0136783
Rate of language evolution is affected by population size.
Bromham L, Hua X, Fitzpatrick T, & Greenhill SJ. 2015. Rate of language evolution is affected by population size. Proceedings of the National Academy of Sciences, USA. 201419704.
The effect of population size on patterns and rates of language evolution is controversial. Do languages with larger speaker populations change faster due to a greater capacity for innovation, or do smaller populations change faster due to more efficient diffusion of innovations? Do smaller populations suffer greater loss of language elements through founder effects or drift, or do languages with more speakers lose features due to a process of simplification? Revealing the influence of population size on the tempo and mode of language evolution not only will clarify underlying mechanisms of …
Abstract PDF 10.1073/pnas.1419704112 Overview
Evolution and Language: Phylogenetic Analyses.
Greenhill SJ. 2015. Evolution and Language: Phylogenetic Analyses. In The International Encyclopedia of the Social and Behavioral Sciences, 2nd Edition. Wright, JD (Ed). Elsevier: Oxford.
Language phylogenies are a potentially powerful way to answer questions about how languages and cultures evolve. Recently phylogenetic methods have been applied to answer a range of questions about the evolution of human languages and cultures. This chapter reviews the historical background of these approaches and provides a detailed methodological overview. Three different applications of phylogenetic methods are discussed: how language phylogenies can be used to test population dispersal hypotheses, to investigate processes in language evolution, and to infer patterns in cultural evolution. …
Abstract
2014
Research priorities in historical-comparative linguistics: A view from Asia, Australia and the Pacific.
Koch H, Mailhammer R, Blust R, Bowern C, Daniels D, François A, Greenhill SJ, Joseph B, Reid L, Ross M & Sidwell P. 2014. Research priorities in historical-comparative linguistics: A view from Asia, Australia and the Pacific. Diachronica, 31:2, 267-278.
The first issue of Diachronica contained an evaluation of the comparative method as applied to “exotic” languages (Boretzky 1984). Thirty years later, it is worth taking stock of what our discipline has accomplished and identifying future priorities and pressing issues that have (re-)emerged. The following represents the considered judgement of several practitioners in language families from a large region of the world that is underrepresented in international fora. The ideas were first presented during the 20th International Conference on Historical Linguistics (ICHL 20), Osaka, Japan, 2011.
Abstract PDF 10.1075/dia.31.2.04koc
The evolution of traditional knowledge: environment shapes medicinal plant use in Nepal.
Saslis-Lagoudakis CH, Hawkins JA, Greenhill SJ, Pendry CA, Watson MF, Tuladhar-Douglas W, Baral SR & Savolainen V. 2014. The evolution of traditional knowledge: Environment shapes medicinal plant use in Nepal. Proceedings of the Royal Society B, 281: 20132768.
Traditional knowledge is influenced by ancestry, inter-cultural diffusion and interaction with the natural environment. It is problematic to assess the contributions of these influences independently because closely related ethnic groups may also be geographically close, exposed to similar environments and able to exchange knowledge readily. Medicinal plant use is one of the most important components of traditional knowledge, since plants provide healthcare for up to 80% of the world's population. Here, we assess the significance of ancestry, geographical proximity of cultures and the …
Abstract PDF 10.1098/rspb.2013.2768
Demographic correlates of language diversity.
Greenhill SJ. 2014. Demographic correlates of language diversity. In Bowern C & Evans B (Eds). The Routledge Handbook of Historical Linguistics. Routledge: London.
Why do some languages change at a different rate to others? What causes one language to change faster than another? Why do some language families have many languages and why do some families only have a few? According to the latest version of the Ethnologue (Lewis 2013), there are 7547 languages in the world divided into at least 289 language families (including isolates). However, there is substantial variation in the number of languages (what I will call diversity) in each family. The Niger-Congo and Austronesian language families contain 1543 and 1255 languages respectively, about 37% …
Abstract PDF
2013
First Shots Fired For The Phylogenetic Revolution in Religious Studies: a Commentary on David Sloan Wilson.
Bulbulia J, Atkinson QD, Greenhill SJ & Gray RD. 2013. First Shots Fired For The Phylogenetic Revolution in Religious Studies: a Commentary on David Sloan Wilson. Cliodynamics.
Wilson’s target article illustrates how evolutionary hypotheses are advancing the science of complex cultural systems. We agree. The following extends the conversation to consider the benefits of evolutionary methods. We restrict our review to computational phylogenetic methods as these are being used to test evolutionary hypotheses about religions.
Abstract PDF 10.21237/C7clio4119066
Why do religious cultures evolve slowly? The cultural evolution of cooperative calling and the historical study of religions.
Bulbulia J, Atkinson QD, Gray RD & Greenhill SJ. 2013. Why do religious cultures evolve slowly? The cultural evolution of cooperative calling and the historical study of religions. In: Mind, Morality and Magic: Cognitive Science Approaches in Biblical Studies. I. Czachesz & R. Uro (Eds.), Durham. Acumen.
The languages and folkways of ancient peoples hold little relevance for us, except in one respect: the religions of the ancient world remain our religions. Though religions change, core features of the scriptures and rituals of the world’s most popular religious traditions appear to have been conserved with remarkably high fidelity. We suggest how this striking conservation may be explained from an evolutionary model for religious cooperation according to which slow religious change facilitates cooperation among strangers. At the end, we clarify how historians of religion, in collaboration …
Abstract PDF
A Lexicostatistical Study of the Khasian Languages: Khasi, Pnar, Lyngngam, and War.
Nagaraja KS, Sidwell P & Greenhill SJ. 2013. A Lexicostatistical Study of the Khasian Languages: Khasi, Pnar, Lyngngam, and War. Mon-Khmer Studies Journal, 42, 1-11.
This paper presents the results of lexicostatistical, glottochronological, and Bayesian phylogenetic analyses of a 200 word data set for Standard Khasi, Lyngngam, Pnar and War. The present analysis supports both the strong identity of Khasian as a unitary branch, with an internally nested branching structure that fits neatly with known historical, geographical and linguistic facts. Additionally, lexically based dating methods suggest that the internal diversification of Khasian began roughly between 1500 and 2000 years ago.
Abstract PDF
Population structure and cultural geography of a folktale in Europe.
Ross RM, Greenhill SJ & Atkinson QD. 2013. Population structure and cultural geography of a folktale in Europe. Proceedings of the Royal Society, B. 280, 20123065.
Despite a burgeoning science of cultural evolution, relatively little work has focused on the population structure of human cultural variation. By contrast, studies in human population genetics use a suite of tools to quantify and analyse spatial and temporal patterns of genetic variation within and between populations. Human genetic diversity can be explained largely as a result of migration and drift giving rise to gradual genetic clines, together with some discontinuities arising from geographical and cultural barriers to gene flow. Here, we adapt theory and methods from population genetics …
Abstract PDF 10.1098/rspb.2012.3065
Phylogenetic models of language change: Three new questions.
Gray RD, Greenhill SJ & Atkinson QD. 2013. Phylogenetic models of language change: Three new questions. In Richerson PJ and Christiansen MH (Eds). Cultural Evolution: Society, Technology, Language, and Religion. MIT Press: Cambridge.
Computational methods derived from evolutionary biology are increasingly being applied to the study of cultural evolution. This is particularly the case in studies of language evolution, where phylogenetic methods have recently been used to test hypotheses about divergence dates, rates of lexical change, borrowing, and putative language universals. This chapter outlines three new and related questions that could be productively tackled with computational phylogenetic methods: What drives language diversification? What drives differences in the rate of linguistic change (disparity)? Can we …
Abstract PDF
2012
Basic vocabulary and Bayesian phylolinguistics: Issues of understanding and representation.
Greenhill SJ & Gray RD. 2012. Basic vocabulary and Bayesian phylolinguistics: Issues of understanding and representation. Diachronica, 29(4): 523-537.
Donohue et al.’s critique of our work on the origins and spread of the Austronesian language family is marred by misunderstandings of our approach. We respond to these by noting that our Bayesian phylogenetic approach: (1) distinguishes between retentions and innovations probabilistically, (2) focuses on basic vocabulary not ‘the lexicon’, (3) eliminates known loanwords, (4) produces results that are congruent with the results of the comparative method and conflict with the scenarios requiring unprecedented amounts of language shift postulated by Donohue et al.
Abstract PDF Supplementary Material 10.1075/dia.29.4.05gre
Mapping the Origins and Expansion of the Indo-European Language Family.
Bouckaert R, Lemey P, Dunn M, Greenhill SJ, Alekseyenko AV, Drummond AJ, Gray RD, Suchard MA, Atkinson QD. 2012. Mapping the Origins and Expansion of the Indo-European Language Family. Science, 337: 957-960.
There are two competing hypotheses for the origin of the Indo-European language family. The conventional view places the homeland in the Pontic steppes about 6000 years ago. An alternative hypothesis claims that the languages spread from Anatolia with the expansion of farming 8000 to 9500 years ago. We used Bayesian phylogeographic approaches, together with basic vocabulary data from 103 ancient and contemporary Indo-European languages, to explicitly model the expansion of the family and test these hypotheses. We found decisive support for an Anatolian origin over a steppe origin. Both the …
Abstract Get Paper 10.1126/science.1219669
2011
Universal typological dependencies should be detectable in the history of language families.
Levinson SC, Greenhill SJ, Gray RD & Dunn M. 2011. Universal typological dependencies should be detectable in the history of language families. Linguistic Typology, 15: 509-534.
We claim that making sense of the typological diversity of languages demands a historical/evolutionary approach. We are pleased that the target paper (Dunn et al. 2011a) has served to bring discussion of this claim into prominence, and are grateful that leading typologists have taken the time to respond (commentaries denoted by boldface). It is unfortunate though that a number of the commentaries in this special issue show significant misunderstandings of our paper. In the following section we try to explain the basic underlying reasoning, turning in the remaining sections to some of these …
Abstract PDF 10.1515/LITY.2011.034
Levenshtein distances fail to identify language relationships accurately.
Greenhill SJ. 2011. Levenshtein distances fail to identify language relationships accurately. Computational Linguistics, 37(4): 689-698.
The Levenshtein distance is a simple distance metric derived from the number of edit operations needed to transform one string into another. This metric has received recent attention as a means of automatically classifying languages into genealogical subgroups. In this paper I test the performance of the Levenshtein distance for classifying languages by subsampling three language subsets from a large database of Austronesian languages. Comparing the classification proposed by the Levenshtein distance to that of the comparative method shows that the Levenshtein classification is correct only …
Abstract PDF 10.1162/COLI_a_00073
POLLEX-Online: The Polynesian Lexicon Project Online.
Greenhill SJ & Clark R. 2011. POLLEX-Online: The Polynesian Lexicon Project Online. Oceanic Linguistics, 50(2), 551-559.
The Polynesian lexicon project, POLLEX, was initiated in 1965 by Bruce Biggs in order to provide a large-scale comparative dictionary of Polynesian languages. Since then, POLLEX has grown to include over 55,000 reflexes of more than 4,700 reconstructed forms in 68 languages. These data have enabled many fundamental advances in Polynesian linguistics and prehistory. At almost half a century old, POLLEX is one of the longest-standing databases of linguistic information, and has moved through various incarnations, from type-writer and edge-punched cards, through microfiche to mainframe computer. …
Abstract PDF 10.1353/ol.2011.0014 Website
Evolved structure of language shows lineage-specific trends in word-order universals.
Dunn M, Greenhill SJ, Levinson SC & Gray RD. 2011. Evolved structure of language shows lineage-specific trends in word-order universals. Nature. 473, 79–82.
Languages vary widely but not without limit. The central goal of linguistics is to describe the diversity of human languages and explain the constraints on that diversity. Generative linguists following Chomsky have claimed that linguistic diversity must be constrained by innate parameters that are set as a child learns a language. In contrast, other linguists following Greenberg have claimed that there are statistical tendencies for co-occurrence of traits reflecting universal systems biases, rather than absolute constraints or parametric variation. Here we use computational phylogenetic …
Abstract Get Paper 10.1038/nature09923 Overview
Language evolution and human history: what a difference a date makes.
Gray RD, Atkinson QD & Greenhill SJ. 2011. Language evolution and human history: what a difference a date makes. Philosophical Transactions of the Royal Society, B, 366, 1090-1100.
Historical inference is at its most powerful when independent lines of evidence can be integrated into a coherent account. Dating linguistic and cultural lineages can potentially play a vital role in the integration of evidence from linguistics, anthropology, archaeology and genetics. Unfortunately, although the comparative method in historical linguistics can provide a relative chronology, it cannot provide absolute date estimates and an alternative approach, called glottochronology, is fundamentally flawed. In this paper we outline how computational phylogenetic methods can reliably estimate …
Abstract PDF 10.1098/rstb.2010.0378
2010
Rise and fall of political complexity in island South-East Asia and the Pacific.
Currie TE, Greenhill SJ, Gray RD, Hasegawa T & Mace R. 2010. Rise and fall of political complexity in island South-East Asia and the Pacific. Nature, 467:801-804.
There is disagreement about whether human political evolution has proceeded through a sequence of incremental increases in complexity, or whether larger, non-sequential increases have occurred. The extent to which societies have decreased in complexity is also unclear. These debates have continued largely in the absence of rigorous, quantitative tests. We evaluated six competing models of political evolution in Austronesian-speaking societies using phylogenetic methods. Here we show that in the best-fitting model political complexity rises and falls in a sequence of small steps. This is …
Abstract Get Paper 10.1038/nature09461
On the shape and fabric of human history.
Gray RD, Bryant D & Greenhill SJ. 2010. On the shape and fabric of human history. Philosophical Transactions of the Royal Society, B, 365:3923-3933.
In this paper we outline two debates about the nature of human cultural history. The first focuses on the extent to which human history is treelike (its shape), and the second on the unity of that history (its fabric). Proponents of cultural phylogenetics are often accused of assuming that human history has been both highly tree-like and consisting of tightly linked lineages. Critics have pointed out obvious exceptions to these assumptions. Instead of a priori dichotomous disputes about the validity of cultural phylogenies, we suggest that the debate is better conceptualized as …
Abstract PDF 10.1098/rstb.2010.0162
Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits.
Currie TE, Greenhill SJ & Mace R. 2010. Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits. Philosophical Transactions of the Royal Society, B, 365:3903-3912.
Phylogenetic comparative methods (PCMs) provide a potentially powerful toolkit for testing hypotheses about cultural evolution. Here we build on previous simulation work by Nunn et al. (2006) to assess the effect horizontal transmission between cultures has on the ability of both phylogenetic and non-phylogenetic methods to make inferences about trait evolution. We found that the mode of horizontal transmission of traits has important consequences for both methods. Where traits were horizontally transmitted separately PCMs accurately reported when trait evolution was not correlated even at the …
Abstract PDF 10.1098/rstb.2010.0014
The shape and tempo of language evolution.
Greenhill SJ, Atkinson QD, Meade A & Gray RD. 2010. The shape and tempo of language evolution. Proceedings of the Royal Society, B, 277:2443-2450.
There are approximately 7000 languages spoken in the world today. This diversity reflects the legacy of thousands of years of cultural evolution. How far back we can trace this history depends largely on the rate at which the different components of language evolve. Rates of lexical evolution are widely thought to impose an upper limit of 6-10 thousand years on reliably identifying language relationships. In contrast, it has been argued that certain structural elements of language are much more stable. Just as biologists use highly conserved genes to uncover the deepest branches in the tree of …
Abstract PDF 10.1098/rspb.2010.0051
How Accurate and Robust Are the Phylogenetic Estimates of Austronesian Language Relationships?
Greenhill SJ, Drummond AJ & Gray RD. 2010. How Accurate and Robust Are the Phylogenetic Estimates of Austronesian Language Relationships? PLoS ONE, 5(3): e9573.
We recently used computational phylogenetic methods on lexical data to test between two scenarios for the peopling of the Pacific. Our analyses of lexical data supported a pulse-pause scenario of Pacific settlement in which the Austronesian speakers originated in Taiwan around 5,200 years ago and rapidly spread through the Pacific in a series of expansion pulses and settlement pauses. We claimed that there was high congruence between traditional language subgroups and those observed in the language phylogenies, and that the estimated age of the Austronesian expansion at 5,200 years ago was …
Abstract PDF 10.1371/journal.pone.0009573
2009
Darwin, language, and two great Pacific voyages.
Greenhill SJ, & Gray RD. 2009. Darwin, language, and two great Pacific voyages. New Zealand Science Review, 66: 97-101.
On the 21st of December 1835 Charles Darwin arrived in New Zealand on the HMS Beagle. The Beagle had just visited the Galapagos Islands, where Darwin had made some of the critical observations that he would later incorporate into his theory of evolution. Darwin did not like New Zealand: "I believe we were all glad to leave New Zealand. It is not a pleasant place. Amongst the natives there is absent that charming simplicity which is found in Tahiti; and the greater part of the English are the very refuse of society. Neither is the country itself attractive." (Darwin 1860, p. 430). Around 1000 …
Abstract PDF
Austronesian language phylogenies: Myths and misconceptions about Bayesian computational methods.
Greenhill SJ & Gray RD. 2009. Austronesian language phylogenies: Myths and misconceptions about Bayesian computational methods. In Austronesian historical linguistics and culture history: a festschrift for Robert Blust (Pp 375-397). A. Adelaar & A. Pawley (Eds). Canberra: Pacific Linguistics.
Historical linguistics has never been particularly intimate with computers. The first wave of computational historical linguistics—lexicostatistics—was developed in the 1950s and quickly applied to language groups around the world from Indo-European to Austronesian. However, critics were quick to point out the problems caused by assuming a single constant rate of lexical replacement and repeatedly noted the erroneous results that this produced. As a consequence of these critiques lexicostatistics has been widely rejected by mainstream historical linguists. The last few years have seen a second …
Abstract PDF
Does horizontal transmission invalidate cultural phylogenies?
Greenhill SJ, Currie TE & Gray RD. 2009. Does horizontal transmission invalidate cultural phylogenies? Proceedings of the Royal Society B. 276: 2299-2306.
Phylogenetic methods have recently been applied to studies of cultural evolution. However, it has been claimed that the large amount of horizontal transmission that sometimes occurs between cultural groups invalidates the use of these methods. Here, we use a natural model of linguistic evolution to simulate borrowing between languages. The results show that tree topologies constructed with Bayesian phylogenetic methods are robust to realistic levels of borrowing. Inferences about divergence dates are slightly less robust and show a tendency to underestimate dates. Our results demonstrate that …
Abstract PDF 10.1098/rspb.2008.1944
Matrilocal residence is ancestral in Austronesian societies.
Jordan FM, Gray RD, Greenhill SJ & Mace R. 2009. Matrilocal residence is ancestral in Austronesian societies. Proceedings of the Royal Society B. 276:1957-1964.
The nature of social life in human prehistory is elusive, yet knowing how kinship systems evolve is critical for understanding population history and cultural diversity. Post-marital residence rules specify sex-specific dispersal and kin association, influencing the pattern of genetic markers across populations. Cultural phylogenetics allows us to practise "virtual archaeology" on these aspects of social life that leave no trace in the archaeological record. Here we show that early Austronesian societies practised matrilocal post-marital residence. Using a Markov-chain Monte Carlo comparative …
Abstract PDF 10.1098/rspb.2009.0088
Language Phylogenies Reveal Expansion Pulses and Pauses in Pacific Settlement.
Gray RD, Drummond AJ, & Greenhill SJ. 2009. Language Phylogenies Reveal Expansion Pulses and Pauses in Pacific Settlement. Science, 323: 479-483.
Debates about human prehistory often center on the role that population expansions play in shaping biological and cultural diversity. Hypotheses on the origin of the Austronesian settlers of the Pacific are divided between a recent “pulse-pause” expansion from Taiwan and an older “slow-boat” diffusion from Wallacea. We used lexical data and Bayesian phylogenetic methods to construct a phylogeny of 400 languages. In agreement with the pulse-pause scenario, the language trees place the Austronesian origin in Taiwan approximately 5230 years ago and reveal a series of …
Abstract Get Paper 10.1126/science.1166858 Website
2008
The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics.
Greenhill SJ, Blust R, & Gray RD. 2008. The Austronesian Basic Vocabulary Database: From Bioinformatics to Lexomics. Evolutionary Bioinformatics, 4:271-283.
Phylogenetic methods have revolutionised evolutionary biology and have recently been applied to studies of linguistic and cultural evolution. However, the basic comparative data on the languages of the world required for these analyses is often widely dispersed in hard to obtain sources. Here we outline how our Austronesian Basic Vocabulary Database (ABVD) helps remedy this situation by collating wordlists from over 500 languages into one web-accessible database. We describe the technology underlying the ABVD and discuss the benefits that an evolutionary bioinformatic approach can provide. …
Abstract PDF 10.4137/EBO.S893 Website
Languages evolve in punctuational bursts.
Atkinson QD, Meade A, Venditti C, Greenhill SJ & Pagel M. 2008. Languages evolve in punctuational bursts. Science, 319, 588.
Linguists speculate that human languages often evolve in rapid or punctuational bursts, sometimes associated with their emergence from other languages, but this phenomenon has never been demonstrated. We use vocabulary data from three of the world’s major language groups – Bantu, Indo-European and Austronesian – to show that 10-33% of the overall vocabulary differences among these languages arises from rapid bursts of change associated with language splitting events. Our findings identify a general tendency for increased rates of linguistic evolution in fledgling languages, perhaps arising …
Abstract Get Paper 10.1126/science.1149683
2007
The Pleasures and Perils of Darwinizing Culture (with phylogenies).
Gray RD, Greenhill SJ & Ross RM. 2007. The Pleasures and Perils of Darwinizing Culture (with phylogenies). Biological Theory, 2(4): 360-375.
Current debates about “Darwinizing culture” have typically focused on the validity of memetics. In this paper we argue that meme-like inheritance is not a necessary requirement for descent with modification. We suggest that an alternative and more productive way of Darwinizing culture can be found in the application of phylogenetic methods. We review recent work on cultural phylogenetics and outline six fundamental questions that can be answered using the power and precision of quantitative phylogenetic methods. However, cultural evolution, like biological evolution, is often far from …
Abstract PDF 10.1162/biot.2007.2.4.360
2005
Testing Population Dispersal Hypotheses: Pacific Settlement, Phylogenetic Trees, and Austronesian Languages.
Greenhill SJ & Gray RD. 2005. Testing Population Dispersal Hypotheses: Pacific Settlement, Phylogenetic Trees, and Austronesian Languages. In: The Evolution of Cultural Diversity: Phylogenetic Approaches. Editors: R Mace, C Holden, & S Shennan. Publisher: UCL Press.
Dispersals have been commonplace throughout the history of genus Homo (Templeton 2002). However, it is only recently that scenarios about human population expansions have begun to be studied again after a long period of marginalisation (Anthony 1990; Burmeister 2000 and associated commentaries). Some authors, such as Diamond and Bellwood (2003), have argued that dispersals, especially those linked to the development of agriculture, are the "most important process in Holocene human history" (p 597). Unfortunately, many expansion scenarios are little more than plausible narratives. A common …
Abstract PDF