Dr. Simon J. Greenhill

I research why and how people created all the amazing languages around us, and what they tell us about human prehistory.
I use (mainly) Bayesian phylogenetic methods to tackle these questions and have investigated everything from how the Austronesian peoples settled the Pacific to modelling the co-evolution of linguistic structure. I have also built a number of large-scale databases to help answer these questions.
Currently I'm one of the editors of Language Dynamics and Change and on the editorial board of the Journal of Language Evolution.
I'm an Associate Professor in the School of Biological Sciences at the University of Auckland. Before that I was a senior scientist in the Department of Linguistic and Cultural Evolution at the Max Planck Institute for the Science of Human History in Jena, Germany, and at the ARC Centre of Excellence for the Dynamics of Language at the Australian National University.
Publications:
Zariquiey R, Vera J, Greenhill SJ, Valenzuela P, Gray RD, & List J-M. 2023. Untangling the evolution of body-part terminology in Pano: conservative versus innovative traits in body-part lexicalization. Interface Focus, 13(1). https://doi.org/10.1098/rsfs.2022.0053.
Abstract: Although language-family specific traits which do not find direct counterparts outside a given language family are usually ignored in quantitative phylogenetic studies, scholars have made ample use of them in qualitative investigations, revealing their potential for identifying language relationships. An example of such a family specific trait are body-part expressions in Pano languages, which are …
Shcherbakova O, Gast V, Blasi DE, Skirgård H, Gray RD, & Greenhill SJ. 2022. A quantitative global test of the complexity trade-off hypothesis: the case of nominal and verbal grammatical marking. Linguistics Vanguard. https://doi.org/10.1515/lingvan-2021-0011.
Abstract: Nouns and verbs are known to differ in the types of grammatical information they encode. What is less well known is the relationship between verbal and nominal coding within and across languages. The equi-complexity hypothesis holds that all languages are equally complex overall, which entails trade-offs between coding in different domains. From a diachronic point of view, this hypothesis implies …
Barbieri C, Blasi DE, Arango-Isaza E, Sotiropoulos AG, Hammarström H, Wichmann S, Greenhill SJ, Gray RD, Forkel R, Bickel B, & Shimizu KK. 2022. A global analysis of matches and mismatches between human genetic and linguistic histories. Proceedings of the National Academy of Sciences, 119(47). https://doi.org/10.1073/pnas.2122084119.
Abstract: Human history is written in both our genes and our languages. The extent to which our biological and linguistic histories are congruent has been the subject of considerable debate, with clear examples of both matches and mismatches. To disentangle the patterns of demographic and cultural transmission, we need a global systematic assessment of matches and mismatches. Here, we assemble a genomic …
Koile E, Greenhill SJ, Blasi DE, Bouckaert R, & Gray RD. 2022. Phylogeographic analysis of the Bantu language expansion supports a rainforest route. Proceedings of the National Academy of Sciences, 119(32) e2112853119. https://doi.org/10.1073/pnas.2112853119.
Abstract: The Bantu expansion transformed the linguistic, economic, and cultural composition of sub-Saharan Africa. However, the exact dates and routes taken by the ancestors of the speakers of the more than 500 current Bantu languages remain uncertain. Here, we use the recently developed “break-away” geographical diffusion model, specially designed for modeling migrations, with “augmented” geographic …
List J-M, Forkel R, Greenhill SJ, Rzymski C, Englisch J, & Gray RD. 2022. Lexibank, a public repository of standardized wordlists with computed phonological and lexical features. Scientific Data, 9(1): 316. https://doi.org/10.1038/s41597-022-01432-0.
Abstract: The past decades have seen substantial growth in digital data on the world’s languages. At the same time, the demand for cross-linguistic datasets has been increasing, as witnessed by numerous studies devoted to diverse questions on human prehistory, cultural evolution, and human cognition. Unfortunately, most published datasets lack standardization which makes their comparison difficult. Here, we …
Projects:
Glottobank
Glottobank is an international research consortium established to document and understand the world’s linguistic diversity. We have established five global databases documenting variation in language structure (Grambank), lexicon (Lexibank), paradigm systems (Parabank), numerals (Numeralbank), and phonetic changes (Phonobank).
D-PLACE: Database of Places, Language, Culture and Environment
From the foods we eat, to who we can marry, to the types of games we teach our children, the diversity of cultural practices in the world is astounding. Yet, our ability to visualize and understand this diversity is often limited by the ways it traditionally has been documented and shared: on a culture-by-culture basis, in locally-told stories or difficult-to-access books and articles. D-PLACE represents an attempt to bring together this dispersed corpus of information.
Trans-New Guinea Online
TransNewGuinea.org is a database of the Trans-New Guinea language family and friends. The Trans-New Guinea language family currently occupies most of the interior of New Guinea. This family is possibly the third largest in the world with 400 languages and is tentatively thought to have originated with root-crop agriculture around 10,000 years ago. However, vanishingly little is known about this family’s history.
POLLEX: Polynesian Lexicon Project Online
The Polynesian Lexicon Project Online is a large-scale comparative dictionary of Polynesian languages.
Austronesian Basic Vocabulary Database
The Austronesian Basic Vocabulary Database is the world’s largest cross-linguistic database of the Pacific. It contains ~300,000 lexical items from ~1,600 languages spoken throughout the Pacific region.