I’ve seen a number of R packages and scripts that bundle a cached copy of Glottolog for their users.
The problem is that a cached copy rapidly goes out of date as new Glottolog releases appear.
Here is a simple function to always get the latest version of the data by querying the Zenodo API endpoint:
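load_glottolog <- function() {
    # 1. query the Zenodo API endpoint for the record metadata
    o <- jsonlite::fromJSON('https://zenodo.org/api/records/14006636')
    # 2. get the download url of the first file attached to the record
    latest_url <- o$files[1, ]$links$self
    # 3. download the zip archive to a temporary file (-L follows redirects)
    glottolog_zipfile <- tempfile()
    download.file(latest_url, glottolog_zipfile, method = "curl", extra = '-L')
    # 4. find the path of values.csv inside the archive (dot escaped so it matches literally)
    valuefile <- grep(
        '(^|/)values\\.csv$',
        unzip(glottolog_zipfile, list = TRUE)$Name,
        ignore.case = TRUE, value = TRUE
    )
    # 5. read values.csv straight from the archive and return it
    readr::read_csv(unz(glottolog_zipfile, filename = valuefile))
}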
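Step 2 of the function depends on how jsonlite simplifies the API response: the `files` array becomes a data frame, and each file's `links` object becomes a nested data frame column. The sketch below shows that extraction offline. The JSON string is a made-up, heavily trimmed stand-in for a record response (the `key` value and the example.org URL are placeholders, not real Zenodo output), so treat it as an illustration of the data shape rather than the actual payload.

```r
library(jsonlite)

# Hypothetical stand-in for a record response; only the fields used here.
record_json <- '{
  "files": [
    {
      "key": "example-v1.0.zip",
      "links": {"self": "https://example.org/files/example-v1.0.zip/content"}
    }
  ]
}'

o <- fromJSON(record_json)

# The array of file objects is simplified to a data frame with one row per file;
# `links` is a nested data frame column holding the `self` url.
first_file <- o$files[1, ]
latest_url <- first_file$links$self
print(latest_url)
```

If a record ever carried more than one file, `o$files[1, ]` would simply take whichever comes first, so selecting by `key` might be a safer choice in that case.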
An alternative is to use my rcldf package:
devtools::install_github("SimonGreenhill/rcldf", dependencies = TRUE)
glottolog <- rcldf::load_glottolog()