mixing acoustic phonetics, statistics and comparative philology to bring speech back from the past

University of Oxford, Phonetics Laboratory
University of Cambridge, Statistical Laboratory

What we are trying to do



In this project, we examine an old question in a revolutionary new way: what did words sound like in the past? Since the 19th century, historical linguists have studied in detail the forms of words in many languages at different points in history, together with the varieties and mechanisms of sound change. For the Indo-European language family in particular, they have used that knowledge to infer the forms of words from a time before writing. For example, from word-forms as diverse as Old English weorc, Old High German werc, Latin orgia, Greek ergon, and Armenian gorc, philologists infer a Proto-Indo-European stem u̯erg̑-, a formula that hints at a pronunciation something like werg. But what did it actually sound like?

The innovation of this project is that, rather than reconstructing written forms of ancient words, we are developing methods to triangulate backwards from contemporary audio recordings of simple words in modern Indo-European languages, in order to regenerate audible spoken forms from earlier points in the evolutionary tree. In 2014 we worked on the audio reconstruction of spoken Latin words for numbers, from audio recordings in French, Italian, Spanish and Portuguese. In 2015, with a grant from the Arts and Humanities Research Council, we shall extend this work to some Germanic languages (English, German dialects and Dutch, at least), together with Modern Greek, to try to advance the horizon of audio reconstruction into the deeper past of the Indo-European language family.

We have already developed most of the necessary technical methods for realising this extraordinary ambition. These early successes open up a wide range of new questions, which will be the focus of this project. How far back in time can extrapolation from contemporary recordings progress? How “wide” and diverse must a language family tree be in order to triangulate to sounds that are plausible, i.e. reasonably consistent with written forms from antiquity? Are any attested sound changes outside the limits of the acoustic transformations we can currently model, and if so, how should we address them? How do we deal with changes that are not acoustically continuous or gradual, such as analogical formations and loanwords?

We may also begin to be able to address questions of rate of change. For example, do sound changes proceed at a uniform, gradual rate? If not, how can we model varying rates of sound change in different branches of a language family, or in different periods?

John Coleman is supported by a Science in Culture Innovation Award from the AHRC.

John Aston is supported by a Fellowship from the EPSRC.