mixing acoustic phonetics, statistics and comparative philology to bring speech back from the past

Oxford University Phonetics Laboratory

University of Cambridge Statistical Laboratory


Audio demonstrations

Here I'll gather together in one place all the audio demos previously tweeted on the Twitter feed @sounds_ancient, posted on the blog, and used in presentations and lectures. Check out the blog (see link at left) for more demos and examples.

23 July 2020. The pronunciation of "hate" comes from Old English hete, from Proto-Germanic *hat(iz), from Post-PIE *k̂ad- (Grimm's Law) from *k̂eh2d- (vocalization of the laryngeals):

16-21 July 2020.
English "acre" derives from Old English "æcer", via the Great Vowel Shift and, in many UK dialects, "loss" of the final [r]. Here's my audio simulation (WAV, or MP3 if your device prefers it):

Old English "æcer" comes from Proto-Germanic *akraz (here sounding more like [akroz] or [akros]), from Post-Proto-Indo-European *aĝros (< PIE *h2eĝros). Audio simulations: (wav) (mp3)

28 April 2020. How could Proto-Indo-European *ĝ develop into e.g. [dʒ], [ʒ], and [z], as in *ĝneh3- "know" > Sanskrit जानाति jānāti, Lithuanian žinau, Russian знаю [znaju]? Assuming h3 was something like [ŏ], here's a simulation of [ɟnɛŏ] > [džnaʊ] > Lith. žinau:

Here's another example: (some sort of approximation to) Proto-Indo-European *ĝhwér- > Lithuanian žvėr-inė. Listen:,
(the latter recoded from

23 January 2020.
"Average pronunciations" (1) Means of a bunch of recordings of French numbers: (2) Medians of same set of recordings:

14 January 2020.
A *laryngeal reflex in Modern Persian: سیاه siyah "black", ultimately from Proto-Indo-European *k̂yeh1-, cognate with English "hue". The final [h] is not a random one-off, but seems common/normal in Iranian Persian, Tajik etc. Listen:

17 December 2019.
In May 2015 I posted a simulation of how English "ten" developed from Proto-Indo-European *dekmt (using Lithuanian [dešmt] as proxy). Now, a much better job: ten < Old English tien < Proto-Germanic *tehun < PIE *dekmt. Listen and enjoy:

21 November 2019. English "nine" [naɪn] comes from Middle/Old English [ni:n], which is from Proto-Germanic *nigon (perhaps [niɣɵn]), from Proto-Indo-European *h1newh1n̩/m (Mallory/Adams) or *(h1)néwn̩ (Ringe): I'm going with [neun] or [neum]. Listen:

It's not quite right, because [naɪn] should go through [neɪn] on the way to [ni:n], but this simulation goes via something like [noɪn]. But it's a start.

9 July 2019. Laryngeals in Lithuanian? (a thread) "raudonas", Lith. for "red", comes from Proto-Indo-European *h1reudh-; the initial h1 is hypothetical, based on evidence from several languages. BUT listen to this first syllable:



Here's another example of "raudona" from Proto-Indo-European *h1reudh-. Listen to this first syllable: Downloaded from and converted to .wav. Unlike my other demos, these have not been manipulated at all.

I've been looking at tokens of "raudona" today. In most of them, the initial /r/ begins with a little prothetic schwa and then one or two taps [əɾ(ɾ)]. In a minority (~20%?) it's something like [həɾ(ɾ)-], as in the two examples I just posted.

July 24th follow-up, following feedback from various Twitter commentators:
I got the LIEPA [Lithuanian corpus] data and have been working through words beginning with (orthographic) r. The occasional initial [h]-like frication seems to be pretty random and is not related to the presumed PIE laryngeals. But still, kind of interesting.

30 April 2019. VIDEOS! I've been making some videos to illustrate processes of sound change using spectrograms that morph from one into another. First, here's "te" changing into "se" (which happened in Balochi): See the energy rising at the left?

Here's Latin quinque changing into Modern Italian cinque. See the energy rising at the left as the initial plosive morphs into an affricate.

Here's the vowel [e] in the first syllable of Proto-Indo-European "*kwetwor" changing into [a] in Latin "quattuor". The lower frequency energy creeps upwards, because [a] has a higher first formant frequency than [e].

Here Proto-Indo-European *oin- develops into Anglo-Saxon "an" (pronounced "aan"). The left-to-right upward-sweeping resonance (the second formant of [oi]) collapses, as [a:] has more energy in lower frequencies than [i] does:

Tip: if any of these video clips don't play properly in your browser (e.g. if you just get a black screen), try saving them to your computer and then opening them with e.g. VLC player
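
For readers curious how a sequence of intermediate spectrograms might be generated, here is a minimal sketch. It assumes the two spectrograms are already the same size and uses a plain linear cross-fade between them; that is an assumption made for illustration, and the videos above may well have been produced differently. The function name `morph` is hypothetical.

```python
def morph(spec_a, spec_b, steps):
    """Return `steps` spectrograms moving linearly from spec_a to spec_b.

    spec_a, spec_b: lists of rows (lists of numbers) with identical shape.
    steps:          number of spectrograms to return, including both ends.
    """
    if steps < 2:
        raise ValueError("need at least two steps (start and end)")
    if len(spec_a) != len(spec_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(spec_a, spec_b)
    ):
        raise ValueError("spectrograms must have the same shape")
    frames = []
    for i in range(steps):
        w = i / (steps - 1)  # weight runs from 0.0 (all A) to 1.0 (all B)
        frames.append([
            [(1 - w) * a + w * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(spec_a, spec_b)
        ])
    return frames

# Toy example: a single row with two values, morphed in five steps.
sequence = morph([[0.0, 2.0]], [[4.0, 2.0]], 5)
```

A plain cross-fade like this fades one pattern out while the other fades in. It does not by itself slide a resonance up or down in frequency, so a smoother morph would also need the corresponding features of the two sounds to be lined up first.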

7 December 2018. English "six" comes ultimately from Proto-Indo-European *kswek̂s, via [seks].
Strictly speaking, it should pass through Proto-Germanic *sehs [sexs]; my version of that came out a bit growly so maybe that's one for another day ...

6 December 2018. This has been a while in the making: English "four" is from Proto-Indo-European "kʷetwóres", via Old English "feower", Proto-Germanic "fidwor" and Pre-Proto-Germanic "hwidwor". Listen: 
(For an easier life, I ignore the -es ending until the very end.)

10 October 2018. English "long" is related to Modern Persian دراز (deraz). "Long" comes from Anglo-Saxon "lang", which came from Proto-Indo-European *dlonghos something like this (I ignore the -os ending), listen: 

Proto-Indo-European *dlonghos developed into Middle Persian "derang", something like this: 

(Obviously we don't have recordings of Middle Persian; this "derang" is fiddled from a Low German speaker saying "Drang")

Middle Persian "derang" developed into Modern Persian "deraz", I'm guessing something like this, listen: 

The whole sequence from "lang" to "deraz" (with a few pitch changes along the way): 

24 July 2018. Anglo-Saxon "gōs" came from Proto-Indo-European *ghans (via something Germanic "gans"), like this: 

29 June 2018. Listen to Latin "duo" morphing into French "deux" 
The "duo" recording was disyllabic, much longer than the "deux" recording, so I compressed it in time to make them the same length, the best I could manage for now.
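
Here is a minimal sketch of one way to bring two recordings to the same length. It assumes each recording is represented as a track of per-frame values (e.g. an amplitude envelope or one spectral band) and stretches or squeezes the track by linear interpolation. The function name is hypothetical, and this is not necessarily the method used for the "duo" recording.

```python
def resample_track(track, target_len):
    """Linearly interpolate a list of numbers to exactly target_len points."""
    if len(track) < 2 or target_len < 2:
        raise ValueError("need at least two points in and out")
    # Map output index i onto a fractional position in the input track.
    scale = (len(track) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(track) - 1)
        frac = pos - lo
        out.append((1 - frac) * track[lo] + frac * track[hi])
    return out

# Toy example: compress a five-frame track down to three frames.
compressed = resample_track([0.0, 1.0, 2.0, 3.0, 4.0], 3)
```

Working on frame-level features rather than the raw waveform matters here: resampling the waveform itself and playing it back at the original rate would raise the pitch as well as shorten the sound.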

28 March 2018. In this thread, we show how the English and Urdu words for "goose" are related.

Modern English "goose" comes from Anglo-Saxon gōs, like this:
Thanks to Prof. Laura Ashe, our voice of Anglo-Saxon.

Anglo-Saxon "gōs" came from Proto-Indo-European *ghans (via something Germanic "gans"), like this:

And going southwards, PIE *ghans developed (eventually) into Urdu "hans", like this:
Thanks to Qurrat for the Urdu recording

11 December 2017. West meets East: English "fierce" is from Middle English fers, from Latin fer-us, from Proto-Indo-European *ĝhwēr-

Something like this: 
(MP3 version: )

(I left out the 2nd vowel in "ferus")

In Iranian, Proto-Indo-European *ĝhwēr- developed into žver, thence sher (like in the Jungle Book tiger, Shir Khan). Listen: 
(MP3 version: )

Now it gets really interesting: Iranian شیر šīr was borrowed into Chinese and continued to evolve (e.g. to modern Mandarin shīzi 狮子 'lion'). Listen: 
(MP3 version: )
In short, 狮子 is cognate with 'fierce'!

6 December 2017. Grimm's Law 1a: The initial [b] in English "(to) bear" comes from a voiced aspirate [bh] in Proto-Indo-European *bher(e/o)-, like this: 

MP3 version: 

Grimm's Law 1b: [b] in English "(to) bear" is from a voiced aspirate [bh] in Proto-Indo-European *bher(e/o)-. Sanskrit "bhar-" retains the initial [bh]: 
(based on recording of @suhasm saying "bharati")
MP3 version: 
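
The stop correspondences summarised by Grimm's Law can be written down as a small lookup table. The sketch below is a deliberate simplification: it uses plain ASCII spellings of the roots, applies the shift only to the initial consonant, and ignores palatals, labiovelars and the known exceptions (for example stops after [s], and Verner's Law). The function name is hypothetical.

```python
# Grimm's Law as a lookup table (simplified notation).
GRIMM = {
    "bh": "b", "dh": "d", "gh": "g",  # voiced aspirates > plain voiced stops
    "b": "p", "d": "t", "g": "k",     # voiced stops > voiceless stops
    "p": "f", "t": "θ", "k": "x",     # voiceless stops > voiceless fricatives
}

def grimm_initial(root):
    """Apply the shift to the first consonant of a simplified root."""
    # Try two-letter symbols first so that "bh" is not read as "b" + "h".
    for src in sorted(GRIMM, key=len, reverse=True):
        if root.startswith(src):
            return GRIMM[src] + root[len(src):]
    return root  # no initial stop: leave the root unchanged

shifted = [grimm_initial(r) for r in ["bher", "dekmt", "penkwe"]]
```

With these inputs the initial consonants come out as [b], [t] and [f], matching the first sounds of English "bear", "ten" and "five".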

7 April 2017. In Old and Middle English, "bite" was pronounced more like the modern word "beat". The vowel evolved, like this: 

MP3 version of Middle English "bite" changing to its modern pronunciation 
(The pitch change is irrelevant!)

1 April 2017. (No, not an April Fool's Day joke!)

For , a few demos of how English and Persian have a common ancestry in Indo-European.

English "belly" (also "bulge") comes from PIE *bhólĝhis. Irish "bolg" (same root) makes a pretty good proxy: 

MP3 simulation of how "belly" comes from PIE *bhólĝhis (Irish "bolg"):

Persian balish "pillow" is also from PIE *bhólĝh- 

Like Persian balish بالش, Slovenian "blazina" (both mean "pillow") is also from PIE *bhólĝh- 

A borough (Old English burh) is a fortified town, from PIE *bherĝhs, "high place": 

MP3 version: 

Also from PIE *bherĝhs, "high place": Armenian bardzr, Pers. borz, Balochi borza, Russ béreg, Ir. brí, Gk. púrgos ...

Here's Persian "borz" evolving from (sort of) PIE *bherĝhs 

28 October 2016. Today's experiment: Proto-Indo-European *bhreĝ, 'break'

Modern English 'break' comes from Proto-Indo-European *bhreĝ something like this:

23 October 2015. Previously I posted a demo of "five" from (Lithuanian) "penki". But PIE has *penkwe, not penki. So here it is, done better:

Starting to fill new table of Indo-European digit sounds at New tokens of *treies, *ksweks, quinque and Ancient Gk, and *penkwe, *septm (wrong stress, but hey), quattuor (hybrid of Ladin kwater and Welsh pedwar, maybe too prominent). Comments +/- welcomed.

11 August 2015. Clips relating to the paper we gave at the 18th International Congress of Phonetic Sciences, Glasgow (Coleman, Aston and Pigole 2015, "Reconstructing the sounds of words from the past") are available from the "papers" page (see sidebar at left).

26 May 2015. "Three" comes from Proto-Indo-European "*treyes". Not from Spanish "tres", but that's the nearest I've got. Listen: Here's the MP3 version:

25 May 2015. "One" comes from Proto-Indo-European *oinos, via Middle English "oon", Anglo-Saxon "an", Germanic "oin(s)". Listen:

"One" from "oin(s)", MP3 format:

Previously [12th May] we generated a continuum of sounds from "two" to "twa" and vice-versa. Now, we follow "two" all the way back to Proto-Indo-European *dwo(H). WAV: MP3:


18 May 2015. "Eight" came from Proto-Indo-European *Hokto, via changes something like this: (MP3 version )

15 May 2015. "Four" comes from Anglo-Saxon "feower":

15 May 2015. "Five" comes via fif, fimf, pemp from Proto-Indo-European *penkwe. Lithuanian penki is the nearest living word. Listen:

Or if you prefer going forwards in time from Anglo-Saxon "twa" to Modern English "two":

1 April 2015. For , we made a ": to show Anglo-Saxon pronunciations (or something close to them) that survived until quite recently in various German dialects.

John Coleman is supported by a Science in Culture Innovation Award from the AHRC.

John Aston is supported by a Fellowship from the EPSRC.