Comparing dialects and languages using statistical measures of rhythm.
Linguists, from Ken Pike and Arthur James onwards, have talked about the different rhythms of different languages. It's hard not to agree with them: one gets the strong feeling that (for example) Mandarin is not just English with all the words changed. This difference is normally described as the rhythm of a language, in analogy with musical rhythms. But the subjective descriptions have not established exactly what we are hearing when we hear a rhythmic difference.
Our challenge is to go beyond the subjective impression of rhythmic difference, and to make it objective. We built this project around the following questions:
- When people talk about speech rhythm, what objective properties are they talking about?
- Is there really a substantial difference between the way languages are performed that could lead to a perceived difference in rhythm?
- How best can one measure rhythm?
Our working assumption was that rhythmic differences between languages are large enough that speakers of one language (e.g. English) rarely produce the rhythms of another (e.g. French). Our approach was to collect paragraphs in five languages (English, French, Greek, Russian and Mandarin Chinese), apply various techniques for measuring rhythm, and see which technique(s) did the best job of separating our languages.
Most publications on speech rhythm have used techniques that, one way or another, depend on the durations of speech sounds. Typically, the published techniques look at the variance of vowel duration, the fraction of a paragraph that is made of consonants, contrasts between neighbouring vowel durations, or something similar. We reviewed the literature and found 15 techniques that we could implement as algorithms applicable across a range of languages. We tested every algorithm in three variants, as well as all combinations of two algorithms and even all combinations of three algorithms.
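As a rough illustration of the kinds of duration-based measures described above, the sketch below computes three of them from a labelled paragraph: the variance of vowel durations, the fraction of the paragraph made of consonants, and a normalised contrast between neighbouring vowel durations (the usual form of the normalised PVI). The input format, a list of `(kind, duration)` pairs with kind `"V"` or `"C"`, and all function names are assumptions made for this example; this is not the project's own code, and it does not reproduce the 15 techniques or their variants.

```python
from statistics import mean, pvariance


def vowel_duration_variance(intervals):
    """Population variance of the vowel durations in one paragraph."""
    vowels = [d for kind, d in intervals if kind == "V"]
    return pvariance(vowels)


def consonant_fraction(intervals):
    """Fraction of the paragraph's total duration made of consonants."""
    total = sum(d for _, d in intervals)
    consonants = sum(d for kind, d in intervals if kind == "C")
    return consonants / total


def npvi(durations):
    """Normalised pairwise variability index of successive durations.

    Each neighbouring pair contributes |a - b| divided by the pair's
    mean duration; the result is the mean contribution times 100.
    """
    pairs = list(zip(durations, durations[1:]))
    return 100.0 * mean(abs(a - b) / ((a + b) / 2.0) for a, b in pairs)


if __name__ == "__main__":
    # Hypothetical segment durations in seconds, for illustration only.
    paragraph = [("C", 0.08), ("V", 0.12), ("C", 0.10), ("V", 0.06), ("V", 0.15)]
    vowel_durations = [d for kind, d in paragraph if kind == "V"]
    print(vowel_duration_variance(paragraph))
    print(consonant_fraction(paragraph))
    print(npvi(vowel_durations))
```

All three functions depend only on segment durations, which is the property these published measures share.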
Somewhat surprisingly, we found that none of these variants and combinations would cleanly separate our languages. There were always (for instance) French speakers whose patterns of speech duration were typical of Greek. This leads to one or more of the following conclusions:
- Rhythm is not primarily expressed by patterns of duration.
- Languages don't actually have dramatically different rhythms. People may have emphasized the differences by focussing on the more interesting contrasts. Or, since changing language often implies a change in the person speaking, perhaps individual differences were misinterpreted as differences between languages.
- The person-to-person and paragraph-to-paragraph variation within a language may be much larger than expected. Linguists may have idealized the form of each language, neglecting the variation.
In the process, we have learned things about how to measure rhythm:
- Any rhythm experiment that intends to measure properties of a language needs to average away the substantial person-to-person and paragraph-to-paragraph variation that we have observed. We recommend 10 or more people per group and 10 or more paragraphs per person, leading to experiments that are substantially larger than most previous work.
- When people label speech, they do it differently for languages that they know well, compared to languages that are unfamiliar. That means human labellers may inadvertently bias the results in a multi-lingual experiment, leading to spurious differences between languages. We avoided this problem by building a strictly language-independent automatic segmentation system.
- We have early results that suggest that rhythm measures should include other acoustical properties, beyond duration measurements.
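The first recommendation above, averaging away paragraph-to-paragraph and person-to-person variation, can be sketched as a two-stage mean: first over each speaker's paragraphs, then over the speakers in each language. The record layout `(language, speaker, value)` and the function name are hypothetical, chosen for this example only; `value` stands for any single rhythm measure computed on one paragraph.

```python
from collections import defaultdict
from statistics import mean


def language_means(records):
    """Average a per-paragraph rhythm measure up to the language level.

    records: iterable of (language, speaker, value) tuples, one per
    paragraph. Paragraphs are averaged within each speaker first, so
    that a speaker with many paragraphs does not dominate the result.
    """
    per_speaker = defaultdict(list)
    for language, speaker, value in records:
        per_speaker[(language, speaker)].append(value)

    per_language = defaultdict(list)
    for (language, _speaker), values in per_speaker.items():
        per_language[language].append(mean(values))

    return {language: mean(values) for language, values in per_language.items()}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    records = [
        ("English", "s1", 52.0),
        ("English", "s1", 58.0),
        ("English", "s2", 61.0),
        ("French", "s3", 44.0),
    ]
    print(language_means(records))
```

With the recommended 10 or more speakers per group and 10 or more paragraphs per speaker, each language mean would rest on at least 100 paragraphs.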
Our results also include studies of rhythmic differences between British English dialects and the relationship between rhythm and phonology.
Download the Data: See the details and download the Oxford Aesop Corpus.
If you want to use PVI (or other rhythm measures) in your research: See our FAQ on rhythm measures.
Loukina, A., Kochanski, G., Rosner, B., Shih, C., Keane, E. 2011. Rhythm measures and dimensions of durational variation in speech. Journal of the Acoustical Society of America, vol. 129, issue 5, pp. 3258-3270
Kochanski, G., Loukina, A., Keane, E., Shih, C., Rosner, B. 2010. Long-range prosody prediction and rhythm. Speech Prosody 2010 100222:1-4.
Loukina, A., Kochanski, G., Shih, C., Keane, E. and Watson, I. 2009. Rhythm measures with language-independent segmentation. Proceedings of Interspeech 2009: Speech and Intelligence. Brighton, UK, 6-10 September. International Speech Communication Association. 1531-1534.
Conference presentations and posters:
Loukina, A., Kochanski, G., Keane, E., Shih, C., Rosner, B. 2011. Measuring linguistic rhythm. Poster presented at Oxford Sound Day, 5 March 2011.
Loukina, A., Kochanski, G. 2010. Patterns of durational variation in British dialects. Presentation at PAC colloquium 2010 on 13 September 2010, Montpellier, France.
Kochanski, G., Loukina, A., Keane, E., Shih, C., Rosner, B. 2010. Long-range prosody prediction and rhythm. Presentation at Speech Prosody 2010, 11-14 May, Chicago, IL.
Keane, E., Loukina, A., Kochanski, G., Shih, C., Rosner, B. 2010. How far can phonological properties explain rhythm measures? Presentation at BAAP colloquium 2010 on 30 March 2010, London, UK.
Kochanski, G., Loukina, A., Keane, E., Shih, C., Rosner, B. 2010. Predicting prosody in poetry and prose. Poster presented at BAAP colloquium 2010 on 30 March 2010, London, UK.
Loukina, A., Kochanski, G., Keane, E., Shih, C., Rosner, B. 2010. Do rhythm measures separate languages or speakers? Poster presented at BAAP colloquium 2010 on 30 March 2010, London, UK.