The IViE Corpus

Speech data and intonation transcriptions from nine urban dialects of British English in five speaking styles

About the IViE project

In the IViE project, we investigate cross-varietal and stylistic variation in English intonation (IViE stands for 'Intonational Variation in English'). We are looking at so-called modern or mainstream dialects (Trudgill 1998) and we have recorded data from nine urban varieties of English spoken in the British Isles. Our speakers are male and female adolescents.

The map below shows the locations where we have made recordings: London, Cambridge, Cardiff, Liverpool, Bradford, Leeds, Newcastle, Belfast in Northern Ireland and Dublin in the Republic of Ireland.

Additionally, the map shows that three of our speaker groups are from ethnic minorities: we have recorded bilingual Punjabi/English speakers, bilingual Welsh/English speakers and speakers of Caribbean descent.


Aims and Objectives

(1) to set up the IViE corpus: to record and make available a corpus of prosodically labelled speech data from several varieties of English spoken in the British Isles in a range of speaking styles.

(2) to describe selected aspects of prosodic variation in the IViE corpus.

We took the following steps to achieve this aim:

The funding period for the IViE project ended in March 2002. Since then, the project has been rated 'outstanding' by ESRC evaluators.

Intonation is investigated from a number of angles. For instance, researchers look at intonational meaning, the role of intonation in discourse, intonation synthesis, focus structure, acoustic structure and phonology. The IViE project is about intonational structure: the phonology and acoustic-phonetic realisation of intonation. The database we have collected also allows for other types of investigation, and we plan to work on non-structural aspects of intonation in due course. In the present project, we aim to provide the phonological and phonetic basis for such investigations.

Our approach to cross-varietal intonational phonology

Phonological analyses of intonation are not directly comparable to phonological analyses of segmental structure. First of all, in intonation analysis, we do not have an equivalent of the minimal pair test. Consequently, it can be difficult to be sure whether we are dealing with two instances of the same intonation contour or with two different contours. The motivation behind a speaker's choice of a particular intonation contour is not easy to establish either. The choice depends on the text, the context, and the speaker's communicative intent. Finally, the acoustic-phonetic realisation of a particular intonation contour varies. Instances of the same phonological entity can look rather different in F0 on different texts.

A consequence of the difficulties inherent in intonation analysis is that analysts disagree on the number of distinctive contours we find in different languages or varieties. Disagreements can be resolved only if we have sufficient information about (a) the acoustic-phonetic realisation of intonation and (b) the contribution of intonation to different utterances. Work on both areas of intonation analysis is currently in progress within the field, and the IViE corpus is intended to allow for both types of investigation.

The absence of generally agreed criteria for phonological category membership in intonation prompted us to take a comparative approach to intonation analysis. The IViE data allow for three kinds of comparisons:

Urban Varieties of English in the IViE Corpus

We have recorded directly comparable samples of speech from nine urban varieties of English spoken in the British Isles (approximately 40 hours of speech). The recordings were made in:

- London (speakers of West Indian descent)
- Cambridge
- Cardiff (bilingual Welsh-English speakers)
- Leeds
- Bradford (bilingual Punjabi-English speakers)
- Liverpool
- Newcastle
- Belfast
- Dublin

The data were collected in urban secondary schools, and the speakers were 16 years old at the time of recording. Twelve speakers were chosen from each variety (six male, six female) and all speakers took part in the same battery of tasks. The tasks were designed to elicit comparable data in five speaking styles.
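
The recording design described above can be sketched as a simple enumeration. This is an illustrative sketch only: the style names and the function `design_cells` are hypothetical labels introduced here, and the counts describe the planned design (nine varieties, twelve speakers each, five styles), not the number of files actually present in the corpus.

```python
from itertools import product

# Variety names follow the list of recording locations on this page.
VARIETIES = [
    "London", "Cambridge", "Cardiff", "Leeds", "Bradford",
    "Liverpool", "Newcastle", "Belfast", "Dublin",
]
# Hypothetical short names for the five speaking styles.
STYLES = [
    "conversation", "map_task", "retold_story",
    "read_passage", "controlled_sentences",
]
SEXES = ["female", "male"]
SPEAKERS_PER_SEX = 6  # six male and six female speakers per variety


def design_cells():
    """Return one (variety, sex, speaker_index, style) tuple per planned cell."""
    speaker_ids = range(1, SPEAKERS_PER_SEX + 1)
    return list(product(VARIETIES, SEXES, speaker_ids, STYLES))


cells = design_cells()
speakers = {(variety, sex, index) for variety, sex, index, _ in cells}
print(len(speakers))  # 108 speakers in the planned design
print(len(cells))     # 540 speaker-by-style cells
```

Because every speaker takes part in every task, any two varieties can be compared within a single style, and any two styles within a single variety.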


Kimberley Farrar made recordings in Cambridge, Leeds, Newcastle and Dublin. Brechtje Post recorded the data from Bradford, London and Cardiff. Catherine Sangster recorded the data from Liverpool. The Belfast recordings were made by Orla Lowry. Farrar, Post and Sangster speak Southern Standard British English. Post is a native speaker of Dutch. Lowry is from Belfast. Prior to the recordings, the experimenters explained to the subjects how to carry out the non-interactive tasks. In the interactive tasks, subjects spoke to each other. The experimenter remained in the recording room during the recordings.

Speaking Styles and Examples

We recorded the following types of data:

(1) Conversations

Face-to-face, single-sex pairs, recordings made in a quiet room at the local school, speakers know each other. Some pairs are close friends, but not all. Topic of conversation: smoking. Complete conversations last between 2 and 5 minutes. The extracts below are shorter.

A selection of examples from:

(2) Goal directed interactions

An adaptation of the Map task (i.e. we made our own map): 'find your way around a small town'.

Click here to see Map 1 (instruction giver) and here to see Map 2 (instruction follower).

Examples from:

(3) Story telling from memory

The fairy tale Cinderella; examples from:

We have also recorded a set of directly comparable data:

(4) A read passage of speech

Cinderella; example of opening lines from:

(5) Phonetically controlled sentences

A variety of grammatical structures; example of statements from:

For more information on the stimuli, click here

Prosodic Annotation

We have designed a machine-readable prosodic labelling system for the IViE database. It is called IViE (pronounced like the woman's name Ivy), and the acronym stands for 'Intonational Variation in English'. Here is a snapshot of an IViE transcription.

The IViE Labelling Guide.

IViE is a two-tone system, and transcriptions are made using H and L symbols associated with stressed syllables and intonation phrase boundaries. The IViE system differs from other two-tone systems in that it is intended specifically for the transcription of intonational variation. Prosodic transcriptions are made on three separate tiers: one for phonological variation, one for phonetic variation and one for variation in the location of stressed syllables. The tonal system is based on work by Gussenhoven (1984), Grabe (1998a) and the ToBI system for prosodic labelling. The symbols on the IViE tone tier are similar to those given in the ToDI system.

A complete IViE transcription provides the user with information about

(1) the location of word boundaries
(2) the location of rhythmic prominences
(3) the acoustic-phonetic targets observed in the fundamental frequency trace (NB - tone targets are transcribed auditorily if F0 tracking is not possible or if there are tracking errors)
(4) an autosegmental-metrical phonological analysis. A pool of tone labels allows for the transcription of several different varieties of one language in a single transcription system.
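
As a rough illustration of how these four kinds of information might be held together, here is a minimal sketch of time-aligned label tiers. It is not the IViE file format: the class names, the tier names (`words`, `prominence`, `phonetic_targets`, `phonological`), the time stamps and the label strings are all hypothetical placeholders, and the actual label inventory is defined in the IViE Labelling Guide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Label:
    time: float  # time stamp in seconds (assumed unit)
    text: str    # label string, e.g. a word or a tone symbol


@dataclass
class Transcription:
    # Maps a (hypothetical) tier name to its labels, kept sorted by time.
    tiers: Dict[str, List[Label]] = field(default_factory=dict)

    def add(self, tier: str, time: float, text: str) -> None:
        """Add a label to a tier and keep that tier in time order."""
        labels = self.tiers.setdefault(tier, [])
        labels.append(Label(time, text))
        labels.sort(key=lambda label: label.time)

    def between(self, tier: str, start: float, end: float) -> List[str]:
        """Return the label strings on a tier within [start, end]."""
        return [
            label.text
            for label in self.tiers.get(tier, [])
            if start <= label.time <= end
        ]


# Invented example data, for illustration only.
t = Transcription()
for time, word in [(0.60, "manual"), (0.10, "where"), (0.45, "the"), (0.32, "is")]:
    t.add("words", time, word)
t.add("prominence", 0.15, "P")
t.add("prominence", 0.65, "P")
t.add("phonetic_targets", 0.16, "H")
t.add("phonetic_targets", 0.70, "L")
t.add("phonological", 0.15, "H")
t.add("phonological", 0.65, "L")

print(t.between("phonetic_targets", 0.0, 0.5))  # ['H']
```

Keeping each kind of information on its own tier means that a phonetic difference between two varieties can be recorded without committing to a different phonological analysis, and vice versa.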

The IViE transcription system can be used in conjunction with xwaves, PitchWorks, PRAAT or wavesurfer. We work with xwaves, and we use customised labelling scripts which allow us to listen to the speech and to look at the F0 trace and, if needed, a spectrogram. Transcriptions are entered into a set of labelling templates (essentially a set of text files, displayed time-aligned with the speech signal).

Output in the form of publications

Summary paper: Grabe, E. (2004). Intonational variation in urban dialects of English spoken in the British Isles. In Gilles, P. and Peters, J. (eds.) Regional Variation in Intonation. Linguistische Arbeiten, Tuebingen, Niemeyer, pp. 9-31.

Other papers:

Grabe, E. and Post, B. (2002). Intonational Variation in English. In B. Bel and I. Marlien (eds), Proceedings of the Speech Prosody 2002 Conference, 11-13 April 2002, Aix-en-Provence: Laboratoire Parole et Langage, 343-346. ISBN 2-9518233-0-4.

Grabe, E. (2002). Variation adds to prosodic typology. In B. Bel and I. Marlien (eds), Proceedings of the Speech Prosody 2002 Conference, 11-13 April 2002, Aix-en-Provence: Laboratoire Parole et Langage, 127-132. ISBN 2-9518233-0-4.

A PowerPoint presentation on IViE given at the European Science Foundation TIE/INTAS Workshop on Tone and Intonation in Europe, Vitoria-Gasteiz, Spain, June 2001.

Fletcher, J., Grabe, E., and Warren, P. (to appear). Intonational variation in four dialects of English: the high rising tune. In Sun-Ah Jun (ed) Prosodic typology and transcription - a unified approach. Oxford, OUP.

Grabe, E., Post, B. and Nolan, F. (2001). Modelling intonational variation in English. The IViE system. In Puppel, S. and Demenko, G. (eds). Proceedings of Prosody 2000. Adam Mickiewicz University, Poznan, Poland.

Grabe, E., Post, B., Nolan, F., and Farrar, K. (2000). Pitch accent realisation in four varieties of British English. Journal of Phonetics 28.

Nolan, F. and Farrar, K. (1999). Timing of f0 Peaks and Peak Lag. Proceedings of the International Congress of Phonetic Sciences, 961-967.

Evans, B. and Grabe, E. (1999). Connected Speech Processes in Intonation. Proceedings of the International Congress of Phonetic Sciences, 1201-1204.

Grabe, E., Nolan, F., and Farrar, K. (1998). IViE - A comparative transcription system for intonational variation in English. Proceedings of ICSLP 98, Sydney, Australia.

Nolan, F., and Grabe, E. (1997). Can ToBI transcribe intonational variation in English? In Proceedings of the ESCA Workshop on Intonation: Theory, Models and Applications, Athens, Greece.

1. IViE CD Sets

We made available two types of CD-ROM:

In 2001, we released the complete set of speech data from the IViE corpus on sets of five CD-ROMs. These data are in .wav format.

In 2000, we made available a beta-version of the prosodically annotated IViE-CD. The final version of this data set is available here, on-line.

The data can be viewed with xwaves (NB: xwaves can no longer be purchased but is still widely available in speech laboratories), PitchWorks, PRAAT or wavesurfer.

2. On-line Versions of the Corpus

The complete IViE corpus is now available on-line. There are two pages for the complete set of speech data (nine dialects, five speaking styles):

On a further page, we have made available prosodically and orthographically transcribed IViE data.

