English intonation in the British Isles

Beta-Version of the Annotated IViE Corpus on CD-ROM


ESRC Award Number R000237145

This document contains copies of the text files which accompany the beta-version of the IViE-corpus (released in 2000).
Current users of the beta-version: please note that some of the files have been updated.






Read me first

About this corpus

How to see the data

Overview of files on the CD

The Stimuli (texts)

Keys to file name coding and speaker initials






Read Me First

The IViE beta-CD contains prosodically labelled data from the IViE corpus (Economic and Social Research Council grant R000237145), and the data can be viewed in xwaves(TM) under UNIX.


If you would like to see and hear data from the IViE Corpus, please proceed as follows:

1. Start by reading the file 'About_this_corpus' (file included on the CD - the web-version of the file is given below). In this file, you will find information about the IViE corpus, the transcription system and the data in this package.

2. Read 'How_to_see_the_data' next. This file tells you how to view prosodically labelled data from five urban varieties of British English, how to transcribe unlabelled IViE data using the IViE system, and how to use the IViE labeller to transcribe your own data.

3. In 'Overview_files', you will find a key to the directory structure and the filenames in this package.

4. In 'The_Stimuli', you will find orthographic transcriptions of the data (but note that you will also see orthographic transcriptions when you view the data).

5. In 'IViE_labels', you will find copies of the IViE labeller and the IViE menus. Information about how to use the labeller to transcribe your own data is given in the file 'How_to_see_the_data'.






About this Corpus

The IViE corpus was set up for the investigation of cross-varietal and stylistic variation in British English intonation. The beta-version of the corpus contains machine-readable, prosodically labelled speech data from five urban varieties:

- Belfast English
- British Punjabi English spoken in Bradford
- Cambridge English
- Leeds English
- Newcastle English

The data were collected in urban secondary schools, and the speakers are 16 years old. We have recorded six male and six female speakers from each variety.

The complete IViE Corpus contains data in five speaking styles, and the beta-version contains speech files from all speaking styles, and prosodically labelled files from

(1) The controlled sentences
Data from Cambridge, Leeds, Bradford Punjabi, Newcastle and Belfast
5 syntactic structures, 30 speakers, 660 sentences

(2) The read passage
Data from Cambridge, Bradford, Belfast
One section each of the Cinderella fairy tale

Additionally, there are unlabelled samples from (3) the retold passage, (4) the map task and (5) the free conversation, produced by Cambridge, Bradford Punjabi, and Belfast English speakers.



The data were labelled with the IViE system for prosodic labelling. For more information about the IViE labelling system, see the following paper:

Grabe, E., Post, B., and Nolan, F. (to appear). Modelling intonational Variation in English. The IViE system. Proceedings of Prosody 2000, Krakow, Poland, October 2000.


If you have comments on the beta-version of the IViE corpus, please write to Esther Grabe or Brechtje Post.

If you would like to view labelled data from the IViE corpus, please proceed to the next section.






How to see the Data


The IViE beta-CD allows you to

I. View prosodically labelled data from five urban varieties of British English
II. Transcribe unlabelled IViE data using the IViE system
III. Use the IViE labeller to transcribe your own data


The data can be viewed in xwaves under UNIX.

Information about the IViE system for prosodic labelling can be found in the IViE labelling guide.



How to make the Labeller Operational (UNIX, XWAVES)


Step 1 Create a directory IViE_Beta_Corpus

Step 2 Place the contents of this package into the directory IViE_Beta_Corpus

Step 3 Go to the /sentences/ directory

- type: cd sentences


Step 4 Open the file 'label' in a text editor (emacs, jot etc.)

- type: jot label


Step 5 Find the following line:

TMP=/CDROM/IViE_labels/label$$

Edit this line.


Put in the directory path from YOUR machine which leads to the /sentences directory

e.g.

/home/myfiles/IViE_Beta_Corpus/sentences/label$$


To find out what the path is,
- type: pwd

at the command line while you're in the /sentences/ directory


Save your changes and quit the text editor.

Do the same in the file 'mlabel' which gives you an f0 range for male speakers.



Make the labellers executable by typing

chmod +x label

chmod +x mlabel


You're ready to use the labeller.


Do the same in the directories /Cinderella_passage/ and /spontaneous/.
Here, the directory path in label and mlabel has to lead to the directories /Cinderella_passage/ and /spontaneous/.
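
The edits to 'label' and 'mlabel' described above can also be scripted. The following is a minimal sketch and not part of the package: the function name set_label_tmp is hypothetical, and it assumes that each script contains a TMP= line like the one quoted in Step 5 and that the standard sed utility is available. Only the directory part of the line is replaced, so whatever follows the last slash (e.g. label$$) is left untouched.

```shell
# Hypothetical helper (not part of the IViE package).
# Usage: set_label_tmp SCRIPT DIRECTORY
set_label_tmp() {
    script=$1
    dir=$2
    # Replace everything between "TMP=" and the last "/" with the new directory.
    # Assumes DIRECTORY contains no "|", "&" or backslash characters.
    sed "s|^TMP=.*/|TMP=$dir/|" "$script" > "$script.new" &&
        mv "$script.new" "$script" &&
        chmod +x "$script"    # same effect as 'chmod +x label'
}
```

For example, after 'cd sentences', typing set_label_tmp label "$(pwd)" and then set_label_tmp mlabel "$(pwd)" would carry out Steps 4 and 5 and the chmod step for that directory.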





Viewing Transcribed Data


This package contains two directories with prosodically labelled data:

/sentences

/Cinderella_passage



The directory /spontaneous contains unlabelled spontaneous speech data for comparison.


(1) The Sentences

The sentences directory contains controlled sentences produced by six speakers (three male and three female) from each of five different locations in the British Isles. There are five different syntactic structures: simple statements, questions without morphosyntactic markers, inversion questions, WH-questions and one type of coordination structure. All sentences are labelled on two orthographic and three prosodic tiers. Data from each of the five varieties can be found in its own subdirectory:

/sentences/Cambridge
/sentences/Leeds
/sentences/Bradford
/sentences/Newcastle
/sentences/Belfast


Each of these directories contains five further subdirectories in which you find the five different syntactic structures, e.g.:

Directory name                Directory contains

/Cambridge/statements         8 different fully voiced statements
/Cambridge/Q_no_morph         3 different fully voiced questions without morphosyntactic markers
/Cambridge/WH_questions       3 different fully voiced WH-questions
/Cambridge/inversions         3 different fully voiced inversion questions
/Cambridge/coordinations      5 different fully voiced coordination structures, conjunction: 'or'


There are 132 prosodically labelled sentences from each variety.
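
The figure of 132 follows from the counts given above, on the assumption that each of the six speakers per variety read every sentence once: 8 + 3 + 3 + 3 + 5 = 22 different sentences, times 6 speakers. A quick check at the shell prompt:

```shell
# 22 different sentences per variety, each read by 6 speakers (assumed: once each)
per_variety=$(( (8 + 3 + 3 + 3 + 5) * 6 ))
# five varieties in the beta-version
total=$(( per_variety * 5 ))
echo "$per_variety sentences per variety, $total in total"
```

The total agrees with the 660 sentences mentioned in 'About this Corpus'.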



How to View the Sentence Data

To view the sentence data, go to the /sentences directory.

To see an example of statement 1 produced by a male Belfast speaker
type mlabel Belfast/statements/s1bgm

('mlabel' starts up the labeller for a male speaker)

To see an example of statement 1 produced by a female Belfast speaker
type label Belfast/statements/s1bcc

('label' starts up the labeller for a female speaker)

You will see the pressure wave displayed at the top, the F0 trace at the bottom, and the time-aligned labelling template in the middle.

The letters in the filename 's1bgm' mean: 's' = statement, '1' = sentence 1, 'b' = Belfast, 'gm' = initials of male Belfast speaker GM. In 's1bcc', 'cc' = initials of female Belfast speaker CC.

A list of the files, and keys to filenames, speaker initials and speaker gender are given in the file 'Overview_files' in this package.



(2) The Cinderella Passage

The second directory contains data from the Cinderella passage. There are three sections of the passage in this directory, one from Cambridge English, one from British Punjabi English spoken in Bradford, and one from Belfast.

To see these data:

Go to the Cinderella_passage directory

type mlabel p1cma to see the Cambridge passage
(p1cma = passage section 1, Cambridge, male speaker MA)

type label p1pfm to see the British Punjabi English passage from Bradford
(p1pfm = passage section 1, Punjabi, female speaker FM)

type mlabel p1bgm to see the Belfast passage
(p1bgm = passage section 1, Belfast, male speaker GM)





Listening to the Spontaneous Speech Data

If you would like to hear some spontaneous speech from the IViE corpus, or if you would like to label some IViE data using the IViE system, go to the directory

/spontaneous

Here you find spontaneous speech data from Cambridge, Bradford Punjabi and Belfast English. The speakers are the same as in the controlled sentences and the passage task.


To hear a retold version of a section from the Cinderella passage produced by a Cambridge speaker

- type mlabel rcma
(retold speech, Cambridge, male speaker MA)

or sgplay rcma.d

The labelling templates will be empty. In the following sections, you can find out how to see the menus and how to insert IViE labels.

For more spontaneous speech data, see the section 'More about the Spontaneous Speech Data' below.




IViE LABELLING


LABELLING ON THE ORTHOGRAPHIC TIER

If you put the cursor in the lowest tier (the orthographic tier), and hold down the right mouse-button, you will see a menu that allows you to insert, delete, replace or move around words. Type the words spoken by the speaker, one by one, and align them with the end of the words in the speech wave using the MOVE button (right-mouse-menu). After choosing MOVE, click the middle mouse button. The word which is closest to your cursor will jump to the location of the cursor.

If the words have been inserted correctly, you can hear each word by clicking on the word with the left mouse-button.


THE RHYTHMIC TIER

The second lowest tier (the rhythmic tier) is intended for the transcription of rhythmic prominences. The right-mouse-menu offers three symbols: 'P' for prominence, '%' for rhythmic boundary, and the hash sign for a hesitation, or a speech error. Insert P in the middle of a prominent syllable, % at the end of a word that is followed by a rhythmic boundary, and hash at the location of the hesitation or error.

For more information on this tier and all following tiers, see the IViE labelling guide.


THE TONE TARGET TIER

The middle tier is intended for the transcription of pitch movement surrounding prominent syllables. Its menu contains a selection of pitch movement labels. The labels given in the menu are no more than a SUBSET of possible pitch movement labels. Generally, transcribers make up their own labels and type them into the tier using the 'insert' command. The % is used to indicate the end of a pitch movement implementation domain (ID) which coincides with an IP boundary. As IP boundaries are transcribed on the phonological tier, some transcribers insert the % on the pitch movement tier only after they have transcribed the intonational structure of the utterance on the phonological tier.


THE PHONOLOGICAL TIER

On the second highest tier, the intonational structure is labelled. The labels given in the menu represent a pool of labels rather than a closed phonological system. Transcribers can select different subsets of labels for different varieties of English. The labels given allow us to account for the five varieties of English in this package.


THE COMMENT TIER

The highest tier is intended for alternative transcriptions and comments. Some options are given in the right-mouse-menu. Otherwise, you can type in your own comments.




More about the Spontaneous Speech Data

To hear a retold version of a section from the Cinderella passage produced by a Bradford Punjabi English speaker

type label rpfm
(retold speech, Punjabi, female speaker FM)

To hear a retold version of a section from the Cinderella passage produced by a Belfast speaker

type mlabel rbgm
(retold speech, Belfast, male speaker GM)


To hear sections from the map task

type mlabel macma
(map task, section a, Cambridge, male speaker MA)

type mlabel mbcma
(map task, section b, Cambridge, male speaker MA)

type label mpfm
(map task, Punjabi, female speaker FM)

type mlabel mbgm
(map task, Belfast, male speaker GM)

To hear a section of free conversation

type mlabel fcma
(free conversation, Cambridge, male speaker MA)

type label fpfm
(free conversation, section 1, Punjabi, female speaker FM)

type mlabel fbgm
(free conversation, Belfast, male speaker GM)




How to Use the IViE Labeller to Transcribe Your Own Speech Data


Please note: we have updated the tone labels in the IViE labelling system; please refer to the
IViE labelling guide.


If you would like to use the IViE labeller to transcribe your own data, you need to make a separate directory on your machine in which you put:

(1) the speech files which are to be labelled

(2) f0 files made from the speech files in xwaves (you can type 'get_f0 infile outfile' to make an f0 file from your speech file.) Speechfile and f0 files have to have the same name, but different extensions. Give the speechfile the extension '.d', and the f0 file the extension '.f0' (i.e. filename.d, and filename.f0).

(3) copies of the following files in this package:

label

mlabel

wordmenu

rhythmmenu

pitchmenu

tonemenu

commentmenu


You find these files in the directory 'IViE_labels'
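
If you have many speech files, the f0 files required in (2) can be made in one go. The following is only a sketch: the function name make_f0_files is hypothetical, and it assumes that get_f0 is on your path and accepts the 'get_f0 infile outfile' form quoted above.

```shell
# Hypothetical helper: create a matching .f0 file for every .d speech file
# in the current directory, using the get_f0 call quoted in (2) above.
make_f0_files() {
    for speech in *.d; do
        [ -e "$speech" ] || continue      # no .d files: the pattern stays unexpanded
        f0=${speech%.d}.f0                # filename.d -> filename.f0
        [ -e "$f0" ] && continue          # keep f0 files that already exist
        get_f0 "$speech" "$f0" || echo "get_f0 failed for $speech" >&2
    done
}
```

Run make_f0_files in the directory that holds your speech files. Existing .f0 files are left alone, so the function can be run again after new speech files have been added.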


When you've got all the relevant files in your directory, open the file 'label' using a text editor (e.g. emacs or jot).

- type: jot label

then find the following line:

TMP=/CDROM/IViE_labels/label$$

Again, you need to change the directory path in this line. Otherwise, the labeller will not work in your directory. Put in the directory path which leads to the directory you're in at the moment.

e.g. /home/myfiles/new_directory/label$$

To find out what the path is, type:

pwd

at the command line while you're in the directory with your speech files and the label files

Save your changes and quit the text editor.

Do the same in the file 'mlabel' which gives you an f0 range for male speakers.

Make the labeller executable by typing

chmod +x label
chmod +x mlabel

You're ready to use the labeller.

To open the data files and the labelling template, type

label filename for a female speaker, and
mlabel filename for a male speaker

Do not add any extensions.

You should see the speech wave at the top of your screen, an empty labelling template with 5 tiers underneath, and an f0 file at the bottom.


Finally, please note that the menu files and the labeller can be edited using a text editor. For instance, you can change the labels in the tonemenu to suit your own transcription needs, and you can add extra tiers or remove tiers if you edit the labelling script.







Overview of the Data Files on the CD


Sentence files can be viewed while you are in the /sentences directory.

Cinderella passage files are viewed from the /Cinderella_passage directory.

Spontaneous speech files are viewed from the /spontaneous directory.


I. The Sentences


Directory /sentences
Subdirectories /Belfast, /Bradford, /Cambridge, /Leeds, /Newcastle

Subdirectories within the directories for each variety are:
/statements, /Q_no_morph, /inversions, /WH_questions, /coordinations

(Q_no_morph = questions without morphosyntactic markers)

These subdirectories contain the data.
The directory /sentences/Belfast/statements, for instance, contains

s1bgm.extension
s2bgm.extension
s3bgm.extension
....etc.




Key to File Names

The first letter of the file name indicates the sentence type:
's' = statement, 'q' = question without morphosyntactic markers, 'w' = WH-question, 'i' = inversion, 'c' = coordination.

The following number shows which sentence was produced (see the file 'The_Stimuli' for texts). The directory '/statements', for instance, contains 8 different statements. The next letter indicates the variety:
'c' = Cambridge, 'l' = Leeds, 'p' = British Punjabi, 'n' = Newcastle, 'b' = Belfast. The two letters at the end are the initials of the speaker; in 's1bgm', for instance, this is speaker GM.
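
The key above can be written out as a small shell function. This is an illustrative sketch only: the function name decode_sentence_name is hypothetical and not part of the package, and it assumes that file names always follow the pattern type letter, sentence number, variety letter, two speaker initials.

```shell
# Hypothetical helper: decode a sentence file name such as "s1bgm".
decode_sentence_name() {
    name=$1
    case $name in
        s*) kind="statement" ;;
        q*) kind="question without morphosyntactic markers" ;;
        w*) kind="WH-question" ;;
        i*) kind="inversion" ;;
        c*) kind="coordination" ;;
        *)  echo "unknown sentence type in '$name'" >&2; return 1 ;;
    esac
    rest=${name#?}              # drop the type letter:      "1bgm"
    stem=${rest%??}             # drop the speaker initials: "1b"
    number=${stem%?}            # sentence number:           "1"
    code=${stem#"$number"}      # variety letter:            "b"
    initials=${rest#"$stem"}    # speaker initials:          "gm"
    case $code in
        c) variety="Cambridge" ;;
        l) variety="Leeds" ;;
        p) variety="British Punjabi" ;;
        n) variety="Newcastle" ;;
        b) variety="Belfast" ;;
        *) echo "unknown variety in '$name'" >&2; return 1 ;;
    esac
    initials=$(printf '%s' "$initials" | tr '[:lower:]' '[:upper:]')
    echo "$kind $number, $variety, speaker $initials"
}
```

Typing decode_sentence_name s1bgm prints "statement 1, Belfast, speaker GM".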



Key to Speaker Initials

Belfast: male: GM, RG, RO, female: CC, AM, EC
Cambridge: male: MA, MC, PT, female: LP, ER, SM
Leeds: male: JP, MD, RP, female: CM, KF, NB
Newcastle: male: AM, MC, RF, female: EP, RP, VW
Bradford: male: AA, RH, WA, female: FM, KI, RA
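
Because 'label' is used for female speakers and 'mlabel' for male speakers, the list above also tells you which labeller to start. A sketch of this lookup follows; the function name labeller_for is hypothetical, and the table is copied from the key above.

```shell
# Hypothetical helper: print the labeller script for a given speaker.
# Usage: labeller_for VARIETY INITIALS   (e.g. labeller_for Belfast GM)
labeller_for() {
    # The variety is needed because some initials (AM, MC, RP) occur in
    # more than one variety, and AM and RP differ in gender between them.
    case "$1:$2" in
        Belfast:GM|Belfast:RG|Belfast:RO)       echo mlabel ;;
        Belfast:CC|Belfast:AM|Belfast:EC)       echo label ;;
        Cambridge:MA|Cambridge:MC|Cambridge:PT) echo mlabel ;;
        Cambridge:LP|Cambridge:ER|Cambridge:SM) echo label ;;
        Leeds:JP|Leeds:MD|Leeds:RP)             echo mlabel ;;
        Leeds:CM|Leeds:KF|Leeds:NB)             echo label ;;
        Newcastle:AM|Newcastle:MC|Newcastle:RF) echo mlabel ;;
        Newcastle:EP|Newcastle:RP|Newcastle:VW) echo label ;;
        Bradford:AA|Bradford:RH|Bradford:WA)    echo mlabel ;;
        Bradford:FM|Bradford:KI|Bradford:RA)    echo label ;;
        *) echo "unknown speaker: $1 $2" >&2; return 1 ;;
    esac
}
```

For instance, labeller_for Belfast AM prints label, whereas labeller_for Newcastle AM prints mlabel.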




How to View the Files


To view sentence files, you need to be in the /sentences directory. Choose a variety (Cambridge, Belfast, Bradford, Newcastle, Leeds), a sentence type (statements, Q_no_morph, inversions, WH_questions, coordinations), and a speaker (see list of speaker initials above). For an inversion question produced by a female Bradford Punjabi English speaker, type

label Bradford/inversions/i1pki
(i1pki = inversion question 1, Punjabi, speaker KI)

For a statement produced by a male Leeds English speaker, type

mlabel Leeds/statements/s1ljp
(s1ljp = statement 1, Leeds, speaker JP)

etc.
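
Since the variety and the sentence type are already coded in the file name, the path to type can be worked out from the name alone. A sketch, assuming the file name pattern and directory names described above (the function name sentence_path is hypothetical):

```shell
# Hypothetical helper: print the path of a sentence file relative to /sentences.
sentence_path() {
    name=$1
    case $name in
        s*) dir=statements ;;
        q*) dir=Q_no_morph ;;
        w*) dir=WH_questions ;;
        i*) dir=inversions ;;
        c*) dir=coordinations ;;
        *)  echo "unknown sentence type in '$name'" >&2; return 1 ;;
    esac
    stem=${name%??}             # drop the speaker initials: "i1p"
    code=${stem#"${stem%?}"}    # last remaining letter:     "p"
    case $code in
        c) variety=Cambridge ;;
        l) variety=Leeds ;;
        p) variety=Bradford ;;
        n) variety=Newcastle ;;
        b) variety=Belfast ;;
        *) echo "unknown variety in '$name'" >&2; return 1 ;;
    esac
    echo "$variety/$dir/$name"
}
```

With this function defined, label "$(sentence_path i1pki)" is equivalent to the first command above.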



II. The Cinderella Passage


Directory /Cinderella_passage

Files: p1cma.extension
p1pfm.extension
p1bgm.extension

To see data from a male Cambridge English speaker, type
mlabel p1cma

To see data from a female Bradford Punjabi English speaker, type
label p1pfm

To see data from a male Belfast English speaker, type
mlabel p1bgm



III. Spontaneous Speech Data


In the directory /spontaneous, you find unlabelled data in 3 speaking styles from 3 different varieties. The speakers are the same as in the sentences and the passage task. The /spontaneous directory contains:

(1) 3 retold versions of a section of the Cinderella passage (Cambridge, Bradford, Belfast)
(2) 3 map task sections (Cambridge, Bradford, Belfast)
(3) 3 sections from our free conversation data (Cambridge, Bradford, Belfast)


To see the files, type the following:

(1) To hear semi-spontaneous (retold) speech data

mlabel rcma (Cambridge)
mlabel rbgm (Belfast)
label rpfm (Bradford)

(2) To hear Map Task data (the initials refer to one of the speakers only)
mlabel macma (Cambridge, section a)
mlabel mbcma (Cambridge, section b)
mlabel mbgm (Belfast)
label mpfm (Bradford)


(3) To hear a conversation (the initials refer to one of the speakers in the pair)

mlabel fcma (Cambridge)
mlabel fbgm (Belfast)
label fpfm (Bradford)






The Stimuli


For a key to filenames for a particular stimulus, please read the section 'Keys to file name coding and speaker initials'.

I. Sentences

(1) Simple Statements.

1. We live in Ealing.
2. You remembered the lilies.
3. We arrived in a limo.
4. They are on the railings.
5. We were in yellow.
6. He is on the lilo.
7. You are feeling mellow.
8. We were lying.

(2) Questions without morphosyntactic markers:

1. He is on the lilo?
2. You remembered the lilies?
3. You live in Ealing?

(3) Inversion questions:

1. May I lean on the railings?
2. May I leave the meal early?
3. Will you live in Ealing?

(4) WH-Questions:

1. Where is the manual?
2. When will you be in Ealing?
3. Why are we in a limo?

(5) Coordinations

1. Are you growing limes or lemons?
2. Is his name Miller or Mailer?
3. Did you say mellow or yellow?
4. Do you live in Ealing or Reading?
5. Did he say lino or lilo?


II. The Cinderella Passage

Once upon a time there was a girl called Cinderella. But everyone called her Cinders. Cinders lived with her mother and two stepsisters called Lily and Rosa. Lily and Rosa were very unfriendly and they were lazy girls. They spent all their time buying new clothes and going to parties. Poor Cinders had to wear all their old hand-me-downs! And she had to do the cleaning!
One day, a royal messenger came to announce a ball. The ball would be held at the Royal Palace, in honour of the Queen's only son, Prince William. Lily and Rosa thought this was divine. Prince William was gorgeous, and he was looking for a bride! They dreamed of wedding bells!
When the evening of the ball arrived, Cinders had to help her sisters get ready. They were in a bad mood. They'd wanted to buy some new gowns, but their mother said that they had enough gowns. So they started shouting at Cinders. 'Find my jewels!' yelled one. 'Find my hat!' howled the other. They wanted hairbrushes, hairpins and hair spray.
When her sisters had gone, Cinders felt very down, and she cried. Suddenly, a voice said: 'Why are you crying, my dear?'. It was her fairy godmother!
The girl poured her heart out: 'Lily and Rosa have it all!' she cried, 'even though they're awful, and fat, and they're dull! And I want to go to the ball, and meet Prince William!'
'You will, won't you?' laughed her fairy godmother. 'Go into the garden and find me a pumpkin'. Cinders went, and found a splendid pumpkin which the fairy changed into a dazzling carriage.
'Now bring me four white mice,' the godmother said. The girl went, and found one... two...three...four mice. The fairy godmother changed the mice into four lovely horses to pull the carriage.
Then the girl looked at her old rags. 'Oh dear!' she sighed. 'Where will I find something to wear? I don't have a gown!' 'Hmmm...' said the fairy: 'Let's see, what do you need? You'll need a ballgown... you need jewellery... you need shoes, and... something needs to be done about your hair. And would you like a blue gown or a green gown?'
For the third time, Cinders' godmother waved her magic wand. A ballgown, a robe and jewels appeared. And there were some elegant glass slippers. 'You look wonderful,' her fairy godmother said, smiling. 'Just remember one thing - the magic only lasts until midnight!' And off Cinders went to the ball.
In the Royal Palace, everyone was amazed by the radiant girl in the beautiful ballgown. 'Who is she?' they asked. Prince William thought Cinders was the most beautiful girl he had ever seen. 'Have we met?' he asked. 'And may I have the honour of this dance?'
Prince William and Cinders danced for hours. Cinders was so glad that she failed to remember her fairy godmother's warning. Suddenly the clock chimed midnight! Cinders ran from the ballroom. 'Where are you going?' Prince William called. In her hurry, Cinders lost one of her slippers. The Prince wanted to find Cinderella, but he couldn't find the girl. 'I don't even know her name,' he sighed. But he held on to the slipper.
After the ball, the Prince was resolved to find the beauty who had stolen his heart. The glass slipper was his only clue. So he declared: 'The girl whose foot will fit this slipper shall be my wife'. And he began to search the kingdom.
Every girl in the land was willing to try on the slipper. But the slipper was always too small. When the Royal travellers arrived at Cinders' home, Lily and Rosa tried to squeeze their feet into the slipper. But it was no use; their feet were enormous! 'Do you have any other girls?' the Prince asked Cinders' mother. 'One more,' she replied. 'Oh no,' cried Lily and Rosa. 'She is much too busy!' But the Prince insisted that all girls must try the slipper.
Cinders was embarrassed. She didn't want the Prince to see her in her old apron. And her face was dirty! 'This is your daughter?' the Prince asked, amazed. But then Cinders tried on the glass slipper, and it fitted perfectly!
The Prince looked carefully at the girl's face, and he recognised her. 'It's you, my darling isn't it?' he yelled. 'Will you marry me?' Lily and Rosa were horrified. 'It was you at the ball, Cinders?' they asked. They couldn't believe it! Then Cinders married William, and they lived happily ever after.

