How to view and search within the BNC Spoken Audio files using Praat

You can display an audio waveform, a spectrogram and the corresponding transcriptions using Praat. (If you don't have Praat, it's freely available from the Praat website.) You'll need the audio file (say, 021A-C0897X0081XX-AAZZP0.wav) and the corresponding transcription in Praat TextGrid format (say, D92-0081.TextGrid).

1. Start the Praat software (click on its icon).

2. Ignore the "Praat Picture" window. (You can minimize it if you want to put it out of the way.)

3. Look at the line of menus at the top of the "Praat Objects" window. Select "Read" then "Read from file"; a file browser window labelled "Read Object(s) from file" will appear.

4. Browse until you've found the .wav file, wherever you've put it on your computer. With the .wav file name (including its complete location pathname) in the Selection box at the bottom of the file browser window, click on "OK". The list of "Objects:" at the upper left of the "Praat Objects" window should read:

    1. Sound 021A-C0897X0081XX-AAZZP0

5. Repeat steps 3 and 4 to load the corresponding .TextGrid file into Praat. The TextGrid will appear as the second item in your list of Praat Objects.

6. Click on object 1 (the sound file name); it should turn to white text on a black background.

7. Press SHIFT+DownArrow (hold down SHIFT and press the down arrow key); object 2 (the TextGrid name) will also turn to white text on a black background. This indicates that you have selected both files together.

8. Click on the "Edit" button at the upper right of the Praat Objects window. After a brief delay, a new window will pop up that displays the audio waveform, with a spectrogram (or a placeholder for one) below it, and the transcription tiers below that. In our Spoken Audio BNC TextGrids, the upper transcription tier (tier 1) contains intervals labelled with phoneme labels; the lower tier (tier 2) contains words in ordinary spelling (in capital letters).

9. The buttons labelled "all", "in", "out", "sel" and "bak" at the bottom left, and the scroll bar to their right, enable you to zoom in and out and to move forwards and backwards through the audio.

To search for a particular word

(You can find out what words occur in each Spoken Audio BNC sampler file by looking in the HTML transcription. Let's suppose that you look at
http://www.phon.ox.ac.uk/SpokenBNCdata/D92.html and you want to hear how the word "Hertford" is pronounced.)

10. Using the "in" button, zoom in until you can read the text of the transcriptions. Then, move the scroll bar to the extreme left, so that you are looking at the start of the audio file.

11. In the bottom tier, click on the leftmost 'word'. (In the current example, this is labelled "sp", which stands for "short pause".)

12. Press CONTROL-F. A small window labelled "Find text" pops up.

13. Point and click in the text box, to make it active. Then, type in the word you want to find. To agree with the transcriptions, this should all be in capitals, i.e. HERTFORD. Click on the button labelled "OK".

14. Assuming everything is in order (i.e., you typed correctly, the word tier has been selected and the word you want is actually in the TextGrid), Praat will move the view to the relevant portion of the audio file. The found word will be in red text on an orange background, and the phonemic transcription and the selected portion of audio will have a pink background.

15. The duration of the selected word will be indicated in a grey box immediately below the word transcription. To play the selected section of audio, either click on that grey box or press the TAB key.

(Sometimes, you may get an error message saying that the audio playback doesn't work. The most common reason for this is that some other window or application on your computer has control over the audio playback. To fix that, close any other applications that use sound, including non-Praat windows that are viewing sound files.)
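
For readers who would rather search a TextGrid outside Praat, the word search in steps 10 to 14 can be roughly sketched in Python. This is only an illustration and not part of Praat: it assumes the TextGrid is saved in Praat's long text format, that tier 2 is the word tier as described in step 8, and that a hit means the whole label equals the search word. The function name find_word, the tier names and all times in the sample are made up for the example.

```python
import re

# One interval in Praat's long TextGrid text format (assumed layout).
# Praat writes a double quote inside a label as two double quotes.
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = (\S+)\s*'
    r'xmax = (\S+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def find_word(textgrid_text, word, tier_index=2):
    """Return (start, end) times in seconds for every interval in the
    given tier whose label is exactly `word`."""
    # Splitting on "item [n]:" gives the file header first, then one
    # chunk per tier, so chunks[2] is tier 2 (the word tier).
    chunks = re.split(r'item \[\d+\]:', textgrid_text)
    hits = []
    for xmin, xmax, label in INTERVAL_RE.findall(chunks[tier_index]):
        if label.replace('""', '"') == word:
            hits.append((float(xmin), float(xmax)))
    return hits


# Tiny hand-made TextGrid for illustration only; the labels and times
# are invented and are NOT taken from the real D92-0081.TextGrid.
SAMPLE = """File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 2.5
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "phone"
        xmin = 0
        xmax = 2.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = "sp"
        intervals [2]:
            xmin = 0.5
            xmax = 2.5
            text = "HH"
    item [2]:
        class = "IntervalTier"
        name = "word"
        xmin = 0
        xmax = 2.5
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = "sp"
        intervals [2]:
            xmin = 0.5
            xmax = 1.1
            text = "HERTFORD"
        intervals [3]:
            xmin = 1.1
            xmax = 2.5
            text = "COLLEGE"
"""

print(find_word(SAMPLE, "HERTFORD"))  # prints [(0.5, 1.1)]
```

To use this on a real file, read the TextGrid into a string first. Praat can save text files in more than one encoding, so the encoding passed to open() may need adjusting.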

Please note: the fact that Praat finds a portion of audio for the word you have entered doesn't guarantee that the alignment is good, i.e. in the right place! If you're lucky, it will be just right. But frequently, the alignment is somewhat or VERY inaccurate. Even then, however, you may find that you've been taken to ROUGHLY the right portion of the file. If you zoom out a bit and use your cursor to select a larger section of the audio waveform in the upper part of the display, you can listen to a wider portion of the speech. By referring to the corresponding HTML transcription file, you might be able to work out where you are in the audio file.
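
The idea of listening to a wider portion around an imperfectly aligned word can also be tried outside Praat. The following is a minimal sketch, assuming the audio is an uncompressed PCM .wav file that Python's standard wave module can read. The function name extract_clip and the default padding of 2 seconds are choices made for this example, not anything defined by Praat or by the BNC files.

```python
import wave


def extract_clip(src_path, dst_path, start, end, padding=2.0):
    """Copy the audio from (start - padding) to (end + padding) seconds
    into a new WAV file and return the clip's actual (start, end) times."""
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Convert times to frame numbers, clamped to the file's extent.
        first = min(total, max(0, int((start - padding) * rate)))
        last = max(first, min(total, int((end + padding) * rate)))
        src.setpos(first)
        frames = src.readframes(last - first)
        params = src.getparams()
    with wave.open(dst_path, "wb") as dst:
        # Reuse the source format; the frame count in the header is
        # corrected automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return first / rate, last / rate
```

For example, if a word search reported the interval 0.5 to 1.1 seconds, calling extract_clip("021A-C0897X0081XX-AAZZP0.wav", "clip.wav", 0.5, 1.1) would write a clip running from the start of the file to about 3.1 seconds, which can then be opened in any audio player.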

To close the audio file or TextGrid (for example, if you want to inspect a different audio file)

16. In the "Praat Objects" window, select the Sound and TextGrid objects and click on the button labelled "Remove" (in red) at the bottom. This closes those files and also closes the display. Go back to step 3, above, if you want to look at another audio file.