Introducing Speech and Language Processing

Week 2 homework: speech recognition and forced alignment practicals

Task 1:

1) Go to archive.org, then find and download an audio recording of chapter 1 of "Treasure Island". For example:
https://archive.org/details/treasure_island_1307_librivox

You may need a browser tool such as Video DownloadHelper (in Firefox) to grab the audio file from the site.

2) Convert the mp3 audio file to .wav format, using an audio conversion tool.
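
If you prefer to script the conversion, the following is a minimal sketch. It assumes the command-line tool ffmpeg is installed and on your path (the handout does not prescribe a particular converter). The function names, the mono mix-down and the 16000 Hz sample rate are illustrative choices, not requirements of the exercise.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp3_path, wav_path, sample_rate=16000):
    """Return the ffmpeg argument list for an mp3 -> mono WAV conversion."""
    return [
        "ffmpeg",
        "-i", str(mp3_path),      # input file
        "-ac", "1",               # mix down to one channel (mono)
        "-ar", str(sample_rate),  # resample to this rate in Hz (assumed value)
        str(wav_path),            # output file; the .wav suffix selects the format
    ]


def convert_mp3_to_wav(mp3_path, wav_path, sample_rate=16000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(mp3_path, wav_path, sample_rate), check=True)
    return Path(wav_path)
```

For example, convert_mp3_to_wav("chapter1.mp3", "chapter1.wav") would write chapter1.wav next to the mp3; both file names here are placeholders.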

3) Save chapter 1 of the text as a separate file (e.g. treasure_island_chapter1.txt)

Using e.g. Praat or WaveSurfer, delete any extraneous material at the start of the recording, so that it matches the start of the text of chapter 1. Then check that the end of the audio matches the end of the text, and delete any extraneous material at the end of the recording.

IMPORTANT: now cut out a short paragraph from both the text and the audio (c. 2 min of audio). The online tools will be very slow if you try to upload and process long sound files all at once without first chopping them up into smaller chunks.
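
The cutting can be done by hand in Praat or WaveSurfer. As an alternative, here is a rough sketch using Python's standard wave module, which handles uncompressed PCM .wav files. The function name and the start and end times are hypothetical; you would find the real times by listening for the paragraph boundaries.

```python
import wave


def extract_segment(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (seconds) into a new WAV file.

    Returns the duration of the extracted segment in seconds.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp both ends to the length of the recording.
        start_frame = min(int(start_s * rate), total)
        end_frame = max(start_frame, min(int(end_s * rate), total))
        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width and rate
        dst.writeframes(frames)  # the frame count in the header is fixed on close
    return (end_frame - start_frame) / rate
```

For example, extract_segment("chapter1.wav", "paragraph.wav", 30.0, 150.0) would save the two minutes starting at 0:30. Remember to save the matching paragraph of text in its own file as well.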

4) Use the online resource WebMAUS Basic at
https://clarin.phonetik.uni-muenchen.de/BASWebServices/interface
to generate a forced alignment of the text to the audio.

5)


Task 2:

Using the BNC frequency data in http://www.phon.ox.ac.uk/jcoleman/new_SLP/Lecture_2/triplet_counts, determine the most likely sentence beginning with "HAVE YOU GOT ..."
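
One possible way to approach this is to extend the sentence one word at a time: look at the last two words, find all triplets that begin with them, and pick the third word with the highest count. The sketch below illustrates that idea only. The counts in it are invented toy values, not taken from the BNC file, and the "</s>" end-of-sentence marker and the function names are assumptions; check how the real file is laid out before adapting it.

```python
from collections import defaultdict

# Invented toy counts for illustration only (NOT real BNC data).
TOY_COUNTS = {
    ("HAVE", "YOU", "GOT"): 50,
    ("YOU", "GOT", "A"): 30,
    ("YOU", "GOT", "ANY"): 20,
    ("GOT", "A", "LIGHT"): 5,
    ("GOT", "A", "MINUTE"): 8,
    ("A", "MINUTE", "</s>"): 6,
    ("A", "LIGHT", "</s>"): 4,
}


def build_index(counts):
    """Map each two-word context to a dict of {next word: count}."""
    index = defaultdict(dict)
    for (w1, w2, w3), n in counts.items():
        index[(w1, w2)][w3] = n
    return index


def greedy_continue(start, counts, end="</s>", max_words=20):
    """Extend `start` by repeatedly choosing the most frequent next word."""
    index = build_index(counts)
    words = list(start)
    while len(words) < max_words:
        followers = index.get((words[-2], words[-1]))
        if not followers:
            break  # no triplet begins with the last two words
        best = max(followers, key=followers.get)
        if best == end:
            break  # assumed end-of-sentence marker reached
        words.append(best)
    return words


print(" ".join(greedy_continue(["HAVE", "YOU", "GOT"], TOY_COUNTS)))
```

Note that choosing the best word at each step is a greedy shortcut: it does not guarantee the sentence with the highest overall probability, because a less frequent word now can lead to more frequent words later.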