Introduction to Speech and Language Processing

Hilary Term 2019, Thursdays 11, Room 207, Centre for Linguistics and Philology

Professor J. S. Coleman

This course will introduce a range of computational techniques for the analysis of speech and language. The target audience is first-year M.Phil. students taking Linguistics, Philology and Phonetics papers B (vii) Experimental Phonetics or B (ix) Computational Linguistics. The course will also touch upon probabilistic parsing in syntax. MSc Social Data Science students preparing for the option course in Speech and Language Processing are also welcome to register and participate. The course attempts to introduce the necessary technical concepts as gently as possible and has a strongly practical focus.

Given the diverse backgrounds of the students, no prior knowledge of digital signal processing, computer programming, probability theory or automata theory is assumed. However, as the material will be covered incrementally and intensively, the full, active participation of all who come is essential, and attendance at later sessions is predicated on attendance at earlier ones. The weekly format will be preliminary reading, then a 1-hour class, followed by exercises and private study of computer programs developed week by week.

The course content will follow selected chapters from my textbook, Introducing Speech and Language Processing, which was developed from earlier cycles of this course. Course participants are not required to purchase the textbook, and will not need to unless they wish to: all necessary material (except for the text of the classes!) will be provided. The programming languages used in the textbook are C and Prolog, but these classes will be made somewhat easier by using GNU Octave (which is similar to Matlab) instead of C. Students more familiar with other programming languages, especially Python, are free to use those instead: I shall give a few pointers along the way to relevant sources in Python 3, though as a beginner in that language I shall not translate my materials into Python yet (not this year, at any rate).

In order to follow along during the classes and to do the practical assignments in your own time, you'll need to bring your own laptop; I'm anticipating that some people will have Apple laptops, others will have Windows machines, and perhaps some people (like me) will be using Linux. I will provide software and tuition suitable for all these platforms; I'll also ask you to download and install various pieces of software in advance, in readiness for the classes.

Whatever kind of computer you've got, the main languages I shall be using are Prolog and GNU Octave; the latter is an open-source package that is similar to Matlab, and somewhat like R. They are both quite simple to get started with, and in the classes I shall take you step-by-step through many useful examples.

For the first half of the course (focus on language processing), you'll need to be able to run Prolog. I suggest SWI-Prolog.

For the second half of the course (focus on speech and signal processing), you'll need GNU Octave. It is very simple to download and install Octave on Linux or Windows, e.g. from here: https://www.gnu.org/gnu/octave/windows

It is possible, but can be a little more difficult, to install Octave on a Mac, e.g. from here: https://www.gnu.org/software/octave/download.html There is some guidance at http://wiki.octave.org/Octave_for_MacOS_X

If all else fails, our IT support team has produced a package for Macs that makes it somewhat easier: this installs Linux on your Mac as a "virtualbox" virtual machine, with Octave installed in Linux. The instructions for that are here.

One way or another, you'll need to get Octave installed and running on your laptop. It would be helpful to test Octave with some simple computations such as these:

> x = [1 2 3 4 5];

> y = x.^2

y =
    1    4    9   16   25

> plot(x,y)

(This should open a new window and plot a figure.) Please also download and install the following extra packages in Octave (NB these are already included in the Mac virtual machine download mentioned above):

> pkg install -forge io

> pkg install -forge statistics

> pkg load statistics

And also:

> pkg install -forge control

> pkg install -forge signal

> pkg load signal

If any of the above doesn't work for you, or is too difficult, please (a) don't panic, and (b) contact me by email so that we can help you. The course is practical, so we shall be taking everything one step at a time. But if you could get the software onto your laptop before we get started, that would save some time in the first session.
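
For anyone who opts to work in Python 3 rather than Octave, the simple Octave check above might be mirrored roughly as follows. This is only a sketch using the standard library: the list comprehension stands in for Octave's element-wise x.^2, and the plotting step is left out because it would need a third-party plotting package (matplotlib is one common choice, but that is an assumption, not something the course provides).

```python
# Python 3 counterpart of the Octave check above (standard library only).
x = [1, 2, 3, 4, 5]

# Element-wise squaring: this comprehension plays the role of Octave's x.^2
y = [value ** 2 for value in x]

print(y)  # prints [1, 4, 9, 16, 25]
```

If this prints the same five squares as the Octave session, your Python installation is working at the level needed to follow along.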

Please inform me what kind of laptop (i.e. Windows, Mac, Linux flavour) you will be bringing.

Other software that would be useful to have:

Reading list

Week 1, 17th Jan. Finite State Machines

    Week 1 homework

Week 2, 24th Jan. Probabilistic finite-state models; language models, speech recognition and forced alignment.

    Week 2 homework

Week 3, 31st Jan. Parsing: a quick introduction.

    Week 3 homework

Week 4, 7th Feb. Probabilistic parsing

    (No lecture notes yet, just a zipfile of code to download and install) 

Preparatory assignment for week 5

Week 5, 14th Feb. Digital signals. Generation of a sine wave.

    Audio files: cosine.dat (same as cosine.raw)

Week 6, 21st Feb. Working with audio files and speech corpora. Frequency analysis (Fourier spectrum, spectrogram, pitch tracking). Based on chapter 4. Read about cepstrum and linear prediction for homework.

Week 7, 28th Feb. Cepstral analysis and linear prediction.


Week 8, 7th March. Analysis of speech parameters using functional data analysis
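
As a small taste of the Week 5 topic listed above (digital signals and the generation of a sine wave), here is a minimal sketch in Python 3 for those who prefer it to Octave. It is not taken from the course materials: the function name sine_wave, the 100 Hz frequency and the 8000 Hz sampling rate are all illustrative assumptions. The idea is simply that a digital sine wave is a list of numbers, where sample number n holds the value of the sine function at time n divided by the sampling rate.

```python
import math


def sine_wave(frequency_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Return a list of samples of a sine wave (hypothetical helper).

    Sample n is amplitude * sin(2 * pi * frequency * n / sample_rate).
    """
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        amplitude * math.sin(2.0 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


# Illustrative values: one full cycle of a 100 Hz tone sampled at 8000 Hz
# lasts 0.01 s and therefore occupies 80 samples.
samples = sine_wave(frequency_hz=100.0, sample_rate_hz=8000.0, duration_s=0.01)

print(len(samples))  # prints 80
```

With these values the wave starts at zero, reaches its peak a quarter of the way through the cycle (sample 20) and crosses zero again at the halfway point (sample 40), which is a quick way to check that the samples look right before plotting or saving them.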