Phonetics at Oxford University

Releases

ESPS

ESPS (Entropic Signal Processing System) is a package of UNIX-like commands and programming libraries for speech signal processing. As a commercial product of Entropic Research Laboratory, Inc., it became very widely used in phonetics and speech technology research laboratories in the 1990s because of the wide range of functions it offered, such as get_f0 (for fundamental frequency estimation), formant (for formant frequency measurement), the xwaves graphical user interface, and many other commands and utilities. Following the acquisition of Entropic by Microsoft in 1999, Microsoft and AT&T licensed ESPS to the Centre for Speech Technology at KTH, Sweden, so that a final legacy version of the ESPS source code could continue to be made available to speech researchers. At KTH, code from the ESPS library (such as get_f0) was incorporated by Kåre Sjölander and Jonas Beskow into the Wavesurfer speech analysis tool. Wavesurfer is a very good alternative way to use many ESPS functions if you want a graphical user interface rather than scripting.

Subsequently, Stephen Isard (Institute for Research in Cognitive Science, University of Pennsylvania) updated the sources to compile and run on current OS X and Linux machines, and Danny Yee (University of Oxford) packaged them as a .deb file for Ubuntu 12.04. The licence terms are exactly those of the final ESPS source code release, without modification.

The Aesop Corpus

This corpus contains over 100 hours of speech in five languages, recorded from people reading paragraphs of text. It is fully described here, and downloads are available.

Speech Data Collection Software (Project Aesop)

Need to collect some speech? We have just the thing here: open-source, configurable and customizable software for recording speech. Set up an experiment by defining an input file, then run it. The software records extensive metadata along with the audio, so you know exactly what was recorded, when, and by whom. More information is available here, and the source code can be inspected here.

[Download]

FIAT 1.2 data format

This data file format is used in some of our releases and in many internal computations. It is a simple, easily readable format that many Windows programs can interpret as CSV. In addition to columns of data, it can store metadata in a file header. This release includes the FIAT 1.2 description and a Python reference implementation from gmisclib-0.69.0.
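As a rough illustration of the idea of a delimited table that carries metadata in a header, the sketch below parses a hypothetical file in which header lines begin with `#` and hold `key = value` pairs. The comment marker, the `key = value` syntax, the comma delimiter, the sample field names, and the function name `read_table_with_header` are all assumptions made for this example. They are not taken from the FIAT 1.2 specification; the description included in the release defines the real format.

```python
import csv
import io


def read_table_with_header(text, delimiter=","):
    """Split a text table into (metadata dict, list of row dicts).

    Assumed layout (hypothetical, NOT the FIAT 1.2 specification):
    lines starting with '#' carry 'key = value' metadata, the first
    remaining line names the columns, and the rest are data rows.
    """
    metadata = {}
    body_lines = []
    for line in text.splitlines():
        if line.startswith("#"):
            key, sep, value = line[1:].partition("=")
            if sep:  # skip comment lines that hold no key = value pair
                metadata[key.strip()] = value.strip()
        elif line.strip():  # skip blank lines
            body_lines.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(body_lines)),
                            delimiter=delimiter)
    return metadata, list(reader)


# Invented sample data, for illustration only.
sample = """# speaker = S01
# sample_rate = 16000
time,f0
0.00,121.5
0.01,122.0
"""

metadata, rows = read_table_with_header(sample)
print(metadata)
print(rows[0]["f0"])
```

All values come back as strings here; a real reader would follow the format description to decide how fields are typed and escaped.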

Bootstrap Markov Chain Monte Carlo and Full Signal Detection Theory

These files are the code used for the computations in the G. Kochanski and B. S. Rosner paper "Bootstrap Markov Chain Monte Carlo and optimal solutions for the Law of Categorical Judgement (Corrected)", available at http://arxiv.org/abs/1008.1596 and citable as arXiv:1008.1596.

A novel procedure is described for accelerating the convergence of Markov chain Monte Carlo computations. The algorithm uses an adaptive bootstrap technique to generate candidate steps in the Markov Chain. It is efficient for symmetric, convex probability distributions, similar to multivariate Gaussians, and it can be used for Bayesian estimation or for obtaining maximum likelihood solutions with confidence limits. As a test case, the Law of Categorical Judgment (Corrected) was fitted with the algorithm to data sets from simulated rating scale experiments. The correct parameters were recovered from practical-sized data sets simulated for Full Signal Detection Theory and its special cases of standard Signal Detection Theory and Complementary Signal Detection Theory.
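The general idea of generating candidate steps from the chain's own history can be sketched as follows. This is a simplified illustration and not the algorithm from the paper or the mcmc.py module: it is a plain Metropolis sampler whose proposal is the difference between two randomly chosen earlier samples plus a small jitter. The function names, the toy Gaussian target, and the jitter size are assumptions chosen for the example.

```python
import math
import random
import statistics


def log_target(x):
    """Log-density (up to a constant) of a toy 2-D Gaussian, std 1 and 3."""
    return -0.5 * (x[0] ** 2 + (x[1] / 3.0) ** 2)


def history_proposal_mcmc(log_p, start, n_steps, jitter=0.1, seed=0):
    """Metropolis sampling with candidate steps drawn from past samples.

    Each candidate is the current point plus the difference between two
    randomly chosen earlier samples, so step sizes adapt to the spread of
    the region explored so far. The jitter keeps the sampler moving while
    the history is still short. Because proposals depend on the history,
    this is an adaptive scheme, not a strict Markov chain.
    """
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_p(current)
    history = [list(current)]
    for _ in range(n_steps):
        a = rng.choice(history)
        b = rng.choice(history)
        candidate = [c + (ai - bi) + rng.gauss(0.0, jitter)
                     for c, ai, bi in zip(current, a, b)]
        candidate_lp = log_p(candidate)
        # The proposal is symmetric given the history, so the plain
        # Metropolis rule applies. 1 - random() lies in (0, 1], so the
        # logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate_lp - current_lp:
            current, current_lp = candidate, candidate_lp
        history.append(list(current))
    return history


samples = history_proposal_mcmc(log_target, start=[5.0, 5.0], n_steps=20000)
kept = samples[5000:]  # drop early samples taken before the chain settled
std_x = statistics.pstdev([s[0] for s in kept])
std_y = statistics.pstdev([s[1] for s in kept])
print(f"estimated standard deviations: {std_x:.2f}, {std_y:.2f}")
```

For this toy target the true standard deviations are 1 and 3, so the printed estimates should land near those values if the sampler is mixing properly.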

Python code (gmisclib-0.65.5 package containing the mcmc.py module). This is a general-purpose optimization, sampling, and simulated annealing routine along with utility functions. Documentation is available at http://kochanski.org/gpk/code/speechresearch/gmisclib.

C++ code to implement the Full Signal Detection Theory model (bsr_analysis-0.3.0.tar.gz). This code is callable from Python, and the model can be solved with mcmc.py.

These downloads are also available via Sourceforge.

Patterns of Durational Variation in British Dialects

Anastassia Loukina and Greg Kochanski

Talk presented at the PAC workshop in Montpellier, France, 13 September 2010.

This work uses linear discriminant classifiers (g_classifiers-0.30.1.tar.gz) and Python speech research libraries (gmisclib-0.67.9.tar.gz). Online documentation is available under http://kochanski.org/gpk/code/speechresearch. These downloads are also available via Sourceforge.

The 2008 Oxford Tick1 Corpus

This corpus contains the data from the "Tick1" experiment, part of the ESRC grant "Articulation and Coarticulation in the Lower Vocal Tract", with G. Kochanski and J. Coleman as principal investigators. The data are provided courtesy of the UK's Economic and Social Research Council and derive from project RES-000-23-1094 (7/2005 through 3/2008).

The experiment involves subjects repetitively saying simple phrases in time with a metronome: "under the desk. <tick> under the desk <tick> ...", along with control utterances under normal reading conditions.