The following corpora are available for use by University of Oxford students and staff; please contact firstname.lastname@example.org if you need help to access them.
/res/corpora/BNC - Audio and aligned text from the British National Corpus (UK English)
/res/corpora/COLT - Corpus of London Teenage Talk
/res/corpora/CSJ - Corpus of Spoken Japanese
/res/corpora/DASS - Digital Archive of Southern Speech (Southern US English)
/res/corpora/DECTE - Tyneside English (Diachronic)
/res/corpora/emd - English Monosyllable Database
/res/corpora/IViE - Intonation Variation in English
/res/corpora/NECTE - Tyneside English
/res/corpora/SCOTS - Local copy of SCOTS corpus, with our time-aligned transcription files
/res/corpora/sec - The text of the Spoken English Corpus
PostScript papers, speech and text databases
The Phonetics Lab has a number of useful resources available on the Linux network in the /res directory. These include:
A collection of miscellaneous papers in compressed PostScript format. These may be viewed using a command such as zcat filename.ps.Z | xpsview -.
Machine-readable dictionaries including:
the Oxford Advanced Learner's Dictionary
the Kucera-Francis dictionary
the Carnegie-Mellon Pronouncing Dictionary
two British English word frequency dictionaries derived from the British National Corpus
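As an illustration of how a machine-readable dictionary can be used from a script, the following is a minimal Python sketch that parses entries laid out like the publicly distributed plain-text Carnegie-Mellon Pronouncing Dictionary (comment lines beginning with ;;;, then a headword followed by its phoneme symbols). The format of the local copy under /res is an assumption here, and the sample text and the parse_cmudict function name are hypothetical.

```python
import re

# Hypothetical sample, assuming the layout of the public CMU dictionary text file.
SAMPLE = """\
;;; comment lines start with three semicolons
PHONETIC  F AH0 N EH1 T IH0 K
READ  R EH1 D
READ(1)  R IY1 D
"""


def parse_cmudict(text):
    """Map each headword to a list of pronunciations (lists of phoneme symbols)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(";;;"):
            continue
        head, *phones = line.split()
        # Alternative pronunciations carry a variant marker: READ(1) -> READ.
        word = re.sub(r"\(\d+\)$", "", head)
        entries.setdefault(word, []).append(phones)
    return entries


entries = parse_cmudict(SAMPLE)
print(entries["READ"])
```

To work with a real file, the same function could be given the file's contents in place of SAMPLE, once the actual path and encoding of the local copy are known.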
1. Phonetic fonts for word processing
Fonts allowing you to display and print characters and diacritics from the International Phonetic Alphabet are available from several sources. These include the Summer Institute of Linguistics, University of Toronto, Linguist's Software, Inc. and Adobe Systems, Inc.
Our preferred solution for both PC and Mac platforms is to use the SIL Doulos IPA fonts, available under a freeware license from the Summer Institute of Linguistics. These fonts contain every current character, diacritic and suprasegmental mark in the International Phonetic Alphabet.
For further details on SIL IPA (including information on how to download and install the fonts), click here.
2. Phonetic fonts for the web
There are several different ways of displaying phonetic characters in documents on the web, each of which has advantages and disadvantages in terms of ease of use, browser compatibility and visual quality. Two solutions are Unicode and in-line graphics.
Many Unicode fonts include IPA symbols, and since most browsers now render Unicode, this is a recommended method for getting IPA into web pages.
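One way to put Unicode IPA into a page without depending on the page's character encoding is to write each symbol as a numeric character reference. The following Python sketch shows the idea; the function name to_html_entities and the sample transcription are illustrative assumptions, not part of any tool provided by the lab.

```python
def to_html_entities(text):
    """Replace every non-ASCII character with a hexadecimal character reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):04X};"
        for ch in text
    )


# Illustrative transcription containing four IPA symbols outside ASCII.
ipa = "f\u0259\u02c8n\u025bt\u026aks"
print(to_html_entities(ipa))
# prints: f&#x0259;&#x02C8;n&#x025B;t&#x026A;ks
```

ASCII characters, including & and <, pass through unchanged, so text containing HTML-special characters would still need ordinary escaping. If the page is saved and served as UTF-8, the symbols can instead be typed directly.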
If you require only a small number of phonetic characters, you can embed JPEG or GIF images in your web pages. This method has the advantage that it works with nearly all browsers, but creating the graphics and embedding them in the web pages is quite time-consuming.
James Tauber has created a set of over a hundred phonetic symbols in GIF format which can be downloaded from http://www.jtauber.com/linguistics/phonGIF/
3. Phonetic font-related links
IPA guide to phonetic fonts for word-processing
IPA guide to phonetic fonts for www Documents
Yamada Language Center's guide to phonetic fonts
Phonetic fonts for TeX and LaTeX
Type IPA phonetic symbols