IPOX Demos: All-Prosodic Speech Synthesis

Arthur Dirksen
John Coleman

This document presents audio demos complementing our paper All-prosodic speech synthesis, which appears in:

J.P.H. van Santen, R.W. Sproat, J.P. Olive, & J. Hirschberg (eds.) Progress in Text-to-Speech Synthesis. New York: Springer Verlag, pp. 91-108.

The audio demos were generated by IPOX and are stored as 16-bit .wav files sampled at 11025 Hz.
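A downloaded demo can be checked against this format with a few lines of Python. The following is a minimal sketch using only the standard `wave` module; it is not part of IPOX, and the function names `describe_wav` and `matches_demo_format` are our own, chosen for illustration.

```python
import wave


def describe_wav(path):
    """Return (channels, bits_per_sample, sample_rate_hz, duration_s) for a .wav file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8        # sample width is reported in bytes
        rate = wav.getframerate()
        duration = wav.getnframes() / rate   # frames divided by frames per second
    return channels, bits, rate, duration


def matches_demo_format(path):
    """True if the file is 16-bit and sampled at 11025 Hz, the format stated above."""
    _, bits, rate, _ = describe_wav(path)
    return bits == 16 and rate == 11025
```

For example, `matches_demo_format("demo1.wav")` should return `True` for a demo saved under the (hypothetical) name `demo1.wav`.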

Demo 1 - Coarticulation

Spreading of vocalic place features in the phonology is reflected in phonetic interpretation by subtle differences in frication spectra associated with /s/:

Demo 2 - Syllable overlap

Three versions of /bot@l/ bottle, with ambisyllabic /t/, generated with different amounts of syllable overlap:

Demo 3 - Ambisyllabicity

Intervocalic clusters are parsed with maximal ambisyllabicity. In the following words, the bracketed clusters are ambisyllabic:

Note that /t/ is aspirated in winter, but not in system.

Demo 4 - Syllable compression I

Again, three versions of /bot@l/ bottle, this time with different amounts of compression for the first and second syllable:

Note that when the second syllable /t@l/ is compressed to 62%, the vowel is almost fully eclipsed, creating the impression of a syllabic sonorant.
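The eclipsing effect can be illustrated with a toy calculation. This is a sketch under strong simplifying assumptions and not the IPOX implementation: we assume the vowel spans the whole syllable, the onset and coda consonants are overlaid on it with fixed durations, and compression shrinks only the syllable as a whole. The function name and all durations are hypothetical.

```python
def visible_vowel_ms(syllable_ms, onset_ms, coda_ms, compression):
    """Toy model: duration of the vowel left uncovered by the overlaid consonants.

    Assumption (not taken from IPOX): consonant durations stay fixed while
    the syllable as a whole is scaled by `compression` (1.0 means 100%).
    """
    if not 0.0 < compression <= 1.0:
        raise ValueError("compression must be in the interval (0, 1]")
    compressed_ms = syllable_ms * compression
    # The vowel cannot be shorter than zero: below that point it is fully eclipsed.
    return max(0.0, compressed_ms - onset_ms - coda_ms)


# Hypothetical durations: a 200 ms syllable with a 60 ms onset and a 60 ms coda.
for percent in (100, 80, 62):
    remaining = visible_vowel_ms(200.0, 60.0, 60.0, percent / 100.0)
    print(f"{percent}%: about {remaining:.0f} ms of vowel remains audible")
```

With these made-up numbers, very little of the vowel is left uncovered at 62%, which is the kind of situation in which a listener may hear a syllabic sonorant instead of a vowel.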

Demo 5 - Syllable compression II

Two versions of /s^powz/ suppose, with different amounts of compression for the unstressed prefix /s^p/:

  • 60% (reduced vowel)
  • 52% (vowel eclipsed)

Demo 6 - Syllable compression III

A segmental analysis of vowel elision would seem to predict that s'pport is phonetically identical to sport. Our analysis in terms of syllable compression correctly predicts subtle (and less subtle) differences:

Note that /p/ is aspirated in s'pport, but not in sport.

Demo 7 - Syllable compression IV

The three words below have been synthesized using full vowels (/ow/ as in blow, /o/ as in pot, and /a/ as in sad) in analysis as well as in phonetic interpretation. By varying syllable compression in accordance with metrical-prosodic structure, we obtain the expected alternations between full and reduced vowels.

Demo 8 - Connected speech

Our first attempt at generating a full sentence with IPOX.

February 1995, revised June 2000