The vocal tract and larynx

(Examples and exercises are at

1. The vocal tract

If the human head were to be cut in half down the midline, the organs of speech would appear like this. (Click on the MRI image to see a movie clip showing what it looks like during speech.)

Midsagittal section of head

Organs of speech

(Downloaded from

The lungs, which are in the chest, are connected to the vocal cords by a tube, the trachea (or windpipe). Perhaps the most important organ of speech, the brain, is visible in the MRI image.

2. The larynx
larynx from front
Above: (i) External structure of the larynx, as viewed from the front. (ii) Coronal (vertical) section through the larynx, as viewed from the back.

The upper part of the interior of the larynx may be examined using various instruments, allowing us to obtain images such as the following. Viewed from above, with the vocal cords slightly apart, the upper part of the larynx appears like this:
labelled vocal cords

The space between the vocal cords is called the glottis. When the vocal cords vibrate, the resulting disturbance in the air imparts a "buzzing" quality to the speech, called voice or voicing. (Click on the above figure to see a short movie of the vocal cords during sustained production of an [ɑ]-like vowel. The movie was made using stroboscopic lighting, so the apparent rate of vibration is much slower than the real rate of vocal cord vibration.)
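The remark about stroboscopic lighting can be illustrated with a little arithmetic. The following is a minimal sketch, not a description of how the movie was actually made: it assumes an idealised strobe that flashes at a fixed rate, and the function name and the example rates (100 and 98 per second) are hypothetical.

```python
def apparent_rate_hz(vibration_hz: float, flash_hz: float) -> float:
    """Apparent vibration rate seen under a strobe flashing at flash_hz."""
    # Each flash catches the vocal cords at one instant. Only the offset
    # between the vibration rate and the nearest whole multiple of the
    # flash rate shows up as visible movement.
    nearest_multiple = round(vibration_hz / flash_hz) * flash_hz
    return abs(vibration_hz - nearest_multiple)


# Hypothetical example: cords vibrating 100 times per second,
# lit by a strobe flashing 98 times per second.
print(apparent_rate_hz(100, 98))
```

With these assumed figures the apparent rate comes out at 2 cycles per second, slow enough to follow by eye, even though the real rate is 100 cycles per second.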

(Left) With the vocal cords completely closed, as for a glottal stop [ʔ]. (Right) With the vocal cords wide open, as during quiet respiration preceding speech. (Note how the "bumps" of the arytenoid cartilages are held apart, making the tissue taut.)
Cords closed Cords open

3. Phonation

Phonation is the contribution the larynx makes to speech.

The nature of voicing: the mucosal wave

For voiced sounds, the vocal cords are held together by the action of the arytenoid cartilages, but they are held together less tightly than for a glottal stop (1).
Mucosal wave

When air is forced up the trachea from the lungs, at a certain pressure it is able to force its way through the vocal cords, pushing them open (2, 3 and 4). As air passes through the glottis, the air pressure in the glottis falls: when a gas or liquid runs through a constricted passage, its velocity increases (the Venturi tube effect), and this increase in velocity results in a drop in the pressure of that gas or liquid (the Bernoulli principle). Because of the drop in pressure, the vocal cords snap together, at the lower edge first, closing again (6-10). The cycle then begins again.

A single cycle of opening and closing takes in the region of 1/100th of a second, so the cycle repeats at rates in the region of 100 times per second (more specifically, between about 80 and 200 cycles per second). This rate is too rapid for the human ear to be able to discriminate each individual opening and closing of the vocal cords. However, we perceive variations in the overall rate of vibration as changes in the pitch of the voice, "pitch" being the perceptual correlate of acoustic frequency.
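The relationship between the duration of one cycle and the rate of vibration is simply that each is the reciprocal of the other. As a small illustrative sketch (the function names are made up for this example), the figures quoted above can be checked like this:

```python
def frequency_hz(period_s: float) -> float:
    """Cycles per second, given the duration of one cycle in seconds."""
    return 1.0 / period_s


def period_ms(rate_hz: float) -> float:
    """Duration of one opening/closing cycle in milliseconds."""
    return 1000.0 / rate_hz


# One cycle lasting 1/100th of a second corresponds to 100 cycles per second.
print(frequency_hz(1 / 100))

# The range of rates mentioned in the text, expressed as cycle durations.
for rate in (80, 100, 200):
    print(f"{rate} cycles per second: one cycle every {period_ms(rate):.1f} ms")
```

So the range of about 80 to 200 cycles per second corresponds to individual cycles lasting between 12.5 and 5 milliseconds.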


If the vocal cords are held apart, air can flow between them without being obstructed, so that no noise is produced by the larynx. In voiceless fricatives such as [f], [s], [ʃ], [ʂ], [θ], [ç], [x], and [χ], the vocal cords are held apart. If there is a sufficiently high rate of airflow through the open glottis, a quiet disruption of the air, whisper, results. The glottal fricative [h] has whisper phonation, as do whispered vowels and the aspiration portion of voiceless aspirated stops such as English /p/, /t/, or /k/ in pre-vocalic position.

The IPA diacritic [ ̥ ], a small ring written below a symbol, indicates such voicelessness. Voiceless vowels, nasals and liquids can be transcribed using that diacritic. For stops and fricatives, on the other hand, there are separate letters for voiced and voiceless sounds, e.g. [b] (voiced) vs. [p] (voiceless). In these cases, the voicelessness diacritic can be used to denote a (possibly partially) devoiced realisation of a phoneme that might otherwise be expected to be voiced, such as the /ɹ/ in /tɹeɪn/, "train", which may be devoiced because it follows a voiceless, aspirated /t/.
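For anyone typing such transcriptions on a computer, the voicelessness diacritic is available in Unicode as the combining character U+0325 (COMBINING RING BELOW), which is placed after the letter it modifies. The sketch below is only an illustration: the function names are hypothetical, and the table of voiced/voiceless letter pairs is a small sample rather than a complete inventory.

```python
RING_BELOW = "\u0325"  # COMBINING RING BELOW: the IPA voicelessness diacritic

# A sample of voiced letters that have a separate voiceless letter
# (illustrative only, not a full list).
VOICELESS_COUNTERPART = {
    "b": "p", "d": "t", "ɡ": "k",
    "v": "f", "z": "s", "ʒ": "ʃ", "ð": "θ",
}


def mark_voiceless(symbol: str) -> str:
    """Attach the voicelessness diacritic to an IPA symbol."""
    return symbol + RING_BELOW


def has_separate_voiceless_letter(symbol: str) -> bool:
    """True if the sample table lists a distinct voiceless letter."""
    return symbol in VOICELESS_COUNTERPART


# A devoiced /ɹ/, as in "train", and a voiceless nasal.
print(mark_voiceless("ɹ"), mark_voiceless("m"))

# [b] has a voiceless counterpart letter, [m] does not.
print(has_separate_voiceless_letter("b"), has_separate_voiceless_letter("m"))
```

Whether the ring is displayed neatly beneath the letter depends on the font in use.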

Breathy voice, or murmur

This phonation is a combination of breath and voice. It occurs if the vocal cords do not close completely along their entire length while they are vibrating: the air that flows through the remaining aperture adds whisper to the vocal cord vibrations.

Creaky voice, or creak

In this kind of voicing, the vocal cords are stiffened, so that they are very rigid as they vibrate.