Pronounce this phrase as quickly as possible: “The hat rack tapped the bottle.” Now, take the words: “hat rack,” “tap” and “bottle,” read them aloud and sound out each individual sound, paying careful attention to how you pronounce the sounds that are written with the letter <t>. Did anything change?
It is likely you pronounced each <t> sound differently within the whole phrase, but maybe pronounced them all as a “regular [t]” when deliberately sounding out the words.
Say the phrase quickly again, and you will probably notice the following pronunciations: In “hat rack,” the sound is a quick stop in your throat. In “tap,” it is a quick contact of your tongue just behind your teeth. In “bottle,” it’s probably a quick tap of the tongue on the roof of the mouth, similar to the “r” in the Spanish word “pero.”
Although this might seem trivial, the pattern you observe is fundamental to the modern understanding of language: language and speech sounds are represented at several levels.
The reason each sound can be deliberately sounded out as the “regular t,” as in “tap,” is that in our brains, each one represents the idea of a /t/. Yes, we think there is a /t/ in words like “hat” and “Batman” partly because they are spelled with a <t>, but there is also a level of cognitive awareness of the /t/ sound independent of spelling.
In the discipline of linguistics, this unit is called a phoneme: the smallest unit of sound that speakers perceive as distinct, the kind that can tell one word apart from another.
This is closely related to the notion of the phone, the smallest physical unit of sound. So, in the case of the examples above, the three phones (the actual pronunciations) all correspond to the same psychological unit, the “idea” of /t/.
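For readers who think in code, the relationship between one phoneme and its several phones can be sketched as a small lookup. This is only a toy illustration in Python: the position labels and the function names `pronounce_t` and `perceive` are made up for this example, and real English pronunciation depends on far more than position in the word.

```python
# Toy model: one phoneme, /t/, and three phones chosen by context.
# The position labels are simplified assumptions, not a full account of English.
PHONES_OF_T = {
    "start_of_word": "[t]",   # "tap": the "regular t"
    "middle_of_word": "[ɾ]",  # "bottle": quick tap on the roof of the mouth
    "end_of_word": "[ʔ]",     # "hat rack": quick stop in the throat
}


def pronounce_t(position: str) -> str:
    """Return the phone a speaker might produce for /t/ in this position."""
    return PHONES_OF_T[position]


def perceive(phone: str) -> str:
    """Map any of the three phones back to the single phoneme a listener hears."""
    if phone in PHONES_OF_T.values():
        return "/t/"
    raise ValueError(f"not a phone of /t/ in this toy model: {phone}")


examples = {
    "tap": "start_of_word",
    "bottle": "middle_of_word",
    "hat rack": "end_of_word",
}
for word, position in examples.items():
    phone = pronounce_t(position)
    print(f"{word}: spoken as {phone}, perceived as {perceive(phone)}")
```

Three different outputs from `pronounce_t` all collapse to the same answer from `perceive`, which is the point of the distinction: many physical sounds, one mental category.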
Not convinced? Take the –s ending that marks third-person singular present-tense verbs, as in “he goes” and “he makes.” Pronounce the <s> in each, and you will discover that the one in “goes” is actually a [z], while the one in “makes” is an [s]. These two physical sounds derive from the same psychological unit.
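The –s pattern can also be written as a rough rule. The column itself does not spell the rule out; the sketch below assumes the standard textbook description, in which the ending comes out as [s] after a voiceless sound and as [z] otherwise. The function name `s_ending` and the short list of voiceless sounds are illustrative, and verbs ending in hissing sounds, such as “kisses,” are left out.

```python
# Toy rule for the third-person singular -s ending.
# Assumption: the verb's final sound is passed in as a single symbol.
# Verbs ending in hissing sounds (as in "kisses") are outside this sketch.
VOICELESS = {"p", "t", "k", "f", "θ"}


def s_ending(final_sound: str) -> str:
    """Return the phone used for the -s ending after the given final sound."""
    return "[s]" if final_sound in VOICELESS else "[z]"


print("he makes:", s_ending("k"))  # "make" ends in a voiceless k sound
print("he goes:", s_ending("o"))   # "go" ends in a vowel
```

One ending in the mind, two pronunciations in the mouth, and the speaker never has to think about which one to pick.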
Why is this important? First, this model captures the fact that language often exists in our brains in a form distinct from what we actually speak or hear. It also shows that, although we hear the physical phenomenon of speech, we are, as some linguists have asserted, “hallucinating”: what we perceive are phonemes, psychological units of reality. Trippy.
This model can also explain differences in dialects within any language. In the case of the /r/ in English, everyone is conscious of the “idea” of /r/ in a word like “car,” but, depending on your accent, you might not pronounce it. Despite the different pronunciations, any speaker of English knows which word is being invoked, and the word itself is yet another level of psychological reality.
Speaking of accents, the cases of /t/ that I referenced at the start of the article are interesting for a study of sound variation in English. If you pronounce “bottle” or “hat” with a “regular t,” your friends might say you sound “British,” or maybe a little too “deliberate,” which shows that these specific pronunciations are thought to be tied to specific accents and thus represent distinct identities.
So you may be quite surprised if you pay attention to the words and phrases you hear every day and sound them out: the sounds that come out of your mouth are often not exactly the same as the ones that exist in your head, although the two are inextricably related.
The fact that humans can receive and decipher such a complex set of signals while controlling the production of more speech signals, all in the course of milliseconds, is a testament to the marvel that is the human brain.
Don’t take it for granted.
Jordan MacKenzie is a second-year UF linguistics master’s student. His column appears on Wednesdays.