Twitter and Reddit users are up in arms lately over the latest case of phonetic misperception (remember "Laurel" and "Yanny"?). This time it concerns the lovable Grover from Sesame Street who, if you watch the clip below, is either saying "that sounds like an excellent idea" or "that's a f*ckin' excellent idea." Did Grover drop the F-bomb on Sesame Street?
As a phonetician, I sometimes find these types of misperceptions fun because they force you to listen carefully to what people (in this case, Grover's voice) are doing as they produce speech very quickly. Phoneticians focus on the transcription and, more often, the careful analysis of speech. Speech is fast, speech is messy, and when the conditions are right, one can misperceive one sound for another.
What is even more difficult in a case like this is that Grover is always speaking quickly. He's the puppet constantly on his quadruple espresso. So this means that many of the sounds you expect to hear in certain words are actually quite different. Vowels can be cut short and sound very different. Consonants can be deleted entirely. Both of these cases are what linguists call phonetic reduction. To understand why you hear the F-word instead of "like an", we must understand a little bit about how sounds reduce.
If you were speaking very carefully, you would pronounce "That sounds like an..." as [ðæt saʊndz laɪk ən], where each vowel is carefully produced and each of the consonants at the end of "sounds" is pronounced distinctly. Yet humans are rarely this clear. Moreover, if we were always this clear, our speech would be quite slow. Life is short and so becomes our speech.
In reality, we do not pronounce this phrase this way. One thing that English speakers do is reduce the final consonants in 'sounds.' Instead of pronouncing each of the /n/, /d/, and /z/ sounds (yes, it's more like a "Z" here - spelling is deceptive), people will pronounce just the /n/ and the /z/. We do this all the time. In casual speech, a word like "friends" often has no "d" sound. This pattern leaves us with [ðæt saʊnz laɪk ən], with one sound missing.
Grover takes reduction a few steps further than this, but his manner of pronouncing words is not very different from what other English speakers do when speaking quickly. Instead of pronouncing the vowel /aʊ/ (the vowel in "ouch"), he reduces this vowel down to something like the vowel in 'sun' /sʌn/. This might seem weird to you, but try saying "that sun's nice" and "that sounds nice" quickly, one after the other. They might in fact be hard to distinguish. The same thing happens with the vowel in 'like' - it's pronounced more like the vowel in 'luck.' So, now we have arrived at the phonetic sequence [ðæt sʌnz lʌk ən].
That alone is not enough to make you hear the F-bomb, but Grover's voice does two additional things that many English speakers have been doing for some time. First, he does not pronounce the "n" in the word "sounds." The "n" sound is a nasal consonant and many English speakers just nasalize their vowels in a context like the word "sounds." Essentially the "n" is no longer a consonant, but its character is now on the vowel. So, going further, we've now gone to [ðæt sʌ̃z lʌk ən] (the squiggly line over the vowel is the phonetic transcription for nasalization).
The second thing that Grover does is to pronounce what is normally a "z" sound as an "s" sound. American English speakers do this all the time. Try saying the words 'fuzz' and 'fuss.' The words sound different (hint - the vowel is longer in one case), but the final "z" and "s" are often both pronounced like [s]. So, moving along, now we've gone to [ðæt sʌ̃s lʌk ən]. But how do you get an "f" here?
From [sl] to [f] - the big jump
In running speech, there are no pauses. Words blend right into each other. This is why it's possible to mishear "kiss the sky" as "kiss this guy" (as in the famous Jimi Hendrix song). So, in reality, Grover is pronouncing [ðætsʌ̃slʌkən], with no pauses. However, something funny happens in the sequence between the "s" sound and the "l" sound. The "s" sound is a voiceless consonant, meaning that your vocal cords are not vibrating when you pronounce it. Try saying the "s" sound while touching your neck and then the "z" sound while doing the same. You can feel your vocal cords vibrate in the "z" sound but not in the "s" sound.
When a voiceless sound like [s] precedes a voiced consonant like "L" [l], it can cause the voiced consonant to become voiceless. Phoneticians and phonologists call this voicing assimilation. English speakers make the "L" sound voiceless in words like "play" [pl̥eɪ] (the small ring under the consonant indicates that it is voiceless). Try saying "play" and holding the "L" sound. It should not sound like a typical "L" sound to you (and if you say "puh-lay", you're cheating). The "L" is voiceless here because the "p" sound is voiceless. Grover's voice did this in the clip - he says [ðætsʌ̃sl̥ʌkən...].
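The chain of reductions described so far can be sketched as a series of string rewrites. This is only a toy illustration, not a model of English phonology: each rule is a plain substitution tuned to this one phrase, and the names `RULES`, `derive`, `NASALIZED`, and `VOICELESS` are made up for the example.

```python
# Toy sketch: every "rule" is a literal string substitution that only
# makes sense for this one phrase. It is not a general phonological model.
NASALIZED = "\u0303"  # combining tilde, the IPA mark for nasalization
VOICELESS = "\u0325"  # combining ring below, the IPA mark for voicelessness

# (description, rewrite function) pairs, in the order the article discusses them
RULES = [
    ("drop [d] between [n] and [z]", lambda s: s.replace("ndz", "nz")),
    ("reduce [aʊ] and [aɪ] to [ʌ]", lambda s: s.replace("aʊ", "ʌ").replace("aɪ", "ʌ")),
    ("nasalize the vowel, drop [n]", lambda s: s.replace("ʌnz", "ʌ" + NASALIZED + "z")),
    ("devoice word-final [z]", lambda s: s.replace("z ", "s ")),
    ("remove pauses between words", lambda s: s.replace(" ", "")),
    ("devoice [l] after [s]", lambda s: s.replace("sl", "sl" + VOICELESS)),
]


def derive(careful):
    """Apply each rule in turn and keep every intermediate form."""
    stages = [("careful speech", careful)]
    form = careful
    for label, rule in RULES:
        form = rule(form)
        stages.append((label, form))
    return stages


stages = derive("ðæt saʊndz laɪk ən")
for label, form in stages:
    print(f"{label:32} [{form}]")
```

Running it prints one transcription per step, starting from the careful pronunciation and ending with the run-together form that has the voiceless "L".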
But why does this sound like "f"? A voiceless "L" sound actually sounds an awful lot like 'f' - it shares a lot more of its acoustic characteristics with "f" than it does with other sounds that you are used to. It is possible to hear [sl̥] as [f] as a result. However, this misperception is in your ears. If you are not used to listening for these sorts of phonetic sequences, especially when people (or muppets) are speaking quickly, then you might mishear these sequences.
That brings us to the big leap. Take a look at the phonetic differences between Grover's utterance and a sequence with the F-bomb in it:
[ðætsʌ̃sl̥ʌkən...] - 'that sounds like an' - Grover's speech
[ðætsʌ̃fʌkən...] - 'that's a f*ckin' - speech with the F-bomb
The only difference between the two phrases is in the consonants before the second [ʌ] vowel ([sl̥] versus [f]) and, for reasons described above, listeners are likely to mishear such sequences. Grover, in my estimation, is a perfectly well-behaved muppet. Though he should maybe cut down on the coffee consumption.
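For the curious, the comparison of the two transcriptions can be checked mechanically. The sketch below strips the shared beginning and shared ending from the two strings and reports what is left over. The helper name `differing_middle` is made up for this example, and the comparison works on raw Unicode characters, so a diacritic counts as its own character.

```python
TILDE = "\u0303"       # combining tilde: nasalization
RING_BELOW = "\u0325"  # combining ring below: voicelessness

# The two transcriptions from the article, without the trailing "..."
grover = "ðætsʌ" + TILDE + "sl" + RING_BELOW + "ʌkən"
f_bomb = "ðætsʌ" + TILDE + "fʌkən"


def differing_middle(a, b):
    """Return the parts of a and b left after removing the shared prefix and suffix."""
    limit = min(len(a), len(b))
    # Count matching characters from the left.
    start = 0
    while start < limit and a[start] == b[start]:
        start += 1
    # Count matching characters from the right, without overlapping the prefix.
    end = 0
    while end < limit - start and a[-1 - end] == b[-1 - end]:
        end += 1
    return a[start:len(a) - end], b[start:len(b) - end]


only_in_grover, only_in_f_bomb = differing_middle(grover, f_bomb)
print(f"Grover: [{only_in_grover}]  F-bomb: [{only_in_f_bomb}]")
```

Everything else in the two strings is identical, which is why so little separates the innocent phrase from the rude one.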