There has been much discussion in the news media lately about the phenomenon known as "vocal fry" and its use among English-speaking women in the United States. Vocal fry refers to the irregular vibration of one's vocal folds, and it is normally produced with low pitch. In an interview with Terry Gross, Susan Sankin, a speech-language pathologist, stated that vocal fry is harmful to one's vocal folds. In a follow-up piece on NPR on 7/23/15, she maintained this view, stating:
...I have heard ENTs say that it can cause damage. And for a lot of the languages where it's a habitual pattern - as you develop from a young age, that's how you're training and using your vocal cords. And I think when you start to fall into that pattern later on, I think that it can cause some damage. Again, I'm not a doctor, so I can't say that I've looked at people's vocal cords and I've seen it, but I have heard ENTs say that they do notice that it can cause damage. And sometimes the jury is out on that as well.
Just what is behind this notion that vocal fry may be damaging for one's vocal folds? After all, what we're calling "vocal fry" is used in many languages to contrast meaning among words, just like one might contrast the words 'heed' and 'hid' by their vowel sounds. It is also ubiquitous throughout the languages of the world to mark boundaries between phrases. How can something that is so common be considered a vocal pathology?
To answer this question, it's necessary to first make a distinction between speech articulation and speech acoustics. Speech articulation involves what you do in your oral cavity to produce speech sounds. Speech acoustics involves what sounds you hear that convey a linguistic message. Phonetics involves the study of both these things and phoneticians are interested in understanding how certain articulations produce certain acoustic characteristics. One can more easily investigate this relationship for sounds with un-hidden articulations. For instance, the 'p', as in 'pan', is made with the lips. One can see them close when this sound is produced and observe silence in the acoustic signal while one's lips remain closed.
The same thing is not true for the vocal folds though. When it comes to the vocal folds, it's often a rather messy business to investigate what they are actually doing. They're quite small (just about 1 - 2.5 cm in length, depending on one's sex) and taking a video recording of them moving during speech involves inserting a small camera attached to a wire through one's nostrils to hang near the upper portion of one's pharynx (throat) and peer downward. As you might imagine, many people object to having foreign objects inserted into their noses.
One way around this is to just look at the acoustic signal and interpret what the configuration of the vocal folds must be. People don't object nearly as much to being recorded as to having wires inserted into their noses. Moreover, plenty of other articulations have consistent acoustic consequences. For instance, lowering one's tongue and jaw during speech changes the acoustic resonances of the oral cavity in a rather consistent manner. So, the theory goes, one can rely on the acoustics of the speech signal to tell us what the speech articulators are doing. So far, so good.
While this method is fairly robust, there's something problematic about it with the vocal folds. What is called "vocal fry" involves irregular vibration of the vocal folds (see below, taken from a previous post). In the figure here, one notices the irregular vocal fold vibrations on the right. Each glottal pulse is individually stronger (has higher amplitude) but the timing between each is erratic. To quote a well-known linguist, this voice quality sounds like "a stick being dragged along a fence."
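The erratic timing described here can be put into numbers. What follows is a minimal sketch, not a measurement from the figure: it computes a simple "local jitter" value (the average change between consecutive pulse-to-pulse intervals, divided by the average interval) for two sets of pulse times. The function name `local_jitter` and both lists of pulse times are invented for illustration.

```python
from statistics import mean


def local_jitter(pulse_times):
    """Mean absolute difference between consecutive periods, over the mean period."""
    # Period = time between one glottal pulse and the next.
    periods = [b - a for a, b in zip(pulse_times, pulse_times[1:])]
    if len(periods) < 2:
        raise ValueError("need at least three pulses")
    # How much each period differs from the one before it.
    diffs = [abs(q - p) for p, q in zip(periods, periods[1:])]
    return mean(diffs) / mean(periods)


# Hypothetical pulse times in seconds (assumed values, not real recordings).
regular_pulses = [0.000, 0.005, 0.010, 0.015, 0.020, 0.025]
irregular_pulses = [0.000, 0.012, 0.019, 0.037, 0.046, 0.070]

print(local_jitter(regular_pulses))    # close to zero: evenly spaced pulses
print(local_jitter(irregular_pulses))  # much larger: erratic spacing
```

A number like this describes only the acoustic pattern. It says nothing about what the vocal folds were doing to produce it.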
But, to return to our main interest, what is the articulation that gives rise to this acoustic pattern? The term "vocal fry" refers not to the articulatory configuration, but to one's perception of the acoustics. As it turns out, there are many things that can produce the type of vocal fold vibration that we observe above. Much like a wheel that is fastened too tightly, if one constricts the larynx (where the vocal folds sit), it is harder for the vocal folds to vibrate regularly. Since the vibration of the vocal folds requires consistent airflow from the lungs, if one runs out of breath at the end of a sentence, the vocal folds also do not vibrate so regularly.
For people who have developed vocal fold nodules, brought on by laryngeal cancer or other pathologies, the vocal folds also do not vibrate so regularly. Clearly, the same acoustic pattern matches a number of different articulatory configurations. Yet, all of this irregular vibration is described with a cover term, "vocal fry."
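The point that one acoustic pattern matches several articulatory causes can be sketched as a lookup that cannot be reversed. This is only an illustration: the table `CAUSE_TO_PATTERN` and the function `possible_causes` are hypothetical, the three causes of irregular vibration are the ones named above, and the "regular voicing" entry is an assumption added for contrast.

```python
# Hypothetical table: each articulatory cause leads to one acoustic pattern.
CAUSE_TO_PATTERN = {
    "constricted larynx": "irregular vibration",
    "low airflow at the end of a sentence": "irregular vibration",
    "vocal fold pathology": "irregular vibration",
    "regular voicing": "regular vibration",  # assumption, added for contrast
}


def possible_causes(pattern):
    """Return every cause in the table that yields the given acoustic pattern."""
    return sorted(c for c, p in CAUSE_TO_PATTERN.items() if p == pattern)


# Going from cause to pattern gives one answer...
print(CAUSE_TO_PATTERN["vocal fold pathology"])
# ...but going from pattern back to cause gives several candidates.
print(possible_causes("irregular vibration"))
```

Hearing the pattern narrows the cause down to a set of candidates, and pathology is only one member of that set.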
So, if one were to observe vocal fry in different speakers, what could one conclude? While there is independent evidence for the health of speakers in a clinical setting, the notion that vocal fry is pathological is a case of the symptom getting confused with the cause. Since we rely on the acoustic signal to tell us about articulation, we associate the presence of a certain characteristic of the acoustic signal with an articulatory pathology. In other words, vocal fry must be pathological, right? No. In fact, this is a classic logical error (affirming the consequent): a pathology can produce irregular vibration, but irregular vibration does not imply a pathology.
Research on the production of voice quality across languages has shown that speakers use a number of different configurations to constrict the larynx and produce what is known as "vocal fry." Acoustically, and only acoustically, these might appear similar to pathologies that produce irregular vibration of the vocal folds. Yet, the cause of the irregular vibration is different. The articulation of the vocal folds is difficult to examine. So, researchers have assumed aspects of their configuration on the basis of what the acoustic signal says. Yet, this only works insofar as there is not a one-to-many association between the acoustic signal and the articulatory mechanism involved.
The problem is, we do have exactly this kind of relationship when it comes to voice quality: a single acoustic pattern can arise from many different articulations. Thus, one cannot just infer on the basis of one part of the acoustic signal what articulation is involved. Speech-language pathologists, like Susan Sankin, might heed this before they label "vocal fry" as damaging to one's vocal folds. It's not the voice quality that is damaging, but this misunderstanding of cause and effect.
What does this mean for the young women whose vocal fry is singled out as being unhealthy and damaging for their careers? It's the attitudes and knowledge about women's voices that need to change, not the voices themselves.