Tuesday, July 28, 2015

The hard business of trying to specify allomorphs in FLEx

While a substantial part of my research is on the phonetics and phonology of different Otomanguean languages, I have been working on the morphophonology of the Itunyoso Triqui language for many years. Ever since I first started my work on the language, I was fascinated by the many ways in which a single verb root, for instance, could have a multitude of forms when one includes aspectual prefixes and personal enclitics.

One of the most notable things about Triqui morphology is just how much tone plays a role in marking different distinctions. Take the verb /a³chi³/ 'to peel', for example. There are five possible tonal shapes of stems, shown below (note "j" is /h/, "h" is /ʔ/, and a post-vocalic "n" in the final syllable marks contrastive vowel nasality):

Table 1: Stem shapes of verb /a³chi³/ 'to peel.'
This particular paradigm displays some common patterns in Triqui morphology. First, the 1st person singular is marked by a change in tone (to /5/) and involves the insertion of a coda "j" /h/. Second, the 2nd person singular is marked by tone raising to /4/ before the clitic. Third, the perfective prefix on vowel-initial stems is just /k-/. Fourth, the potential prefix involves prefixation of /k-/ and a change of tone on the initial syllable of the root. 

The result of these processes is five possible stem shapes: /a³chi³, a³chij⁵, a³chi⁴, a²chij⁵, a²chi³/, marked in bold above. Each of these morphological processes can be described well enough. However, things start to get rather messy when we wish to include additional verbs. Note the verb /a³chinj⁵/ 'to request' below.

Table 2: Stem shapes of verb /a³chinj⁵/ 'to request.'
We notice different patterns here. Instead of inserting a coda "j" /h/ to mark first person, we delete it from the root and change tone /5/ to /43/. Since the verb stem already has a high final stem tone, we do not observe any tone raising before the 2S clitic /=reh¹/. However, the form of the potential is rather different. Like in the habitual or unmarked form of the verb, we find that the coda "j" /h/ is deleted, but the entire stem changes its tone to /2/. This change is not particular to the 1S either - it occurs with all other persons in the potential, as the example with the 3SM clitic demonstrates. As a result of these processes, we have four possible stem shapes for the verb in Table 2: /a³chinj⁵, a³chin⁴³, a²chin², a²chinj²/.

I won't begin to provide a full analysis of the tonal morphology in Triqui here (but see DiCanio, forthcoming). Rather, I wish to focus on two particular patterns and to discuss how they might be analyzed from a practical point of view. The first pattern is the marking of the 1S. This involves either the insertion of a coda "j" if it is not present on the stem or its deletion if it is present. Such a process is called a morphological reversal or exchange rule (see Inkelas, 2014). Tonal changes co-occur with this process for verbs with upper register tones (DiCanio, forthcoming), but we will not focus on these here.

The second pattern involves the way in which the potential aspect is marked. For certain verbs, it is marked by a change to tone /2/ on the syllable to which the prefix is attached, as in Table 1. On other verbs, it is marked by a change to tone /2/ on every syllable of the stem, as in Table 2. In such cases, the 1st person clitic no longer involves a tone change since the tone on the stem is now /2/, which belongs to the lower register. (Incidentally, one might describe this as a case of morphological opacity, where stage 1 prefixal/aspectual morphology bleeds the conditions for the application of clitic tone raising.)

At least segmentally, the 1S clitic is easy enough to characterize, though how might one go about marking such forms in a digital lexicon/dictionary like FLEx? One procedure might be to mark each and every 1S form, e.g. include /a³chij⁵/ 'peel.1S' as a variant of /a³chi³/ 'peel.' While certain of the morphological patterns are motivated by phonological well-formedness constraints (DiCanio, forthcoming), listing the variants in a table or paradigm as above provides a useful framework for describing the morphological patterns within the Triqui lexicon. 

This "listing" approach is the one that I currently use. However, doing this is rather time-consuming, as all words in the Triqui lexicon undergo this very regular alternation (though the tonal processes are rather complex). Doing this also loses the broader generalization of the rule. Moreover, there is currently no neat way of including paradigms within FLEx; one must specify additional forms as variants or allomorphs derived via a rule.

Another approach might be to create a phonological rule within FLEx's phonological grammar. However, the only available way to encode such rules is via a classical rewrite rule. This would produce rules of the form: Vh > V /_# ; and V > Vh /_#. Yet, there is no way to connect this particular rule with the set of morphological processes that it affects. It is an alternation that is primarily used for marking the 1st person singular (though similar alternations also mark previously-mentioned 3rd person discourse referents and derive nominal forms from quantifiers).
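To make the segmental half of this alternation concrete, here is a minimal sketch in Python of the two rewrite rules collapsed into a single toggle. It is only an illustration, not anything FLEx provides: the function name toggle_coda_j and the regular expression are my own assumptions, and the sketch deliberately ignores the tonal changes, so its output differs in tone from the attested 1S forms /a³chij⁵/ and /a³chin⁴³/.

```python
import re

# Superscript digits used to write tone in the practical orthography above.
TONES = "¹²³⁴⁵"

# Everything before an optional word-final coda "j", the coda, the final tone.
_FINAL = re.compile(rf"^(.*?)(j?)([{TONES}]+)$")


def toggle_coda_j(stem: str) -> str:
    """Insert a coda "j" if the stem lacks one; delete it if it is present.

    Hypothetical helper covering only the segmental part of 1S marking.
    Tone is copied through unchanged.
    """
    match = _FINAL.match(stem)
    if match is None:
        raise ValueError(f"no final tone found in {stem!r}")
    body, coda, tone = match.groups()
    return body + ("" if coda else "j") + tone


print(toggle_coda_j("a³chi³"))    # coda inserted
print(toggle_coda_j("a³chinj⁵"))  # coda deleted
```

Note that nothing in such a function says which morphological category triggers it, which is exactly the gap described above.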


The same possibilities seem to be relevant for the potential aspect marking. It is either specified in a paradigm or it can be derived via a rule. However, a new problem presents itself when one considers the latter possibility. For those verbs, as in Table 2, which undergo an entire stem change to tone /2/ with the potential aspect, what is the phonological environment for a rewrite rule? It is the entire word's tonal melody. FLEx currently provides no way of separating the stem's tonal shape from the stem itself as one might do with an autosegmental representation. Thus, FLEx is unable to make sense of a string like /ka²chin²/ 'request.POT.1S.' when it comes to morphological parsing.
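What an autosegmental-style representation would buy us can be sketched in a few lines of Python. The code below splits a stem into a segmental tier and a tonal tier and then applies the potential either to the first syllable only (as in Table 1) or to the whole melody (as in Table 2). The names split_tiers and potential, and the whole_stem flag standing in for a lexical class, are hypothetical; this is not a description of FLEx's parser, and it assumes vowel-initial stems that take /k-/.

```python
import re

# Superscript digits used to write tone in the practical orthography above.
TONES = "¹²³⁴⁵"

# One syllable = a run of segments followed by a run of tone marks.
_SYLLABLE = re.compile(rf"([^{TONES}]+)([{TONES}]+)")


def split_tiers(stem):
    """Return (segments, tones), one entry per syllable."""
    pairs = _SYLLABLE.findall(stem)
    segments = [seg for seg, _ in pairs]
    tones = [tone for _, tone in pairs]
    return segments, tones


def potential(stem, whole_stem):
    """Prefix k- and change tone to /2/ on the first syllable or on all syllables.

    whole_stem is an assumed lexical flag: False for verbs like 'to peel',
    True for verbs like 'to request'.
    """
    segments, tones = split_tiers(stem)
    if whole_stem:
        tones = ["²"] * len(tones)
    else:
        tones = ["²"] + tones[1:]
    return "k" + "".join(seg + tone for seg, tone in zip(segments, tones))


print(potential("a³chi³", whole_stem=False))
print(potential("a³chinj⁵", whole_stem=True))
```

Because the tones live on their own tier here, "replace the whole melody" is a one-line operation; a string rewrite rule has no comparable environment to refer to.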

This problem is compounded by the nature of Triqui morphology when one considers the interaction between the potential aspect and 1S marking mentioned above. If there is a specific set of rewrite rules for the 1S clitic, one must specify that the tonal part of the alternation does not apply if the stem has undergone a change to the potential aspect. I currently know of no solution as to how one might resolve these issues within a FLEx lexicon.

References:

DiCanio, C. (forthcoming) Tonal classes in Itunyoso Triqui person morphology, in Tone and Inflection, Empirical Approaches to Language Typology series, Mouton de Gruyter, Palancar, Enrique and Léonard, Jean-Léo (eds).

Inkelas, Sharon (2014) The interplay of Morphology and Phonology. Oxford Surveys in Syntax and Morphology. Oxford, UK.

Friday, July 24, 2015

The healthy and unhealthy vocal fries

There has been much discussion in the news media lately about the phenomenon known as "vocal fry" and its use among English-speaking women in the United States. Vocal fry refers to the irregular vibration of one's vocal folds and it is normally produced with low pitch. In an interview with Terry Gross, Susan Sankin, a speech-language pathologist, stated that vocal fry is harmful to one's vocal folds. In a follow-up piece on 7/23/15 on NPR, she maintains this view, stating:
...I have heard ENTs say that it can cause damage. And for a lot of the languages where it's a habitual pattern - as you develop from a young age, that's how you're training and using your vocal cords. And I think when you start to fall into that pattern later on, I think that it can cause some damage. Again, I'm not a doctor, so I can't say that I've looked at people's vocal cords and I've seen it, but I have heard ENTs say that they do notice that it can cause damage. And sometimes the jury is out on that as well.
Just what is behind this notion that vocal fry may be damaging for one's vocal folds? After all, what we're calling "vocal fry" is used in many languages to contrast meaning among words, just like one might contrast the words 'heed' and 'hid' by their vowel sounds. It is also ubiquitous throughout the languages of the world to mark boundaries between phrases. How can something that is so common be considered a vocal pathology?
To answer this question, it's necessary to first make a distinction between speech articulation and speech acoustics. Speech articulation involves what you do in your oral cavity to produce speech sounds. Speech acoustics involves what sounds you hear that convey a linguistic message. Phonetics involves the study of both these things and phoneticians are interested in understanding how certain articulations produce certain acoustic characteristics. One can more easily investigate this relationship for sounds with un-hidden articulations. For instance, the 'p', as in 'pan', is made with the lips. One can see them close when this sound is produced and observe silence in the acoustic signal while one's lips remain closed. 
The same thing is not true for the vocal folds though. When it comes to the vocal folds, it's often a rather messy business to investigate what they are actually doing. They're quite small (just about 1 - 2.5 cm in length, depending on one's sex) and taking a video recording of them moving during speech involves inserting a small camera attached to a wire through one's nostrils to hang near the upper portion of one's pharynx (throat) and peer downward. As you might imagine, many people object to having foreign objects inserted into their noses.
One way around this is to just look at the acoustic signal and interpret what the configuration of the vocal folds must be. People don't object nearly as much to being recorded as to having wires inserted into their noses. Moreover, plenty of other articulations have consistent acoustic consequences. For instance, lowering one's tongue and jaw during speech changes the acoustic resonances of the oral cavity in a rather consistent manner. So, the theory goes, one can rely on the acoustics of the speech signal to tell us what the speech articulators are doing. So far, so good.
While this method is fairly robust, there's something problematic about it with the vocal folds. What is called "vocal fry" involves irregular vibration of the vocal folds (see below, taken from a previous post). In the figure here, one notices the irregular vocal fold vibrations on the right. Each glottal pulse is individually stronger (has higher amplitude) but the timing between each is erratic. To quote a well-known linguist, this voice quality sounds like "a stick being dragged along a fence."

But, to return to our main interest, what is the articulation that gives rise to this acoustic pattern? The term "vocal fry" refers not to the articulatory configuration, but to one's perception of the acoustics. As it turns out, there are many things that can produce the type of vocal fold vibration that we observe above. Much like a wheel that is fastened too tightly, if one constricts the larynx (where the vocal folds sit), it is harder for the vocal folds to vibrate regularly. Since the vibration of the vocal folds requires consistent airflow from the lungs, if one runs out of breath at the end of a sentence, the vocal folds also do not vibrate so regularly.
For people who have developed vocal fold nodules, brought on by laryngeal cancer or other pathologies, the vocal folds also do not vibrate so regularly. Clearly, the same acoustic pattern matches a number of different articulatory configurations. Yet, all of this irregular vibration is described with a cover term, "vocal fry."
So, if one were to observe vocal fry in different speakers, what could one conclude? While there is independent evidence for the health of speakers in a clinical setting, the notion that vocal fry is pathological is a case of the symptom getting confused with the cause. Since we rely on the acoustic signal to tell us about articulation, we associate the presence of a certain characteristic of the acoustic signal with an articulatory pathology. In other words, vocal fry must be pathological, right? No, in fact this is a classical logical error (affirming the consequent).
Research on the production of voice quality across languages has shown that speakers use a number of different configurations to constrict the larynx and produce what is known as "vocal fry." Acoustically, and only acoustically, these might appear similar to pathologies that produce irregular vibration of the vocal folds. Yet, the cause of the irregular vibration is different. The articulation of the vocal folds is difficult to examine. So, researchers have assumed aspects of their configuration on the basis of what the acoustic signal says. Yet, this only works insofar as there is not a one-to-many association between the acoustic signal and the articulatory mechanism involved. 
The problem is, we do have exactly such a one-to-many relationship when it comes to voice quality: a single acoustic pattern can arise from many different articulations. Thus, one cannot just infer on the basis of one part of the acoustic signal what articulation is involved. Speech-language pathologists, like Susan Sankin, might heed this before they label "vocal fry" as damaging to one's vocal folds. It's not the voice quality that is damaging, but this misunderstanding of cause and effect.
What does this mean for the young women whose vocal fry is singled out as being unhealthy and damaging for their careers? It's the attitudes and knowledge about women's voices that need to change, not the voices themselves.

Monday, July 6, 2015

Being cooperative is not evidence of confirmation bias

A few days ago, the New York Times posted a piece which argued that confirmation bias is a common failure of human thinking. Confirmation bias is the idea that one tends to interpret new facts in terms of one's existing preconceptions.

The author of the piece, David Leonhardt, discusses confirmation bias by way of a mathematical example where the reader is asked to guess the rule determining the sequence "2, 4, 8" by testing additional examples. Thus, one can type in sequences like "4, 8, 16" or "10, 95, 387" and see if they follow the same rule as the sequence "2, 4, 8." If one enters a sequence like "4, 8, 16" into the boxes in Leonhardt's article, one receives a confirmation that it also follows the same rule as that which produced "2, 4, 8."

So, just what is this rule? Leonhardt states:

"...most people start off with the incorrect assumption that if we’re asking them to solve a problem, it must be a somewhat tricky problem. They come up with a theory for what the answer is, like: Each number is double the previous number."

The true rule, Leonhardt explains, is not that each number is double the previous number, but rather that each subsequent value is greater than the preceding value. That people assume the former rule is taken as evidence for confirmation bias. As stated, "Not only are people more likely to believe information that fits their pre-existing beliefs, but they’re also more likely to go looking for such information."

However, it strikes me that there are other, rather sensible reasons that people will assume the former rule that Leonhardt does not consider. One is found among the well-known maxims of conversation, created by the famous philosopher of language, Paul Grice. These maxims, familiar to any introductory linguistics student, state that conversation is guided by constraints of quantity, quality, relation, and manner. As a default, we assume that speakers will give only enough information, be truthful with it, be relevant to the topic, and be clear, respectively. When speakers deviate from these expectations, we are annoyed with the conversation. In such cases, we might state "He was long-winded" or "He kept going off on tangents." Our ability to follow these maxims demonstrates our cooperation within a conversation. Hence, they fall under what Grice terms the cooperative principle.

Grice's maxim of quantity states that one should not make his/her contribution more informative than required. Thus, when someone asks for directions to a particular room in a building, one does not expect the speaker to provide instructions on how to open a door nor the history of certain rooms that the listener will likely pass. If additional details are provided, our interpretation is that they must somehow be relevant (incidentally, another maxim). So, just what might Grice have to do with Leonhardt's example here?

Consider the initial example that he provides: 2, 4, 8. The reader's expectation from this example is that it is as informative as necessary. If the author chose a sequence where each subsequent value is double the previous value, then this must be relevant to the question. After all, the expectation is that the author has provided this information and it must be important. When we hear that the rule is, "Haha!", not what we assumed, our reaction is one of surprise. Why provide this particular example if any random sequence, like 1, 5/3, 9, would have sufficed?

Providing too much information in this way would seem to be a case of conversational deception. The listeners/readers are led astray, believing that the author was following the maxims of quantity and relation when, in fact, the example was intended to be overly informative. So, are 78% of those who participate in this particular task guilty of confirmation bias? Perhaps some are, but Leonhardt would be wise to consider that most people are guided not by the expectation that the problem is tricky, but rather by the expectation that the author's example does not provide too much information. An entirely different outcome would be produced if the example were as simple as "1, 2, 3."

*Incidentally, the rule "Each number is double the previous number" is just a more specific case of the rule "Each number is greater than the previous number." For positive numbers, the first entails the second.
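The relationship between the two rules can be checked with a short Python sketch. The function names doubles and increases are my own, and the test sequences are the ones mentioned in this post. Note that the entailment is only claimed here for positive numbers: a sequence like -1, -2, -4 doubles at each step but decreases.

```python
from fractions import Fraction


def doubles(seq):
    """True if each number is double the previous number."""
    return all(b == 2 * a for a, b in zip(seq, seq[1:]))


def increases(seq):
    """True if each number is greater than the previous number."""
    return all(b > a for a, b in zip(seq, seq[1:]))


# Sequences discussed in the post; 5/3 is kept exact with Fraction.
examples = [
    (2, 4, 8),
    (4, 8, 16),
    (10, 95, 387),
    (1, Fraction(5, 3), 9),
    (1, 2, 3),
]

for seq in examples:
    print(seq, "doubles:", doubles(seq), "increases:", increases(seq))
```

Every example that satisfies the doubling rule also satisfies the increasing rule, but not the other way around, which is why "2, 4, 8" says more than the true rule requires.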