Sunday, November 28, 2021

On the lexicalization of Triqui compounds

In the process of doing historical reconstruction, one is often led to believe that the conditioning factors leading to sound change are specific to a phonotactic context, e.g. one finds /k/ > [tʃ] / _i, and perhaps only in onsets. Yet there are several variable patterns in Itunyoso Triqui compounds that suggest that stress-induced simplification might also cause unique types of sound changes.

As a bit of background, it is important to know that Itunyoso Triqui words are mostly polysyllabic. About 70% of the lexicon consists of disyllabic or trisyllabic roots, though monosyllabic roots have higher token frequency in running speech (as per Zipf's law). The final syllable of these morphemes has special status. It is phonetically longer than non-final syllables and most of the contrasts occur on the final syllable (cf. DiCanio 2010).

What occurs in the final syllable in a polysyllabic word?
a. Every possible tone: /1, 2, 3, 4, (4)5, 13, 32, 43, 31/.
b. All consonants: /p, t, k, kʷ, tʃ, ʈʂ, ʔ, m, n, ⁿd, ᵑɡ, ᵑɡʷ, ɾ, β, s, l, j, ˀm, ˀn, ˀⁿd, ˀᵑɡ, ˀɾ, ˀβ, ˀl, ˀj/.
c. All vowels: /i, e, a, o, u, ĩ, ã, ũ/
d. Coda consonants /ʔ, ɦ/ (though all syllables are otherwise open).

What occurs in the non-final syllable of a polysyllabic word?
a. Only level tones /1, 2, 3, 4/, but the caveat is that tones /1/ and /4/ are not truly contrastive here - they only occur due to leftward tonal spreading onto the non-final syllable (cf. DiCanio, Martínez Cruz, and Martínez Cruz 2020). So, really it's just tone /2/ and tone /3/ that contrast here.
b. Only simple consonants (no prenasalized stops, no glottalized sonorants, no glottal stop): /p, t, k, kʷ, tʃ, ʈʂ, m, n, ɾ, β, s, l, j/.
c. Only oral vowels /i, e, a, o, u/ and mid vowels only occur if they also occur in the final syllable. So, really just /i, a, u/ are contrastive here.
d. All syllables are open.

So, we have many asymmetries in which sounds occur by syllable. We can call this stress or prominence or whatever term you wish, but the patterns above occur mostly without exception.

There is an additional observation too - a contrast between singletons and geminates only occurs in monosyllabic words, e.g. ta³ 'this' vs. tta³ 'field', nũ³² 'be inside' vs. nnũ³² 'epazote.' This contrast does not occur in polysyllabic words (cf. DiCanio 2010, 2012).

Now that we know about the stress-based consonant patterns, what does this mean for sound change? Consider that one very common type of word formation process in Triqui (and in Otomanguean languages more generally) is compounding. When each morpheme of a compound retains some of its phonological identity as a distinct root, there may be no sound changes. Yet, if the compound begins to lexicalize, the restrictions on phonological distributions above start to cause rather robust changes. Let's look at some examples.

1. The Triqui word 'de veras/truly' is a reduplicated form yya¹³ yya¹³, literally meaning 'true true.' Most adverbs in the language appear post-verbally before personal clitics (V+ADV+SUBJ order), so clitic morphophonology applies to them. The 1P clitic involves a > o, glottal stop insertion, and tone 4. Yet, with this word you get yyo¹³ yyoʔ⁴, with vowel harmony. Then with lexicalization, you can't get a contour tone on a non-final syllable and no geminates are permitted in polysyllabic words, so it's yo³yoʔ⁴.

2. The Triqui word 'each' is a reduplicated compound ᵑɡo² ˀᵑɡo² 'one-one.' Yet, it is often pronounced as [ko²ˀᵑɡo²] in running speech. You lose the prenasalized stop in the penultimate syllable as per the patterns above.

3. The Triqui word 'soda/soft drink' is a compound nne³² tsiʔ¹ 'water + sweet.' Yet, it is often pronounced as [ne³siʔ¹]. You lose the contour tone and the gemination on the penultimate syllable because neither is permitted there.

4. The Triqui word for 'bread' is a historical compound /ʈʂːa³ ʈʂũɦ⁵/, lit. 'tortilla + oven' (Spanish tortilla del horno). It is pronounced as [ʈʂa³ʈʂũɦ⁵] by older speakers but as [tʃa³tʃũɦ⁵] by younger speakers (who have mostly merged the retroflex and post-alveolar affricates). The historical gemination of 'tortilla' has been lost here.

5. The Triqui word for 'rifle' is [ʈʂu³ʈʂi³aʔ³], but the roots are ʈʂːũ³ 'wood' + ʈʂi³aʔ³ 'to shoot.' In the compound, we observe degemination (because the root is now part of a polysyllabic word) and loss of the vowel nasalization too. And as mentioned above, many speakers now produce the retroflex series as post-alveolar.
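
For readers who like to see such restrictions spelled out mechanically, here is a minimal sketch in Python that checks a syllable against the non-final restrictions listed earlier and reports which ones it violates. The Syllable class, its field names, and the function nonfinal_violations are hypothetical names made up for this illustration. The sketch leaves out the mid-vowel restriction, and it only flags violations; it makes no claim about which repair speakers actually choose (for instance, which level tone replaces a contour).

```python
from dataclasses import dataclass

# Inventories copied from the non-final syllable lists above.
NONFINAL_ONSETS = {"p", "t", "k", "kʷ", "tʃ", "ʈʂ", "m", "n", "ɾ", "β", "s", "l", "j"}
ORAL_VOWELS = {"i", "e", "a", "o", "u"}
LEVEL_TONES = {"1", "2", "3", "4"}


@dataclass
class Syllable:
    """Hypothetical representation of one syllable (illustration only)."""
    onset: str
    vowel: str
    tone: str               # tone as a digit string, e.g. "3" or "32"
    coda: str = ""          # "ʔ", "ɦ", or empty for an open syllable
    geminate: bool = False  # True if the onset is a geminate


def nonfinal_violations(syl):
    """List the non-final restrictions that this syllable violates."""
    problems = []
    if syl.geminate:
        problems.append("geminate onset")
    if syl.onset not in NONFINAL_ONSETS:
        problems.append("complex or glottal onset")
    if syl.vowel not in ORAL_VOWELS:
        problems.append("nasal vowel")
    if syl.tone not in LEVEL_TONES:
        problems.append("contour tone")
    if syl.coda:
        problems.append("coda consonant")
    return problems


# First members of the compounds in examples 2, 3, and 5.
print(nonfinal_violations(Syllable("ᵑɡ", "o", "2")))                 # 'one'
print(nonfinal_violations(Syllable("n", "e", "32", geminate=True)))  # 'water'
print(nonfinal_violations(Syllable("ʈʂ", "ũ", "3", geminate=True)))  # 'wood'
```

An empty list means the syllable is already a licit non-final syllable, as with the reduced [ne³] in 'soda.'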

I am mentioning these examples here because, as per Rensch (1976), it is extremely difficult to reconstruct non-final syllables in many Otomanguean languages. It may be that (a) processes of reduction in unstressed syllables and (b) a general pattern of distributional asymmetries in the phonological inventories will help to reconstruct them. The [k] you observe that comes from a reduced [ᵑɡ] (as in #2 above) might only occur in a handful of words because reduplicated compounds are relatively uncommon in Otomanguean languages.

In sum, neutralization due to stress-based distributional asymmetries can lead to superficial similarities between words, e.g. the /n/ onset in #3 'soda' is from */nn/, while a different word like /ne³tã³/ 'ejote/green bean' is probably related to Mixtec words like /ñityì/ (SJC Mixtec), where the onset /n/ reflects */ny/.

Saturday, April 17, 2021

Linguistic tidbit

Some linguists obsessed with a theory of all
forget there are others who need to think small,
of how to inflect a verb that's perfective
or reasons why 'so' isn't just a connective.

And others might glean an elaborate fact
from language in use as a societal act
with agents whose motives are far from mundane
but an essence of self quite hard to contain.

There's meaning and purpose in digging quite deep
at cognates in history whose meaning we keep,
And time to get lost in the tangle of weeds,
a morphological context and the pattern it feeds.

And many a language, pattern, and word
hold secrets and histories that we've never heard
Of just how a people connect with the past
or just how a pattern changes so fast.

So before you admonish the detail-obsessed
those whose minutiae is seldomly blessed
with an appearance in Nature or Science and so
appears to be findings you don't need to know.

An ego obese with a theory so tangled
Can deflate in an instant when new data is wrangled.
Consider that details, however so small
are the basis of asking the biggest questions of all.

Saturday, January 2, 2021

What does not work for sentence elicitation with Triqui speakers

One part of doing fieldwork is discovering just what does not work while you're in the field. Several summers ago, after receiving some critical methodological remarks from a reviewer on a submission of mine, I started to seriously question just what works in my fieldwork.

We're all addicted to our past methods and sometimes we need a jolt to reconsider what we're doing in the field. I have a tendency to rely a lot on repetition among speakers because there is no literacy among most speakers in Triqui. There are three options for elicitation here, as it happens. One possibility is to just ask speakers to provide a translation of a Spanish sentence, another is to have them see some image and describe it, and another is to have them repeat after another speaker who can read Triqui (my main consultants).

I rely a lot on the third method, but it's possible that Triqui speakers will overly mimic what the other speaker is doing when they are doing this. (There is a serious question as to what they would mimic - there is no non-tonal prosody in the language, but perhaps speech rate and optional pauses?) So, this logically leads reviewers and other linguists to suggest the first two options above. We can toss out the second option for anything that involves more carefully-controlled speech. If you want to use identical nouns but change the verbs, for instance, this simply leaves way too much open to interpretation. Speakers will never provide the target sentence.

But what about the first option? This also often fails for various reasons. I'm in the process of looking at a large data set examining tonal changes with person morphology in Triqui across 11 speakers. We tried the translation method, but it regularly fails with speakers. Here's a transcription of one exchange:

Consultant: Cantaste una canción (You sang a song.)
Speaker: Ka³ra⁴³ ngo² chah³  (I sang a song.)
Consultant: Ka³raj⁵ ngo² chah³ (You sang a song.)
Speaker repeats consultant

Many fieldworkers might laugh at the exchange above - asking people to get personal pronouns correct in translation is a common issue. But if you're looking at how words change tone with personal pronouns, then it's important to get right.

There is an added issue though - we often assume that we can examine speech in translation because we assume strong bilingualism or a clear 1:1 mapping between words in a lingua franca and words in a language we're investigating in a field context. Sometimes neither can be found. In the same recording, we observe the speaker becoming confused when he has to distinguish between lavas 'you are washing' and lavaste 'you washed' in Triqui. 

Consultant: Lavas la ropa. (You wash the clothing.)
Speaker: nan...[s]... (long pause)
Me: Nanj⁵ reh¹...
Speaker: Nan⁴³ (I wash)....(pause)
Consultant: Nanj⁵ reh¹ a⁴sij⁴ (You wash the clothing.)
Speaker repeats consultant

In this exchange, the speaker is caught off guard because he is either uncertain about the aspect marking of 'wash' (as the previous exchanges involved him producing it with the perfective prefix - ki³nanj⁵) or he is confused about the pronominal referents again. The result is the same though - the speaker ends up relying on repetition from another speaker/consultant.

If you have to rely on repetition, perhaps a way around it is to have speakers count between hearing a sentence and repeating it. If the concern over repetition in elicited speech contexts in the field is that speakers are likely to mimic, then counting before repeating might resolve this. The idea here is that counting takes time and auditory memory decays quickly. So, if speakers have to say "one, two, three" (or ngoj¹³ bbi¹³ ba¹hnin³ in Triqui), then their reproductions of the target sentences might more closely resemble long-term memory representations for the words in the short sentences. I owe this idea to Lisa Davidson (via one of our interesting Facebook/Twitter discussions).

But in practice this only kinda ends up working. Speakers can do this, but they end up sometimes forgetting the target sentence. So, you get exchanges like the following:

Consultant: Ka³ne³ ni²hrua⁴¹ reh¹ chu⁴ba⁴³ beh³ (Te sentaste mucho en la casa. 'You sat in the house a lot.')
Speaker: ngoj¹³ bbi¹³ ba¹hnin³... ka³ne³..... ka³ne³...
Consultant: Ka³ne³ ni²hrua⁴¹ reh¹ chu⁴ba⁴³ beh³ 
Speaker repeats consultant

In effect, it is hard to pay attention to reproducing specific sentences when you have to produce numbers first. So, the end result is to just repeat what the consultant has said. When you add the additional stress of being recorded to this (many speakers become nervous knowing they are recorded), this can produce pauses/errors in the elicitation.

So, what is the way around all of this? One thing we might address head on is the assumption of mimicry. We seem to believe that all speakers/participants, when asked to repeat words, will focus on the specific phonetic characteristics of the signal they heard instead of the content. The jury is still out on this, though. I have found two papers that have addressed the question - Cole and Shattuck-Hufnagel (2011) and D'Imperio, Cavone, and Petrone (2014). In both cases, speakers were told to explicitly imitate the form of the speech signal and they mostly imitated pitch accents, but not F0 level. In a language where only F0 level is adjustable (lexical tone is fixed), what predictions does this previous work make for Itunyoso Triqui? I'm testing this with a study I ran in 2019. There is no work on what speakers of tone languages do in such tasks (and we have no idea about what happens when the concern is just getting the words right - not trying to imitate fine phonetic detail).

I wish I could find an ideal way to do careful elicitation that was immune to these concerns. In the meantime though, prosody-folks might consider a warning mentioned in DiCanio, Benn, and Castillo García (2020) - no method for the elicitation of prosody is immune to stylistic effects. Read speech is just as much a speech style as repeated speech and most languages have no writing system or literacy (Harrison 2007). That means that this methodological concern must be addressed as we look at prosodic systems across more of the world's languages.

References:
Cole, J. and Shattuck-Hufnagel, S. (2011). The phonology and phonetics of perceived prosody: What do listeners imitate? In Proceedings from Interspeech 2011, pages 969–972. ISCA.

D’Imperio, M., Cavone, R., and Petrone, C. (2014). Phonetic and phonological imitation of intonation in two varieties of Italian. Frontiers in Psychology, 5(1226):1–10.

DiCanio, C., Benn, J., and Castillo García, R. (2020). Disentangling the effects of position and utterance-level declination on the production of complex tones in Yoloxóchitl Mixtec. Language and Speech, Onlinefirst (https://journals.sagepub.com/doi/10.1177/0023830920939132):1–43.

Harrison, K. D. (2007). When languages die. Oxford University Press.