Linguistic tidbit

Some linguists obsessed with a theory of all
forget there are others who need to think small,
of how to inflect a verb that's perfective
or reasons why 'so' isn't just a connective.

And others might glean an elaborate fact
from language in use as a societal act
with agents whose motives are far from mundane
but an essence of self quite hard to contain.

There's meaning and purpose in digging quite deep
at cognates in history whose meaning we keep,
And time to get lost in the tangle of weeds,
a morphological context and the pattern it feeds.

And many a language, pattern, and word
hold secrets and histories that we've never heard
Of just how a people connect with the past
or just how a pattern changes so fast.

So before you admonish the detail-obsessed
those whose minutiae is seldomly blessed
with an appearance in Nature or Science and so
appears to be findings you don't need to know.

An ego obese with a theory so tangled
Can deflate in an instant when new data is wrangled.
Consider that details, however so small
are the basis of asking the biggest questions of all.

What does not work for sentence elicitation with Triqui speakers

One part of doing fieldwork is discovering just what does not work while you're in the field. Several summers ago, after receiving some critical methodological remarks from a reviewer on a submission of mine, I started to seriously question just what works in my fieldwork.

We're all addicted to our past methods and sometimes we need a jolt to reconsider what we're doing in the field. I have a tendency to rely a lot on repetition among speakers because there is no literacy among most speakers in Triqui. There are three options for elicitation here, as it happens. One possibility is to just ask speakers to provide a translation of a Spanish sentence, another is to have them see some image and describe it, and another is to have them repeat after another speaker who can read Triqui (my main consultants).

I rely a lot on the third method, but it's possible that Triqui speakers will overly mimic what the other speaker is doing when they repeat. (There is a serious question as to what they would mimic - there is no non-tonal prosody in the language, but perhaps speech rate and optional pauses?) So, this logically leads reviewers and other linguists to suggest the first two options above. We can toss out the second option for anything that involves more carefully-controlled speech. If you want to use identical nouns but change the verbs, for instance, image description simply leaves way too much open to interpretation. Speakers will never provide the target sentence.

But what about the first option? This also often fails for various reasons. I'm in the process of looking at a large data set examining tonal changes with person morphology in Triqui across 11 speakers. We tried the translation method, but it regularly fails with speakers. Here's a transcription of one exchange:

Consultant: Cantaste una canción (You sang a song.)
Speaker: Ka³ra⁴³ ngo² chah³  (I sang a song.)
Consultant: Ka³raj⁵ ngo² chah³ (You sang a song.)
Speaker repeats consultant

Many fieldworkers might laugh at the exchange above - asking people to get personal pronouns correct in translation is a common issue. But if you're looking at how words change tone with personal pronouns, then it's important to get right.

There is an added issue though - we often assume that we can examine speech in translation because we assume strong bilingualism or a clear 1:1 mapping between words in a lingua franca and words in a language we're investigating in a field context. Sometimes neither can be found. In the same recording, we observe the speaker becoming confused when he has to distinguish between lavas 'you are washing' and lavaste 'you washed' in Triqui. 

Consultant: Lavas la ropa. (You wash the clothing.)
Speaker: nan...[s]... (long pause)
Me: Nanj⁵ reh¹...
Speaker: Nan⁴³ (I wash)....(pause)
Consultant: Nanj⁵ reh¹ a⁴sij⁴ (You wash the clothing.)
Speaker repeats consultant

In this exchange, the speaker is caught off guard because he is either uncertain about the aspect marking of 'wash' (as the previous exchanges involved him producing it with the perfective prefix - ki³nanj⁵) or he is confused about the pronominal referents again. The result is the same though - the speaker ends up relying on repetition from another speaker/consultant.

If you have to rely on repetition, perhaps a way around it is to have speakers count between hearing a sentence and repeating it. If the concern over repetition in elicited speech contexts in the field is that speakers are likely to mimic, then counting before repeating might resolve this. The idea here is that counting takes time and auditory memory decays quickly. So, if speakers have to say "one, two, three" (or ngoj¹³ bbi¹³ ba¹hnin³ in Triqui), then their reproductions of the target sentences might more closely resemble long-term memory representations for the words in the short sentences. I owe this idea to Lisa Davidson (via one of our interesting Facebook/Twitter discussions).

But in practice this only kinda ends up working. Speakers can do this, but they end up sometimes forgetting the target sentence. So, you get exchanges like the following:

Consultant: Ka³ne³ ni²hrua⁴¹ reh¹ chu⁴ba⁴³ beh³ (You sat in the house a lot.)
Speaker: ngoj¹³ bbi¹³ ba¹hnin³... ka³ne³..... ka³ne³...
Consultant: Ka³ne³ ni²hrua⁴¹ reh¹ chu⁴ba⁴³ beh³ 
Speaker repeats consultant

In effect, it is hard to pay attention to reproducing specific sentences when you have to produce numbers first. So, the end result is to just repeat what the consultant has said. When you add the additional stress of being recorded to this (many speakers become nervous knowing they are recorded), this can produce pauses/errors in the elicitation.

So, what is the way around all of this? One thing we might address head on is the assumption of mimicry. We seem to believe that all speakers/participants, when asked to repeat words, will focus on the specific phonetic characteristics of the signal they heard instead of the content. The jury on this is still out, though. I have found two papers that have addressed the question - Cole and Shattuck-Hufnagel (2011) and D'Imperio, Cavone, and Petrone (2014). In both cases, speakers were told to explicitly imitate the form of the speech signal and they mostly imitated pitch accents, but not F0 level. In a language where only level is adjustable (lexical tone is fixed), what predictions does this previous work make for Itunyoso Triqui? I'm testing this with a study I ran in 2019. There is no work on what tone language speakers do in such tasks (and we have no idea what happens when the concern is just getting the words right - not trying to imitate fine phonetic detail).

I wish I could find an ideal way to do careful elicitation that was immune to these concerns. In the meantime though, prosody folks might consider a warning mentioned in DiCanio, Benn, and Castillo García (2020) - no method for the elicitation of prosody is immune to stylistic effects. Read speech is just as much a speech style as repeated speech and most languages have no writing system or literacy (Harrison 2007). That means that this methodological concern must be addressed as we look at prosodic systems across more of the world's languages.

Some connections between Triqui and Amuzgo roots

The week between the Christmas holidays (or the "coronidad" of face masks) is always a time for me to reflect and relax at home. After a very busy semester of talks, conferences, teaching two courses, reviewing, surviving a pandemic, and so on, I need time to not think about work. At times like these my work passions sometimes come back - the historical phonology of Mixtecan languages. I know I should keep reading my fantasy novels, playing the piano, and watching long movies, but you know what? They say that interesting ideas come out of these moments when one isn't so focused on the goal of working. Well. I don't need to explain myself - that's the love of Mixtecan languages.

I have always thought that the relationship between Triqui and the Mixtecan family was interesting. There are good cognates for a large number of words (see here), but it seems to me that around 70% of Triqui roots have no clear cognates in Mixtec languages (Mixtec, not Mixtecan), based on the work that Michael Swanton and I have done. And judging by Anderson and Roque's (1983) Cuicatec dictionary, there seem to be even fewer cognates with Cuicatec. So where do the other roots that we observe in the Triqui languages come from?

I started studying some Amuzgo these days through Cortés Vásquez's (2016) thesis to see whether there might be some cognates between Triqui and Amuzgo. Perhaps there are more roots in common between the Amuzgo languages and the Triqui languages, and this comparison could tell me where the 70% of Triqui roots that still seem mysterious to me came from. I went through the whole of Cortés Vásquez's thesis looking for the cognates most obvious to me and compiled a list of 68 words that look like cognates with Triqui forms from my Itunyoso Triqui lexicon. Some perhaps interesting observations follow below.

1. More evidence for the formation of roots with initial geminate /kk/:

Itunyoso Triqui | San Pedro Amuzgos | Proto-Mixtec (Josserand 1983) | Gloss
kkə̃:³² | ntkẽĩ³ | --- | seed
kkə̃h³ | tkõ³⁵ | --- | sandal (huarache)
kkə̃:³ | tskĩ³ | *jɨkɨ̃ʔ | squash
kkoh³ | ntsko³ | *juku | leaf
kkaʔ³ | ska³ | --- | candle

The Proto-Mixtec form for 'sandal' is */ndiʃẽʔ/ (unrelated?), and the forms for 'seed' and 'candle' do not appear in Josserand's (1983) work. According to my work (DiCanio 2014), the origin of most geminate ("fortis") consonants is the loss of a pre-tonic syllable. Normally this syllable begins with an optional glide /j, w/ and a high vowel, as seen in the Proto-Mixtec cognates. In Amuzgo, there seems to be a stop or fricative in these syllables.

I know the vowels are odd here. Why? I'm just going to guess, because this is my blog, not an article. There are more vowels in Amuzgo, and vowel sequences too. Amuzgo has 7 oral vowels and 7 nasal vowels. But Itunyoso Triqui has only 5 oral vowels and 3 nasal vowels (/ĩ, ũ, ə̃/). Chicahuaxtla Triqui retains the vowel /ɨ/, but it is produced as /i/ in Itunyoso Triqui and /u/ in Copala Triqui. Across the Triqui varieties, several nasal vowels merged, e.g. */õ, ũ/ > /ũ/, */ɨ̃, ã/ > /ə̃/. I think the vowel changes above arose from this type of historical process.
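To make the two nasal-vowel mergers mentioned here (*/õ, ũ/ > /ũ/ and */ɨ̃, ã/ > /ə̃/) concrete, here is a minimal Python sketch of them as a lookup table. The function and table names are my own illustrative choices, and the sketch covers only these two example mergers, not a full account of Triqui historical phonology. Unicode normalization is used so that precomposed and combining-tilde spellings of the same vowel are treated alike.

```python
import unicodedata


def nfd(text: str) -> str:
    """Decompose characters so 'õ' and 'o' + combining tilde compare equal."""
    return unicodedata.normalize("NFD", text)


# Only the two example mergers named in the text; everything else is untouched.
NASAL_MERGERS = {
    nfd(source): nfd(target)
    for source, target in {
        "õ": "ũ",   # */õ/ merges with */ũ/
        "ũ": "ũ",
        "ɨ̃": "ə̃",  # */ɨ̃/ merges with */ã/ as /ə̃/
        "ã": "ə̃",
    }.items()
}


def merge_nasal_vowel(vowel: str) -> str:
    """Return the merged reflex of a nasal vowel; other vowels pass through."""
    decomposed = nfd(vowel)
    return NASAL_MERGERS.get(decomposed, decomposed)
```

Calling `merge_nasal_vowel` on */õ/ and */ũ/ gives the same result, which is the point of a merger: two earlier contrasts collapse into one category, so a single Triqui vowel can correspond to more than one vowel in a language that kept the contrast.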

2. A relationship between /t͡s/ in Amuzgo and /j/ (or /β/) in Triqui. This relationship resembles the one between forms beginning with /j/ in Triqui and forms with /t/ in Mixtec (van Doesburg et al., submitted), where the /t/ reflects a mutation of /y/ > /t/ to mark possession. This process still exists in the Triqui languages, but we observe fossilized roots in Mixtecan languages.

Itunyoso Triqui | San Pedro Amuzgos | Proto-Mixtec (Josserand 1983) | Gloss
ja:³² | tsa¹ | --- | tongue
j:ah³ | tsʰaʔ³⁵ | /ja:³²/ in Yoloxóchitl Mixtec | ash
ja³ʔah³ | tsᵃʔa¹ | /jaʔa/ (in many varieties) | chile
jãh³ | tsõ³ | --- | paper
ja³tã:³² | tsã¹ | --- | hail
j:eh³ | tsʰɔʔ³ | *juuʔ | stone
jã:³² | tsãʔ¹ | *jɨ̃ɨ̃ʔ | salt
β:eh³² | tsueʔ | *juwiʔ | cave

In this list I have changed Cortés Vásquez's transcription a little - it includes laryngealized vowels written with a diacritic - /a̰/ - but, judging from the acoustic figures in the thesis, the glottalization reflects a sequence, as is observed in most Mixtecan languages (Gerfen & Baker 2005, DiCanio 2012) and Mazatec languages (Garellek & Keating 2011, Silverman et al 1995). Sometimes transcribing it differently makes the cognates harder to see.

We know that this change under possession in Triqui, e.g. ja³ʔah³ 'chile' > ta³ʔah³ 'chile of...', has fossilized cognates in Mixtecan languages, but often it occurs not with /t/ but with /ⁿd~n/, /ð/, or /t͡s/, as we see in Amuzgo here. I copy a table of data on these doublets from van Doesburg et al. (submitted) below to show this.

Doublets in Cuicatec, Mixtec, and Triqui

In most cases, the fossilized form of the word is the form used by an entity, e.g. 'thread' and 'spiderweb.' Why would we observe so many different consonant forms here? Consider that in several Mixtecan languages, such as Itunyoso Triqui and Yoloxóchitl Mixtec, the coronal stops are not alveolar but dental (DiCanio 2010, DiCanio et al 2019). That includes the affricate /ts/ in Triqui, for example. There is a clear relationship among [t̪] - [t͡θ] - [t̪s] - [ð] across these languages, but sometimes we don't see it because these consonants are written with very different letters (t - ts/tz/dz - d).

There are more interesting pairs between Triqui and Amuzgo, but for now I'm just collecting my observations here.

Linguistics journals published in Spanish

As in many academic disciplines, there is a large asymmetry in linguistics between journals published in Spanish and those published in English. English functions as a lingua franca - a language that many linguists use to share their ideas, observations, and research. There are several consequences of the near-total dominance of English in linguistics publications. For example, there is widespread ignorance of current journals that accept articles in Spanish - I am as guilty of this sin as other linguists. When I think of a journal as a phonetician or as a researcher of Indigenous languages of the Americas, the first ones that come to mind are the most popular - Journal of Phonetics, the International Journal of American Linguistics (IJAL), the Journal of the International Phonetic Association, Language and Speech, Laboratory Phonology and Phonetica. And if one looks through the citations in such journals, one sees that most come from the ones just mentioned.

This pattern of self-citation among journals published in English raises their standing in international citation metrics. The more citations, the higher the standing of a given journal according to international evaluation metrics. The Journal Citation Reports (JCR) depend on the number of citations attributed to the journal and the number of highly indexed publications that cite a publication in the given journal. Currently, few linguistics journals published in Spanish appear in that report. That is a big problem. In several countries (such as the US), tenure decisions require publications accepted in journals that appear in the JCR. This dynamic undervalues linguistics publications in Spanish. And if one studies an Indigenous language of the Americas in a region where the lingua franca is not English but Spanish, Portuguese, or French, a publication written in English will be read neither by members of the (often multilingual) population that speaks the language nor by a scientific community in the region where it is spoken. The English-speaking linguist then has to choose between publishing in a journal with scientific standing and publishing in an accessible one. The linguist who is a native speaker of Spanish (or Portuguese) has to choose between publishing in an accessible journal and producing an article in English that may need several rounds of language revision (see this recent article for a discussion).

How can we change this situation? If more linguists publish in and cite journals that already exist in parts of Latin America, we can begin to change their citation counts and raise their standing in the metrics. I have compiled a list of journals that accept publications in Spanish or Portuguese. I do not specifically list journals that accept publications only on Romance languages - my focus here is to show that there are more venues for disseminating linguistics that we should consider.

Spanish-language journals that publish linguistics articles

1. Boletim do museu paraense Emílio Goeldi - Ciências Humanas (in Portuguese) - The journal's mission is to publish original work in the areas of anthropology, archaeology, Indigenous linguistics, and related disciplines. It accepts contributions in Portuguese, Spanish, English, and French for the following sections: research articles, review articles, research notes, memory, book reviews, and master's and doctoral theses.

2. Cadernos de etnolingüística (in Portuguese) - an electronic publication aimed at disseminating original contributions on South American Indigenous languages, including articles, reviews, squibs, short notes, and unpublished documents (or documents whose circulation has so far been limited).

3. Cuadernos de lingüística en el colegio de México - an electronic journal with continuous publication, whose aim is to disseminate and promote linguistic research on a variety of languages, with no preference for any particular theoretical framework. The intent is for the published work to contribute to our understanding of natural languages, whether from a theoretical or a purely descriptive point of view.

4. Estudios filológicos (Chile) - is a biannual publication of the Universidad Austral de Chile, Facultad de Filosofía y Humanidades, Instituto de Lingüística y Literatura. It hosts specialized studies in linguistics and literature, and related areas, especially issues relating to the Spanish language and Spanish and Latin American literatures.

5. Forma y Función (Colombia) - The journal Forma y Función is attached to the Department of Linguistics of the Universidad Nacional de Colombia, Bogotá campus. Its aim is to disseminate studies of language from a variety of theoretical and methodological perspectives corresponding to the various fields of linguistics.

6. Estudios de fonética experimental (Catalonia) - publishes original research articles related to any branch of experimental phonetics (articulatory, acoustic, perceptual, applied) and laboratory phonology. It also publishes contributions on theoretical aspects of phonetics, descriptions of phonetic inventories, and reviews of books on phonetics and laboratory phonology. Articles are published in English, French, and Italian, as well as in Spanish, Catalan, Portuguese, and Galician.

7. Lexis (Peru) - Lexis is one of the leading linguistics and literature journals published in Spanish America. The journal welcomes original work in the various fields of linguistics, literary theory and criticism, Hispanic studies, and Amerindian studies. Lexis is open to work by Peruvian and foreign researchers.

8. LIAMES: Línguas Indígenas Americanas (in Portuguese) - a biannual publication edited by the area of Anthropological Linguistics (Indigenous Languages) / Centro de Estudos de Línguas e Culturas Ameríndias (CELCAM) of the Department of Linguistics, Instituto de Estudos da Linguagem / UNICAMP. Its main aim is to offer researchers in the area a venue for publishing academic research and reflection articles, analytical studies, and reviews whose subject matter concerns the investigation and documentation of Indigenous languages of the Americas, written from different theoretical approaches.

9. Lingüística Mexicana - Nueva Época: a scholarly journal whose aim is to publish previously unpublished research articles related to the topics, areas, and disciplines that make up the various fields of the linguistics of the diverse languages spoken in Mexico, and the linguistics of any language or dialect in contact with a Mexican variety; approaches may be theoretical, descriptive, or applied.

10. Onomázein - Revista de lingüística, filología y traducción (Chile) - welcomes previously unpublished articles derived from scientific research in the different disciplines of theoretical and applied linguistics; in classical, Indo-European, Romance, and Hispanic philology; in translation theory and terminology; as well as notable studies on Indigenous languages.

11. Revista de lingüística teórica y aplicada - RLA (Chile) - aims to disseminate theoretical and applied linguistic research in national and international university academic circles. The work it publishes is previously unpublished and comes from the various areas of theoretical or applied linguistic research, preferably written in Spanish, and in other languages such as English, Italian, French, or Portuguese.

12. Revista de Filología y Lingüística de la Universidad de Costa Rica - a publication dedicated to disseminating academic articles on relevant topics in the areas of philology, linguistics, and literature.

13. Signos lingüísticos (Mexico) - a specialized journal whose purpose is to publicize the results of original, rigorous, and methodologically consistent research on topics in linguistics, sociolinguistics, phonology, language acquisition, and syntax, from both a systematic and a historical point of view. With an open approach, Signos Lingüísticos does not confine itself to a particular conception of linguistics, placing the emphasis on the quality and originality of the published work. Signos Lingüísticos has appeared without interruption since 2005; subject to prior review, it accepts only previously unpublished articles, notes, and reviews of recently published books.

14. Tlalocan (Mexico) - a journal specializing in the documentation of sources and oral-tradition texts in the Indigenous languages of Mexico, as well as languages of Guatemala and the southwestern United States that are linguistically related. It publishes sources related to the Indigenous cultures of Mexico and Mesoamerica, both documentary sources and those compiled from oral texts. Texts in Indigenous languages related to Mexican languages, whether of documentary or oral origin, are also accepted for consideration. Texts of ethnographic or historical interest, in addition to linguistic interest, are sought. Book reviews and notes are also included. Tlalocan accepts only previously unpublished work. Work may be published in Spanish or English.

Additional journals that publish articles written in Spanish:

15. Diachronica (John Benjamins - Netherlands) - provides a forum for the presentation and discussion of information concerning all aspects of language change in any and all languages of the globe. Contributions which combine theoretical interest and philological acumen are especially welcome.

16. International Journal of American Linguistics (US/Chicago) - The International Journal of American Linguistics (IJAL) is dedicated to the documentation and analysis of the indigenous languages of the Americas. Founded by Franz Boas and Pliny Earle Goddard in 1917, the journal focuses on the linguistics of American Indigenous languages. IJAL is an important repository for research based on field work and archival materials on the languages of North and South America.

17. Language Documentation and Description - publishes general research articles on the theory and practice of language documentation, language description, sociolinguistics, language policy, and language revitalisation, with a focus on minority and endangered languages. Also publishes Language Contexts articles with detailed information on the contexts in which languages or varieties are spoken, providing social and cultural information, such as about speaker populations, social organisation, cultural aspects, linguistic ecology, multilingualism, language vitality, and language use and transmission in the community, diaspora and cyberspace. Also publishes Language Snapshot articles providing compact overviews of one or more languages or varieties, with up-to-date key data on language facts and speakers, and current research activity. 

18. Topics in Phonological Diversity - This series provides a platform for researchers in synchronic and diachronic phonology. By bringing together detailed descriptive work on individual languages with a comparative, cross-linguistic focus, it aims to advance our understanding of the evolution and patterning of phonological systems and the role of phonology in the language system more broadly. We welcome submissions in the following areas: Phonological descriptions of individual languages; Cross-linguistic studies of synchronic and diachronic phonological phenomena; Historical phonology of languages, their groupings, or particular phenomena; Interfaces of phonology with morphology, syntax, and phonetics; Phonological variation induced by dialectal, areal, and other factors.

The boundaries of phonetics and owning language diversity

One topic that came out of a departmental forum on institutionalized white supremacy yesterday was the extent to which who we decide to cite can perpetuate racist boundaries within fields. So, I started to think about just who we cite in phonetics and what research we decide is part of the field. One major divide within phonetics is between papers which are mainly concerned with theory-building and those which investigate empirical observations from experiments or from corpus data. Many fields place the former on pedestals (at least for a time) while the latter comprise the bulk of the work that allows us to amass evidence in favor of certain perspectives. Moreover, since there is just so much that has never been studied on the phonetics of different patterns in different languages, there is no shortage of empirically-motivated topics in phonetics.

If I complete a study on the phonetics of tone in Triqui or another language that has been under-studied, my work is categorized as both a contribution to phonetics and a contribution to endangered language (or areal) research. Yet, the same allowance is often not afforded to research on minority groups in the US. A study on speech production or perception among speakers of Black English or among speakers of Puerto Rican Spanish is often not placed within the phonetics canon, but within the sociolinguistic or sociophonetic canon. As far as being part of phonetics, there is nothing inherently different between doing speech production research on Black English or Triqui or Finnish. Yet, historically, dialectology has fallen within sociolinguistics rather than having been treated as what we might more broadly call "Language diversity." And note that once I say "language diversity", linguists kind of like to think of this as a course taught by a sociolinguist. Diversity is not under the purview of sociolinguistics though.

Both phonetics and sociolinguistics can be equally focused on individual languages or interested in a diversity of languages. Research on the syntax of Black English is no more inherently research on sociolinguistics than research on the phonetics of Kera is. What binds linguistic research into sub-disciplines is the domain of study and the approach to the phenomenon, not the language. What this might mean in practice (at least in phonetics - I can't speak about other disciplines as much) is that the boundaries of the field are logically broader than currently defined.

The growth of sociophonetics as a discipline has pushed quantitative phonetic research forward by forcing us to normalize discussions of language varieties in well-studied languages. However, it remains the job of sociophoneticians to tell other phoneticians that variation matters - linguists do not yet own language diversity as an issue for the entire field. Yet, a dismissal of sociophonetics has also probably kept it from being incorporated into what phoneticians would call "research on speech production and perception." I'll own that there was a time when I did not always see sociophonetics as being as rigorous as phonetics, but I no longer feel this way. It probably is also the case that by being sidelined, research on different language varieties has not undergone the same type of reviewer-ship that papers in "speech production and perception" might get. If I were to submit my own research to journals evaluating variation though, I shudder to think how my work might fare. In other words, it's easy to elevate the importance of traditional metrics for scholarship when one is examining idealized language varieties and to under-value metrics that might be applied from a variationist standpoint.

So, one way that phonetics might move forward here is to start to accept that many of our theories of production and perception that we tend to elevate are mostly not informed by any work on language diversity and, in fact, we know very little. The implications of this are as huge as the number of different languages and varieties and dialects and communities that have not been studied. We all own language diversity.

A problem like morphophonology

(sung to How do you solve a problem like Maria? from The Sound of Music)

It might look like any morpheme but then change the root, you see
It can lenite, subtract, or mutate and the morpheme is not free
It can even copy pieces from the stem too, if need be
It isn't quite a part of the morphology!

It has alternations looking like some well-regarded rules
Were it general we'd analyze with well-respected tools
But once we see it's limited it means we're all just fools
It isn't just a part of the phonology!

But you'd be mistaken if you believed we're outdone.
It's actually... quite some fun.

How do you solve a problem like morphonology?
How do you catch a morph and pin it down?
How do you solve a problem like morphonology?
Can vowel deletion derive a noun?

Many a time you think it's in the lexicon.
Many a morph you might misunderstand
But how do you pin it down, and account for all the sounds
How do you get the pointing little hand?

Oh, how do you solve a problem like morphophonology?
How do you hold a morpheme in your hand?

What's universal in phonetics?

As a fieldworker, I'm often struck by how many linguistic patterns I've observed that just "shouldn't" occur. Linguistics often propels itself as a field by asserting theories that are both too strong and too myopic. The thinking goes that one should assume universality first and then adjust accordingly afterwards (or unfortunately, ignore exceptions and continue on).

In phonetics, there has been a long history around the notion of universalism. Jakobson, Fant, and Halle (1961) assumed that one needed only distinctive features to characterize cross-linguistic differences. Once you got features down, you could just assume that all speakers had the same sort of mapping from features to articulation. This idea persisted into the 1970's (at least among phonologists), but began to break down in the 1980's - 1990's with Pat Keating's work on voicing (1984), Doug Whalen's discussion of coarticulation (1990), and Kingston & Diehl's discussion about "automatic" and "controlled" phonetics (1994). The emerging consensus from this earlier work and the resulting evolution of laboratory phonology was that phonetic patterns are closely controlled by speakers and many patterns are language-specific.1

Ladd's (2014) book provides a nice overview of many of these ideas - in particular the view that "Phonologists want their descriptions to account for the phonetic detail of utterances. Yet most are reluctant to consider the use of formalisms involving continuous mathematics and quantitative variables, and without such formalisms, it is doubtful that any theory can deal adequately with all aspects of the linguistic use of sound." (p.51)

If we fast-forward to the present day, the landscape of phonetics and phonology is quite different than what it used to be. I think most laboratory phonologists (and most phonologists nowadays are laboratory phonologists) would agree that representations reflect distributions of productions in some way and that the statistical and articulatory details can vary in a gradient way across languages.

With this in mind, what is left of phonetic universals? There are certainly several universals regarding phonological inventories that could be discussed (see Gordon's recent 2016 book on the topic). But what of phonetic patterns that are best captured quantitatively? What are the universals and near universals? I thought I would start to collect a list of these here as a way to organize my thoughts and to challenge/question my assumptions. I invite anyone to propose additional things here too.

1. Dorsal stops (almost always) have longer VOT (voice onset time) than coronal or labial stops
On the basis of 18 different languages, Cho and Ladefoged (1999) first noted that, once one compares stops within the same laryngeal category (voiced, voiceless, voiceless aspirated), dorsal stops tend to have a longer VOT than coronal or labial stops. A more recent analysis of this question is found in Chodroff et al. (2019), where the authors looked at over 100 different languages. Of the languages that they sampled, 95% displayed the dorsal > coronal pattern. This finding probably relates to a mechanical constraint on movement of the tongue dorsum. Since the dorsum has greater mass, the release portion tends to take longer (Stevens 2000). All else being equal, larger articulators usually move more slowly than smaller ones - a general principle of physiology and movement. This longer release portion delays venting of the supralaryngeal cavity, and it is this venting that ultimately creates the aerodynamic conditions for voicing.

Chodroff et al.'s sampling revealed another near universal - VOT values are strongly correlated across stops within a particular language. That is, if a language tends to have very short lag VOT values for one stop consonant, it tends to have very short lag VOT values for all the others too. This finding is interesting since it suggests that speakers and languages produce identical laryngeal gestures regardless of the supralaryngeal constriction. There is some physiological evidence for this universal (Munhall & Löfqvist 1992).
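
To make the two VOT claims concrete, here is a minimal sketch in Python of how one might check them against per-language means. The VOT values, the language labels, and the function names are all invented for illustration; they are not data or methods from Cho and Ladefoged (1999) or Chodroff et al. (2019).

```python
from math import sqrt

# Hypothetical per-language mean VOT values in ms (invented for illustration;
# NOT data from any published study).
vot_ms = {
    "lang_A": {"p": 12.0, "t": 15.0, "k": 28.0},
    "lang_B": {"p": 60.0, "t": 68.0, "k": 80.0},
    "lang_C": {"p": 20.0, "t": 18.0, "k": 17.0},
    "lang_D": {"p": 35.0, "t": 41.0, "k": 52.0},
}

def proportion_dorsal_longer(data):
    """Share of languages whose dorsal /k/ has a longer mean VOT than coronal /t/."""
    hits = [lang for lang, v in data.items() if v["k"] > v["t"]]
    return len(hits) / len(data)

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def cross_place_correlation(data, place_a, place_b):
    """Correlation of mean VOT between two places of articulation, across languages."""
    xs = [v[place_a] for v in data.values()]
    ys = [v[place_b] for v in data.values()]
    return pearson_r(xs, ys)

share = proportion_dorsal_longer(vot_ms)            # dorsal > coronal pattern
r_pk = cross_place_correlation(vot_ms, "p", "k")    # labial vs. dorsal VOT
print(f"dorsal > coronal in {share:.0%} of languages; r(p, k) = {r_pk:.2f}")
```

With real data one would want per-speaker values and a mixed model rather than a single correlation over language means, but the logic of the two claims is the same.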

2. All languages have utterance-final lengthening.

Though languages tend to vary in the extent to which words are lengthened at phrase-final or utterance-final position, it seems to have been found in every language where it has been investigated (Fletcher 2010, White et al. 2020). Even languages which lack phonological units used in intonation systems (boundary tones, pitch accents) seem to have utterance-final lengthening (DiCanio and Hatcher, 2018, DiCanio et al. 2018, in press).

There is probably a biomechanical explanation for utterance-final lengthening based on articulatory slowing at the end of utterances. As speakers are finishing utterances, their articulators gradually move more slowly (Byrd & Saltzman 2003). The scope of this effect varies across languages and it is not yet clear whether certain syllable types are more affected than others, i.e. closed syllables or syllables with short vowels might undergo less final lengthening.
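
One simple way to quantify the extent of final lengthening is the ratio of mean duration in utterance-final position to mean duration elsewhere. The sketch below assumes syllable durations have already been measured; the durations, the `position` labels, and the function name are hypothetical and only there to show the calculation.

```python
from statistics import mean

# Hypothetical syllable durations in ms (invented for illustration only).
tokens = [
    {"position": "medial", "dur_ms": 150.0},
    {"position": "medial", "dur_ms": 160.0},
    {"position": "medial", "dur_ms": 140.0},
    {"position": "final", "dur_ms": 210.0},
    {"position": "final", "dur_ms": 230.0},
    {"position": "final", "dur_ms": 220.0},
]

def final_lengthening_ratio(tokens):
    """Mean utterance-final duration divided by mean non-final duration."""
    final = [t["dur_ms"] for t in tokens if t["position"] == "final"]
    other = [t["dur_ms"] for t in tokens if t["position"] != "final"]
    return mean(final) / mean(other)

ratio = final_lengthening_ratio(tokens)
print(f"final / non-final duration ratio: {ratio:.2f}")
```

A ratio above 1 indicates lengthening. Comparing this ratio across syllable types (closed vs. open, short vs. long vowels) would be one way to approach the open question about which syllables are most affected.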

3. Languages optimize the distance between vowels in articulation/acoustics.

I'll leave it open for now whether this refers just to articulatory dispersion or acoustic dispersion (there is debate around this, of course), but it seems like most languages try to optimize the height and backness of vowels. In languages with asymmetric vowel systems, e.g. /i, e, a, o/, or /i, e, ɛ, a, o, u/, the back vowels will have F1 values that often sit in-between the values for the corresponding front vowels (Becker-Kristal 2010). Becker-Kristal looked at the acoustics of over 100 different languages and found this to be a general pattern. The opposite pattern is ostensibly true, but most languages have more front vowel contrasts than back vowel contrasts.
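
The "in-between" pattern can be stated as a simple check on mean F1 values. The sketch below uses an asymmetric /i, e, a, o/ system; the formant values and the function name are invented for illustration and are not measurements from Becker-Kristal (2010).

```python
# Hypothetical mean F1 values in Hz for an asymmetric /i, e, a, o/ system
# (invented for illustration only).
f1_hz = {"i": 300.0, "e": 450.0, "a": 750.0, "o": 390.0}

def sits_between(back_f1, front_f1_a, front_f1_b):
    """True if the back vowel's F1 falls strictly between the F1 of two front vowels."""
    low, high = sorted((front_f1_a, front_f1_b))
    return low < back_f1 < high

# Does the single back vowel /o/ sit between front /i/ and /e/ in F1?
o_is_intermediate = sits_between(f1_hz["o"], f1_hz["i"], f1_hz["e"])
print(o_is_intermediate)
```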

***Edited to include new things - thanks to Eleanor Chodroff, David Kamholz, Joseph Casillas, Rory Turnbull, Claire Bowern, Carlos Wagner and various others on Twitter whose identities/names are not clear.***

4. Intrinsic F0 of high vowels

There is some discussion of this effect, but it seems to be the case that, all else being equal, high vowels will have higher F0 than low vowels (Whalen & Levitt 1995). In all languages where it has been investigated, researchers have found positive evidence for this. Whalen & Levitt note that the explanation here has to do with enhanced subglottal pressure and greater cricothyroid (CT) activity in the production of high vowels relative to low vowels. Ostensibly, as the tongue is raised, it exerts a pull on the larynx via the geniohyoid and hyothyroid muscles. This raises the thyroid cartilage and thus exerts pull on the cricothyroid itself (raising F0). Greater subglottal pressure would then be needed to surpass the impedance due to greater vocal fold tension.

There is a tendency, however, to not observe the effect in low F0 contexts, in particular for low tones in tone languages. I've personally wondered about this in Mixtec and Triqui languages, though it is usually quite difficult to control for glottalization, tone, and vowel quality all at once in these languages in order to investigate this question. Why might the effect not be found for low tones? One possibility is that F0 control is essentially different in a low F0 context. According to Titze's body-cover model of vocal fold vibration (1994), the thyroarytenoid (TA) muscles are more responsible for vocal fold vibration when F0 is low. Perhaps tongue raising exerts less force on the TA than it does on the CT.

5. Voiced stops are shorter in duration than voiceless stops

Voicing is hard to maintain when there is any constriction in the supraglottal cavity. Assuming no velopharyngeal port venting, supralaryngeal oral stop closure will cause a buildup in pressure above the glottis which will inhibit the necessary pressure differential across the glottis required for continued voicing - the aerodynamic voicing constraint (Ohala, 1983). Thus, voicing in stops tends to cease relatively quickly during closure. Similarly, for voiced fricatives, the necessity to maintain narrow constriction for frication and greater intra-oral air pressure relative to atmospheric air pressure is at odds with a simultaneous necessity to maintain greater subglottal pressure relative to intra-oral (supraglottal) air pressure for continued voicing. Thus, voiced fricatives will often devoice or de-fricativize (and be produced as approximants).

A consequence of the aerodynamic voicing constraint in stops is that the duration of stop voicing is limited and so, it turns out, voiced stops are shorter than voiceless ones. This has been observed since the early work of Lisker (1957) (cf. Lisker 1986 as well). It seems to be a phonetic universal. What about fricatives though? Are voiced fricatives typically shorter than voiceless ones? I think that the jury is still out on this one. While it is difficult to maintain simultaneous voicing and frication for voiced fricatives, the temporal constraints are not as clear as with stops. Yet, voiced fricatives are almost always shorter than voiceless fricatives as well.

What's not a universal?

In thinking about ostensible phonetic universals, I am struck by many patterns that do not seem to be as universal as once believed. I am most familiar with those in the research that I have done.

6. Not a universal - word-initial strengthening

A common cross-linguistic pattern is that word-initial consonants will be produced with greater duration and/or with stronger articulations (more contact, faster velocity). Fougeron & Keating (1997) is a seminal paper observing this pattern with English speakers. It has been studied in various languages - most recently in work by Katz & Fricke (2018) and White et al. (2020). While Fougeron & Keating (1997) and subsequent work by Keating et al. (2003) do not assert that this pattern is universal, White et al. (2020) state the following (in their conclusions):

"We propose, however, that initial consonant lengthening may be likely to maintain a universal structural function because of the critical importance of word onsets for the entwined processes of speech segmentation and word recognition."

I should admit, I'm working on a paper which addresses this claim with some of my research on Yoloxóchitl Mixtec, an Otomanguean language in Mexico. The language is prefixal and has final stress. Word-initial consonants are always shorter than word-medial ones and (in the paper I'm working on now, at least) undergo more lenition. You don't have to take my word about this based on something not-yet-published though. The durational finding is replicated in both DiCanio et al. (2018) and DiCanio et al. (to appear). So, three different publications all with different speakers have found the effect. (I'll just mention here, because this is a blog and not a publication, that the same pattern seems to hold in Itunyoso Triqui - another Mixtecan language with final stress and prefixation. That's another paper for this summer.)

There's an interesting thing here though - most of the languages which have been studied in relation to initial strengthening are not prefixing languages. In prefixal languages, like Scottish Gaelic, parsing word-initial consonants does not help too much in word identification (Ussishkin et al. 2017). The authors state the following:

"Our results show that during the process of spoken word recognition, listeners stick closer to the surface form until other sounds lead to an interpretation that the surface form results from the morphophonological alternation of mutation." (Ussishkin et al. 2017, p.30)

While this research does not address word-initial strengthening, it suggests that there is just something different about prefixal languages in terms of word recognition. If the goal of word-initial strengthening is to enhance cues to word segmentation, then it stands to reason that word-initial strengthening might not occur in heavily prefixing languages. At the very least, the Mixtec data show that word-initial consonant lengthening is indeed not a universal.

7. Not a universal - native listeners of a tone language are better at pitch perception than native listeners of non-tonal languages

I know, I know, you want to believe that it's true. All tone language listeners must have superpowers when it comes to perceiving pitch, right? It turns out that the evidence is quite mixed here and that musical experience ends up playing a big role. There are papers that have found evidence that speaking a tone language confers some benefit in pitch discrimination when listeners have to discriminate both between tonal categories and within them (Burnham et al. 1996, Hallé et al. 2004, Peng et al. 2010). However, there are other papers showing no advantage (Stagray & Downs 1993, DiCanio 2012, So & Best 2010). At issue is usually the musical background of the listeners in question. In Stagray & Downs (1993), the authors chose only speakers of Mandarin who did not have musical experience and in DiCanio (2012), none of the Triqui listeners had any musical experience. In So & Best (2010), the authors screened 300 Hong Kong Cantonese listeners and chose only those with (a) no knowledge of Mandarin and (b) no formal music training. Only 30/300 qualified! Many other studies finding an advantage for tone language listeners have not controlled for musical background.

So, how much does musical ability play a role in tonal discrimination? I can provide an example from some data from my 2012 paper (though this was not discussed in the paper itself). Triqui is heavily tonal, with nine lexical tones (/1, 2, 3, 4, 45, 13, 43, 32, 31/) and extensive tonal morphophonology (DiCanio 2016). One would imagine that, when presented with stimuli along a continuum between two tonal categories, e.g. falling tones 32 and 31, Triqui listeners might be extra careful at perceiving slight differences. It turns out that they are better at perceiving between-category differences (steps 2-4, 3-5, 4-6, 5-7) than within-category differences (steps 1-3, 6-8).
Figure: Discrimination accuracy of tonal continua for Triqui and French listeners. Data from DiCanio (2012). No Triqui speaker has musical training but a subset (13/20) of the French speakers did. Discrimination is better than predicted at the end of the continuum because listeners were comparing resynthesized natural to non-resynthesized speech.
On the whole, French speakers were better at discriminating Triqui tonal pairs along the continuum than Triqui speakers were. This is quite surprising, but once we separate French speakers by their musical background, we find that the non-musicians of the bunch were worse at between-category tonal discrimination than the Triqui listeners (but better at within-category discrimination). Having some musical background (at least 2-3 years) provides a remarkable benefit to your pitch discrimination abilities. Speaking a tone language makes you good at telling apart two particular tones in your language at a categorical boundary between them, but it does not make you magically better at pitch discrimination, apparently.
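
For readers who want to see how the between-category vs. within-category comparison is computed, here is a minimal sketch. The step pairs are the ones listed above for the continuum; the trial responses and the function name are invented for illustration and are not the DiCanio (2012) data.

```python
# Step pairs along the continuum, as described in the text.
BETWEEN = {(2, 4), (3, 5), (4, 6), (5, 7)}
WITHIN = {(1, 3), (6, 8)}

# Hypothetical trials: (step pair, whether the listener responded correctly).
# Invented for illustration only.
trials = [
    ((2, 4), True), ((3, 5), True), ((4, 6), True), ((5, 7), False),
    ((1, 3), True), ((1, 3), False), ((6, 8), False), ((6, 8), True),
]

def accuracy_by_pair_type(trials):
    """Proportion correct for between-category and within-category pairs."""
    counts = {"between": [0, 0], "within": [0, 0]}  # [n_correct, n_total]
    for pair, correct in trials:
        if pair in BETWEEN:
            kind = "between"
        elif pair in WITHIN:
            kind = "within"
        else:
            continue  # ignore pairs outside the design
        counts[kind][0] += int(correct)
        counts[kind][1] += 1
    return {kind: c / n for kind, (c, n) in counts.items() if n > 0}

acc = accuracy_by_pair_type(trials)
print(acc)
```

Running the same function separately for each listener group (Triqui, French musicians, French non-musicians) gives the kind of comparison described above.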

There are undoubtedly many other things that could be put here both for universals and no-longer universals. I'm of course very biased here as someone who works on prosody. (I tend to be more interested in the prosodic patterns.) This is intended to be a continually-developing list of things for both my personal memory and for others to contribute to (or argue with). So, any thoughts of things to add are most welcome.
