Monday, December 2, 2019

Tutorial: Creating pretty spectrograms


Phonetic data is no longer just for papers on phonetics. Research using quantitative methods, corpus data, and experimental approaches may involve phonetic data for analytical or visualization purposes. There may also simply be a need to demonstrate a phonetic pattern visually in a linguistics paper unrelated to phonetics. For instance, descriptive grammars are stronger and clearer when phonological argumentation is accompanied by phonetic data showing patterns (Maddieson 2001, Maddieson et al. 2009). The movement to examine more phonetic data within linguistics is motivated by several factors:

a.   It is easier than ever before to show proof of one's observations.
b.   A greater focus on spoken language corpora means that one must use tools which analyze the speech signal (not just texts or transcriptions). 
c.   Laboratory phonology has been incorporated into all areas of phonology.
d.   Gradient processes within the phonetic signal are relevant to our understanding of social variation and representations in the mental lexicon.

Yet, despite these changes to the field, linguists (and especially students starting off in linguistics) often have trouble visualizing phonetic data in their research. As a result, one might not convey one's message clearly to the audience, which casts doubt on the observations. Some of the common pitfalls include: (1) the scaling parameters for displaying the acoustics are incorrect and you cannot observe the relevant detail (e.g. dynamic range, F0 range, etc.), (2) the text is not correctly aligned with the acoustics, (3) too much information is displayed (another scaling problem), and (4) no scale is given.

Drawing well-labelled spectrograms is not difficult, and Praat (Boersma & Weenink 2019) has several tools that let you visualize things nicely (far better than taking a screenshot). This tutorial is designed as the first (of perhaps many) that aim to improve how acoustic phonetic data is visualized.

I.  Initial steps: include a textgrid

(1) Open up the sound file that you wish to visualize. In most cases, a reader will not be able to visually inspect anything more than 6-10 segments long in an image. So, make sure that the stretch of speech you wish to display is no longer than this. Otherwise, the image will not show the reader much.

(2) Create a textgrid along with the sound file and segment the portions that you wish to visualize. If you are not sure about how to create a textgrid, please see the Praat manual. I have created a simple example here of myself saying the word 'ken' [kʰɛ̃n] (below).

(3) Once you have created a textgrid, select the portion of the sound file corresponding to the textgrid and then choose from the File menu "Extract selected textgrid (preserve times)." This will create a textgrid object that covers exactly the same stretch of time as the spectrogram you wish to display it with.

A spectrogram of the word 'ken.'

II. Exporting a visible spectrogram

(4) Praat does not currently allow users to export a spectrogram from a sound file - it is necessary to export a visible spectrogram.

(5) To do this, first select the portion of the sound file that you wish to visualize and click 'sel' to zoom in to the selection. Then, from the Spectrum menu, select "Extract Visible Spectrogram."

(6) You should now see a visible spectrogram in the object window of Praat.

III. Adding layers to create an image

(7) The key to creating a nice image is to add objects/details in layers. Praat lets you build up an image layer by layer, and you may undo multiple layers in the picture window.

(8) The things to understand about the picture window are that (a) it will print only in the region that you have selected and (b) it will use any presets you have chosen for Pen/Font. It does not revert to a default. Select a fairly large region for your spectrogram, perhaps a 4x6 image.

(9) Now, select the spectrogram in the object window and select "Draw:Paint..." In the dialog window, the option "Garnish" is often pre-selected for you. When you print with the Garnish button selected, Praat also draws the box, tick marks, and axis labels around the image. You do not want it to do that, since we will be adding the elements pertaining to the axes separately, ourselves. So, unselect this (see below).


(10) This should now produce a spectrogram with no margins in the picture window. That's the first step.

(11) Now, from the "Margins" menu in the picture window, select "Draw inner box." This will draw a box around the spectrogram. Note that the thickness of the box line can be adjusted under the Pen:Line Width menu in the picture window. However, Praat does not allow you to adjust things after they are drawn - you must do this before you print elements. For now, the preset - 1.0 line width - is sufficient. You should have created something like this below:


(12) Now comes the fun part - we will be adding in axes in stages. First, let's add in a y-axis. From the Margins menu, select Marks:Marks left at the bottom of the menu. We can choose to exclude dotted lines for the moment. Praat recognizes the scale of the image, so it will know that the y-axis should be frequency in Hz.

(13) Once you have done this, select "Text left" from the Margins menu and enter "Frequency (Hz)" as the text. The resulting image should look like the one below:
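
Steps (8) to (13) can also be written down as a short Praat script, which makes the figure easy to redraw. The following is only a sketch in Python that assembles such a script as text. The function name, the object name passed in, the 6 x 4 inch viewport (reading "4x6" as 4 inches tall and 6 inches wide), and the choice of 2 marks on the y-axis are my assumptions, and the numeric arguments to Paint are Praat's usual defaults rather than values from this tutorial. Check each command against what Praat records under "Paste history" in its script editor before relying on it.

```python
def spectrogram_layer_script(spectrogram_name, width_in=6.0, height_in=4.0):
    """Build Praat script text for steps (8)-(13): paint, box, y-axis marks, label.

    spectrogram_name is whatever name the Spectrogram object has in the
    object window (hypothetical here; it depends on your session).
    """
    lines = [
        # Step (8): choose the region of the picture window to draw in.
        f"Select outer viewport: 0, {width_in:g}, 0, {height_in:g}",
        f'selectObject: "Spectrogram {spectrogram_name}"',
        # Step (9): paint the spectrogram; the final "no" switches Garnish off.
        # Time range 0, 0 and frequency range 0, 0 mean "use everything".
        'Paint: 0, 0, 0, 0, 100, "yes", 50, 6, 0, "no"',
        # Step (11): box around the spectrogram.
        "Draw inner box",
        # Step (12): 2 marks on the left, with numbers and ticks, no dotted lines.
        'Marks left: 2, "yes", "yes", "no"',
        # Step (13): y-axis label.
        'Text left: "yes", "Frequency (Hz)"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "ken" is a placeholder object name.
    print(spectrogram_layer_script("ken"), end="")
```

The printed text can be pasted into a new Praat script window and run once the spectrogram object exists in the object window.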


(14) We can continue to add in layers this way (including duration on the x-axis), and if you wish, you can then export this to a PDF document. However, we could also add in text.

(15) To add in text, select a portion of the image larger than the box with the spectrogram itself (see below), then choose the textgrid from the objects window and click "Draw..." Deselect the "garnish" option again and click OK.



(16) The "show boundaries" option allows us to visualize the segmental boundaries that you have chosen in your spectrogram, but the default line width (1.0) is a bit narrow/thin for visualization. If you want to adjust this, choose Line width from the Pen menu and set it to something larger (like 1.5 or 1.8). Then print the textgrid.

If you have already printed the textgrid and want to change this, just undo the last print, change the settings, and then print the textgrid again.
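
Because pen settings persist until you change them, it can help to script the textgrid layer so that the line width is set just before drawing and put back afterwards. The sketch below (again Python that only builds Praat script text) illustrates this for steps (15) and (16). The function name, the object name, and the viewport numbers are hypothetical placeholders: the viewport in particular has to be tuned by eye until the tiers sit where you want them relative to the spectrogram.

```python
def textgrid_layer_script(textgrid_name, right_in, bottom_in, line_width=1.5):
    """Build Praat script text that draws a TextGrid with thicker boundaries.

    right_in and bottom_in give the (larger) selection in the picture
    window; they are placeholders to adjust by eye, not recommended values.
    """
    lines = [
        # Step (15): a selection larger than the spectrogram box.
        f"Select outer viewport: 0, {right_in:g}, 0, {bottom_in:g}",
        # Step (16): thicken the pen BEFORE drawing; it cannot be changed after.
        f"Line width: {line_width:g}",
        f'selectObject: "TextGrid {textgrid_name}"',
        # Whole time range, show boundaries, use text styles, Garnish off.
        'Draw: 0, 0, "yes", "yes", "no"',
        # Restore the preset so later layers are not affected.
        "Line width: 1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder name and sizes for illustration only.
    print(textgrid_layer_script("ken", right_in=6, bottom_in=5.5), end="")
```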

(17) The result should look something like the image below.



(18) The last step we might do is to include some acoustic information. Let's suppose we want to add in formants to our figure. Select the sound file from the object window and choose "Analyze Spectrum: To formant (burg)..." This will create a formant object in your window.

(19) Select the original box portion in the picture window again (not the entire portion with text). Now, select the formant object from your object window and click "Draw: Speckle..." and make sure you deselect the "garnish" option. This will create speckles corresponding to your formants. Be sure to set the time and frequency ranges in the drawing dialog to match those of the spectrogram. If your sound file is longer than the spectrogram you are visualizing and you leave the ranges as they are, you will end up with formant values that do not match the image.
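
One way to keep the formant layer lined up with the spectrogram is to pass the spectrogram's own start time, end time, and top frequency to the Speckle command explicitly. Here is a minimal sketch in Python that builds the corresponding Praat script text. The function and parameter names, the object name, and the example values are hypothetical; the 6 x 4 inch viewport assumes the spectrogram was drawn in that same region.

```python
def formant_layer_script(formant_name, t_start, t_end, max_freq_hz,
                         dynamic_range_db=30, width_in=6.0, height_in=4.0):
    """Build Praat script text that speckles formants over the spectrogram box.

    t_start, t_end and max_freq_hz should be the values the spectrogram
    was drawn with, so both layers share the same axes.
    """
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    lines = [
        # Same region as the spectrogram itself, not the larger text region.
        f"Select outer viewport: 0, {width_in:g}, 0, {height_in:g}",
        f'selectObject: "Formant {formant_name}"',
        # Time range, maximum frequency, dynamic range, Garnish off.
        f'Speckle: {t_start:g}, {t_end:g}, {max_freq_hz:g}, '
        f'{dynamic_range_db:g}, "no"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values: a spectrogram spanning 0.25-0.8 s, up to 5000 Hz.
    print(formant_layer_script("ken", 0.25, 0.8, 5000), end="")
```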

Note that if you lower the dynamic range, Praat will only draw formants for the parts of the signal within that range of its loudest point, e.g. a dynamic range of 20 dB draws formants only in the loudest 20 dB of the speech signal. The output of this should look as below:
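
To make the dynamic range rule concrete, here is a small Python sketch of the idea as described above: keep only the frames whose intensity lies within the chosen number of dB of the loudest frame. It illustrates the rule and is not Praat's own code; the function name and the intensity values are made up for the example.

```python
def frames_within_dynamic_range(intensities_db, dynamic_range_db):
    """Return the indices of frames no more than dynamic_range_db
    below the loudest frame."""
    if not intensities_db:
        return []
    floor_db = max(intensities_db) - dynamic_range_db
    return [i for i, level in enumerate(intensities_db) if level >= floor_db]


if __name__ == "__main__":
    # Invented frame intensities (dB) for illustration only.
    frames = [40, 62, 75, 58, 30]
    # Loudest frame is 75 dB, so a 20 dB range keeps frames at 55 dB or more.
    print(frames_within_dynamic_range(frames, 20))  # [1, 2, 3]
```

With a wider range, quieter frames (and their formant speckles) come back into the picture, which is why speckles can appear in silent stretches when the range is set high.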


(20) We could add in extra layers, e.g. duration on an x-axis under the text, F0 data on the axis to the right of the spectrogram, etc. However, we'll stop here because I think you probably get the gist of this. The final exported PDF always looks nicer than what appears in the Praat picture window (see below). You can now add in labels (arrows, text) using other software.

References:
Boersma, P. and Weenink, D. (2019). Praat: doing phonetics by computer (version 6.1). Computer program. Retrieved from http://www.praat.org/.

Maddieson, I. (2001). Phonetic fieldwork. In Newman, P. and Ratliff, M., editors, Linguistic Fieldwork, pages 211–229. Cambridge University Press.

Maddieson, I., Avelino, H., and O’Connor, L. (2009). The Phonetic Structures of Oaxaca Chontal. International Journal of American Linguistics, 75(1):69–103.

3 comments:

  1. Hello Christian. I hope my message finds you in good health. I have been trying to create an image just like the one you did in this article. However, the image I create does not align as seamlessly as it does in your example. I add the spectrogram in a 4x6 frame, then draw the inner box and add scaling. To add the text grid, I extend the selected box within the Praat picture window just like you did. However, the text grid overlaps just a little bit with the spectrogram. I have tried different size options with one or two tiers in the text grid, but was never able to create a seamless image. Do you have any guesses why this might be? Thank you in advance.

    1. Sorry for the late reply. I think you have to extend it just a bit past where you normally would. Otherwise it always overlaps just a little bit. I wish there was a better solution though. I'm glad my tutorial was useful.

  2. You can also use https://github.com/Christoph-Lauer/Sonogram
