Say WHAT? Run that by me again…

Music Training for the Development of Speech Segmentation

Article Title: Music Training for the Development of Speech Segmentation

Journal: Cerebral Cortex, September 2013.

Authors: Clément François, Julie Chobert, Mireille Besson, Daniele Schön.

Doi: 10.1093/cercor/bhs180

Type of study: Longitudinal causal study using behavioral and electrophysiological measures

Purpose: To examine the influence of music training on speech segmentation in eight-year-old children.

Procedure: The tested children were pseudo-randomly assigned to either a music or a painting group, and given two years of training. They were tested before training, at one year, and after two years on their ability to extract words from a continuous flow of nonsense syllables.

Results: Researchers found improved speech segmentation skills for the musically trained group only.

My thoughts on the article: First of all, let me say that this is an article about a subject whose surface I am just beginning to scratch and probably will never fully comprehend; nevertheless, I am fascinated by the work of these French researchers, who have been exploring the interrelation of music and speech for many years. They have developed an artificial language, which they revise and use in different ways to determine how humans learn to distinguish words from continuous speech.

The language is syllabic, built from a fixed set of nonsense syllables arranged into “words,” each consisting of exactly three syllables. Each word is assigned tones, which are used consistently throughout the experiments and sung by a synthesizer in a continuous stream (in varying order), with no pauses between words to mark segmentation. With no audible segmentation and no semantic cues, how on earth would a listener recognize words in a garbled stream of “gysigipygygisisipysypymi”? According to the researchers’ conclusion, participants are able to perceive words by integrating pitch with the statistical properties of the speech structure.

Let me explain further. The researchers’ artificial language possesses certain statistical properties: in a continuous stream of speech with words in random order, syllables that are adjacent within a word occur together more often than syllables separated by a word boundary. These are called “transitional probabilities,” and the same holds for tones. In the words Gy-si-gi and Py-gy-gi (for example), “si” and “gi” will occur together more often than “gi” and “Py.” So, theoretically, listeners’ brains could come to recognize the syllabic patterns that occur more frequently. Still, it is questionable how accurately a listener could guess, and indeed previous research showed this was a difficult task when the continuous stream of speech was spoken rather than sung. However, François et al. have found that accuracy increases when the stream is sung: when listeners associate each syllable with a pitch (or tone), they can follow the melodic contour and unconsciously remember frequently occurring syllable sequences as “words.”
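The transitional-probability idea is simple enough to sketch in code. Here is a toy reconstruction in Python, using only the two example words from the paragraph above; the real study used a larger artificial lexicon, so the exact numbers are purely illustrative:

```python
import random
from collections import Counter

random.seed(1)

# Two illustrative "words" from the review; the actual study's
# lexicon was larger, so treat this as a toy reconstruction.
words = [("gy", "si", "gi"), ("py", "gy", "gi")]

# Concatenate 200 randomly chosen words into one continuous stream,
# with nothing to mark where one word ends and the next begins.
stream = []
for _ in range(200):
    stream.extend(random.choice(words))

# Transitional probability TP(a -> b) = count(a followed by b) / count(a)
pair_counts = Counter(zip(stream, stream[1:]))
syllable_counts = Counter(stream[:-1])

def tp(a, b):
    return pair_counts[(a, b)] / syllable_counts[a]

# Within a word, "si" is always followed by "gi", so its TP is 1.0;
# across a word boundary, "gi" is followed by "gy" or "py" roughly
# half the time each, so those TPs hover near 0.5.
print(f"TP(si -> gi) = {tp('si', 'gi'):.2f}")
print(f"TP(gi -> py) = {tp('gi', 'py'):.2f}")
```

The gap between the within-word probability and the boundary probability is exactly the statistical cue a listener's brain could latch onto, even with no pauses in the stream.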

The researchers had previously done multiple studies using adult participants, but had never worked with children, whose brains are more plastic. In this particular study, they deliberately focused on children, and chose a longitudinal study in an effort to prove causality. Here is how they went about it:

Thirty-seven eight-year-old native French-speaking schoolchildren were divided into two groups, a music group and a painting group, with care taken to ensure that the children in each group came from similar socioeconomic backgrounds and that no child had previous experience with either music or painting. Along the way, some children moved away or had problems with attentiveness, and the number of participants shrank to twenty-four. At the beginning of the experiment, each child was tested by entering a private booth and listening to the stream of sung speech. They were then presented with pairs of spoken words, one an actual “word” from the artificial language and the other a “non-word” created from the last syllable of one word combined with the first two syllables of another, or vice versa. The children had to decide which of the two “words” sounded familiar, based on the sung speech they had just heard. After this initial test, the two groups began two years of lessons. The music group had 45-minute lessons twice a week during the first year and once a week during the second; the painting group also had 45-minute lessons and followed the same schedule.

After the first year, the two groups were tested again, and the results showed that the music group had improved in its ability to recognize the “familiar” words, while the painting group’s ability had actually decreased. At the end of two years, the music group showed yet another jump in performance; the painting group’s performance also increased, although it hardly varied from the score achieved two years before, which was deemed “at chance level.” Take a look at the following graph: the solid line shows the number of correct responses given by the music group over the two-year period, while the dotted line shows the painting group’s lack of progress.

[Figure: correct responses over two years for the music group (solid line) vs. the painting group (dotted line), with the sung stream shown below, pitches aligned to syllables]

You can also see the alignment of pitch and syllables, and imagine what a strange experience it must have been for eight-year-old children to listen to this, alone in a sound booth.

The researchers were particularly interested to find that the children in the music group were impressively accurate in distinguishing “words,” even though the continuous sung sequence was actually designed to contain a higher percentage of “non-words.” The graph on the left shows the progress of the music group compared to the painting group at the one-year mark, but after two years the music group’s percentage of correct responses actually increased to nearly 75%! Allow me that exclamation mark, since this is not a formal paper and I personally find that amazing. Having had years of piano lessons as well as music theory and music appreciation classes, I would love to volunteer in just such a study, especially if it involved more free music lessons! And it’s not the kind of comment one includes in formal article reviews, but hey: those kids lucked out. They got two years of training in skills that fostered their creativity and (I am certain, though I cannot prove it empirically) made them happier, more interesting individuals. I only hope that the children in the two groups didn’t know the study results; I would hate for the painting kids to think of themselves as losers who got “the wrong answers.” As long as that was not the case, the painting kids absolutely lucked out too, with free painting lessons. Aside from the creepiness of listening to strange syllabic sequences alone in a booth, this was a win-win situation for the children, and I hope their parents appreciated the opportunity.

Finally, the researchers conclude their study with confident pronouncements: musically trained children are able to successfully determine word boundaries, showing that music can indeed play an important role in language acquisition. And I believe so, too. Yes, music is fun and motivating for language students, but its influence runs much deeper. I look forward to following the further adventures of these French cognitive scientists and to reading about other cutting-edge research related to the music/speech connection.

