Why do I sound the way I do?

Professor Sophie Scott explains how and why we express emotions in our voices

The human voice is unparalleled in nature for its complexity. When we speak or sing we co-ordinate a large number of muscles in our chest, throat, mouth and face to produce a sequence of different sounds which others can understand. Furthermore, when we talk, we don’t simply transmit linguistic information, but also pitch information that can alter the meanings of words - for example, the difference between “I LOVE you” and “I love YOU???” is conveyed by how we control the melody of our speech.

All languages of the world use pitch variation to give their speech melody, and of course these pitch variations also give the singing voice its tune. Furthermore, our voices provide a wealth of other information about us, including our sex, age, social background, geographical origins, health and mood. These different factors depend partly on the anatomy of our voices (for example, adult men have a longer vocal tract and thicker vocal cords than adult women, so their voices sound deeper) and partly on how we use our voices (for example, different regional accents of British English tend to have different intonation patterns, which give the speech different melodies).

In my work, I am very interested in how our brains tease apart these different kinds of information in the voice, and this has led me to study how we express emotion in our voices. There are some very characteristic ways in which we change our voices when we are in different moods. When we are angry, our voices become very tense, reflecting the ways that the muscles are tensed in the lips and jaw. Our intonation becomes very low and very flat, as we speak through our clenched teeth.

When we are happy, our intonation tends to be very wide-ranging, with no tension. The claim that one should smile on the phone because listeners can ‘hear’ a smile has a basis in truth – we really can hear a difference when a happy speaker is smiling. When we are happy and amused, we will very often laugh: a curious sound in which rapid spasms in the muscles of the chest cause lots of little expulsions of air. Laughter is highly recognizable and seems to be found in all human cultures, and indeed can also be seen in chimpanzees, suggesting that it has an old evolutionary heritage. Laughter is highly infectious, and seems to have an important bonding function. Indeed, it has even been suggested that humans came together in groups to laugh together before they could talk to each other.

When we are disgusted, our voices show characteristic patterns of downwards inflections, and this has also been associated with expressions of contempt – the late John Thaw was particularly good at this as Inspector Morse. If really disgusted, we will also frequently produce emotional sounds like ‘yeuch’, which appear to closely correlate with a disgusted facial expression.

When we are scared, our voices can sound very wobbly, as the autonomic fight or flight response affects our vocal cords and makes it hard for us to control the pitch of our voices. When we are really scared, we may scream – a high-pitched, loud sound which is very recognizable and attention-grabbing.

When we are sad our voices become very low in pitch, very quiet and very flat in melody. When we are really sad, we may weep, and there are spasms in the chest wall, similar to those seen in laughter, but with very different voicing. So when we are in the grip of an emotion, our voices will very clearly reveal this – indeed it is sometimes claimed that our voices will betray our true emotions. We can keep a poker face, but can we keep a ‘poker voice’?

Brain responses to sounds and emotional vocalisations
Regions in red show responses to sounds; regions in yellow show responses to emotional vocalisations in the right side of the brain. x is the Cartesian co-ordinate.

Interestingly, we find that at a neural level, emotional processing in the voice is rather distinct from linguistic processing – that is, we process the words that someone is saying rather differently from the emotional colour of their voice. Very broadly, the left side of the brain is interested in pulling out the linguistic content of what is said, and the right side is interested in the emotional information, as well as the person-specific information, and the speech melody. Interestingly, the right side of the brain is also sensitive to musical information, such as scale structure and melody. This may reflect some shared properties between the processing of pitch in voices and the processing of pitch in music.

We have also found that specific emotions recruit different brain areas. A very small area of the brain, called the amygdala, is very important in processing emotional information from the voice and face, especially fear and anger. Patients with damage to their amygdala can explain what makes them scared, but if you play them the sound of someone screaming, they cannot identify this sound. A patient I worked with said that when her son went rock climbing, she was frightened, but when she heard someone screaming, she would describe them as sounding ‘shocked’. In a recent study, we used brain imaging techniques to see what parts of the brain are recruited when people hear emotional sounds. We found that when people hear the sound of laughter, we see activation in the same parts of the brain that people activate to produce a smile. This suggests that the infectiousness of laughter is a result of very tight coupling between the brain areas which perceive laughter and those that produce it.

We have probably been talking to each other for 100,000 years, and now science is making good progress in unpacking how we colour our voices with emotion and meaning, and how our brains are able to decode this information.

Sophie Scott is a Professor of Cognitive Neuroscience at University College London, where she is the leader of the Speech Communication Group. Her work covers the way the human brain processes speech perception and production, and how our brains decode different kinds of information from the voice. She is particularly interested in how these processes are affected by hearing loss and brain damage. She is funded by the Wellcome Trust, grant number WT074414MA.

References and further reading

  • Karpf, A (2006) The Human Voice: The Story of a Remarkable Talent. Bloomsbury Publishing, London UK
  • Murray, I. R. & Arnott, J. L. (1993) Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion. Journal of the Acoustical Society of America, 93, 1097-1108
  • Warren JE, Sauter DA, Eisner F, Wiland J, Dresner MA, Wise RJ, Rosen S, Scott SK. (2006) Positive emotions preferentially engage an auditory-motor "mirror" system. J Neurosci. 26(50):13067-75
  • Scott, S. K., Young, A. W., Calder, A. J., Hellawell, D. J., Aggleton, J. P., Johnson, M., (1997) Impaired auditory recognition of fear and anger following bilateral amygdala lesions. Nature, 385, 254-257
