Emotional prosody
Emotional prosody or affective prosody is the various non-verbal aspects of
Emotional prosody in speech is perceived or decoded slightly worse than
Production of vocal emotion
Studies have found that some emotions, such as fear, joy and anger, are portrayed at a higher frequency than emotions such as sadness.[4]
- Anger: Anger can be divided into two types: "anger" and "hot anger". In comparison to neutral speech, anger is produced with a lower pitch, higher intensity, more energy (500 Hz) across the vocalization, higher first formant (first sound produced) and faster attack times at voice onset (the start of speech). "Hot anger", in contrast, is produced with a higher, more varied pitch, and even greater energy (2000 Hz).[5]
- Disgust: In comparison to neutral speech, disgust is produced with a lower, downward directed pitch, with energy (500 Hz), lower first formant, and fast attack times similar to anger. Less variation and shorter durations are also characteristics of disgust.[5]
- Fear: Fear can be divided into two types: "panic" and "anxiety". In comparison to neutral speech, fearful emotions have a higher pitch, little variation, lower energy, and a faster speech rate with more pauses.[5]
- Sadness: In comparison to neutral speech, sad emotions are produced with a higher pitch, less intensity but more vocal energy (2000 Hz), longer duration with more pauses, and a lower first formant.[5]
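Two of the acoustic cues listed above, pitch and intensity, can be estimated directly from a waveform. The following Python sketch is only an illustration: it uses synthetic sine tones in place of recorded speech, estimates pitch with a plain autocorrelation peak search, and uses root-mean-square amplitude as a stand-in for intensity. The function names, the 75 to 500 Hz search range and the 8000 Hz sample rate are assumptions made for this example, and this is not the measurement procedure of the cited studies.

```python
import math

def rms_intensity(samples):
    """Root-mean-square amplitude, used here as a simple proxy for intensity."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def estimate_pitch_hz(samples, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate fundamental frequency from the strongest autocorrelation lag.

    Only lags whose period corresponds to fmin..fmax (assumed range) are tried.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

def synth_tone(freq_hz, amplitude, sample_rate=8000, duration_s=0.1):
    """Hypothetical test signal: a pure sine tone standing in for a voice."""
    n = round(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

if __name__ == "__main__":
    quiet_low = synth_tone(100.0, 0.2)   # lower pitch, lower intensity
    loud_high = synth_tone(200.0, 0.8)   # higher pitch, higher intensity
    for name, tone in (("quiet_low", quiet_low), ("loud_high", loud_high)):
        print(name, estimate_pitch_hz(tone, 8000), rms_intensity(tone))
```

Real speech is far less regular than a sine tone, so practical analyses typically work on short overlapping frames and use more robust pitch trackers.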
Perception of vocal emotion
Decoding emotions in speech includes three stages: determining acoustic features, creating meaningful connections with these features, and processing the acoustic patterns in relation to the connections established. In the processing stage, connections with basic emotional knowledge are stored separately in a memory network specific to associations. These associations can be used to form a baseline for emotional expressions encountered in the future. Emotional meanings of speech are implicitly and automatically registered after the circumstances, importance and other surrounding details of an event have been analyzed.[6]
On average, listeners perceive the intended emotions at a rate significantly better than chance (chance = approximately 10%).[5] However, error rates are also high. This is partly because listeners are more accurate at emotional inference from particular voices and perceive some emotions better than others.[4] Vocal expressions of anger and sadness are perceived most easily, fear and happiness are only moderately well perceived, and disgust has low perceptibility.[3]
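To make "better than chance" concrete, the following Python sketch computes the chance rate for a forced-choice task and the probability of reaching a given score by guessing alone. It assumes that a chance level of about 10% corresponds to ten equally likely response options, which the text does not state explicitly. The trial counts are invented for illustration and are not data from the cited studies, and the function names are hypothetical.

```python
import math

def chance_rate(num_categories):
    """Probability of a correct answer when guessing uniformly at random."""
    return 1.0 / num_categories

def p_at_least(correct, trials, p_chance):
    """P(X >= correct) for X ~ Binomial(trials, p_chance).

    A small value means the score is unlikely to arise from guessing.
    """
    return sum(
        math.comb(trials, k) * p_chance ** k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

if __name__ == "__main__":
    p = chance_rate(10)                 # assumed: ten emotion labels
    # Hypothetical listener: 30 correct out of 100 trials.
    print(p, p_at_least(30, 100, p))
```

A listener can therefore be well above chance while still mislabeling most items, which is consistent with the high error rates noted above.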
Vocal emotions and the brain
Language can be split into two components: the verbal and vocal channels. The verbal channel is the semantic content of the speaker's chosen words, which determines the meaning of the sentence. The vocal channel is the way a sentence is spoken, which can change its meaning. This channel conveys the emotions felt by the speaker and gives listeners a better idea of the intended meaning. Nuances in this channel are expressed through intonation, intensity, and rhythm, which combine to form prosody. Usually these channels convey the same emotion, but sometimes they differ. Sarcasm and irony are two forms of humor based on this incongruent style.[7]
Neurological processes integrating verbal and vocal (prosodic) components are relatively unclear. However, it is assumed that verbal content and vocal are processed in different hemispheres of the
Impairment of emotion recognition
Deficits in expressing and understanding prosody, caused by right hemisphere lesions, are known as
Recognizing vocal expressions of emotion becomes increasingly difficult with age. Older adults have slightly more difficulty than young adults in labeling vocal expressions of emotion, particularly sadness and anger, but have much greater difficulty integrating vocal emotions with the corresponding facial expressions. A possible explanation for this difficulty is that combining two sources of emotion requires greater activation of emotion areas of the brain, in which older adults show decreased volume and activity. Another possible explanation is that hearing loss leads to mishearing of vocal expressions. High-frequency hearing loss is known to begin around the age of 50, particularly in men.[9]
Because the right hemisphere of the brain is associated with prosody, patients with right hemisphere lesions have difficulty varying speech patterns to convey emotion. Their speech may therefore sound monotonous. In addition, people with right-hemisphere damage have been shown to be impaired at identifying the emotion in intoned sentences.
Difficulty in decoding both syntactic and affective prosody is also found in people with
Non-linguistic emotional prosody
Emotional states such as happiness, sadness, anger, and disgust can be determined solely based on the acoustic structure of a non-linguistic speech act. These acts can be grunts,
In addition, it has been proven that emotion can be expressed in non-linguistic vocalizations differently than in speech. As Laukka et al. state: Speech requires highly precise and coordinated movement of the articulators (e.g.,
) in order to transmit linguistic information, whereas non-linguistic vocalizations are not constrained by linguistic codes and thus do not require such precise articulations. This entails that non-linguistic vocalizations can exhibit larger ranges for many acoustic features than prosodic expressions. In their study, actors were instructed to vocalize an array of different emotions without words. The study showed that listeners could identify a wide range of positive and negative emotions above chance. However, emotions like guilt and pride were less easily recognized.[11]
In a 2015 study by Verena Kersken, Klaus Zuberbühler and Juan-Carlos Gomez, non-linguistic vocalizations of infants were presented to adults to see whether the adults could distinguish among infant vocalizations indicating requests for help, pointing to an object, or indicating an event. Infants show different prosodic elements in crying, depending on what they are crying for. They also have differing outbursts for positive and negative emotional states. The ability to decipher this information was found to hold across cultures and to be independent of the adult's level of experience with infants.
Sex differences
Men and women differ both in how they use language and in how they understand it. Differences have been reported in rate of speech, pitch range, duration of speech, and pitch slope (Fitzsimmons et al.). For example, "In a study of relationship of spectral and prosodic signs, it was established that the dependence of pitch and duration differed in men and women uttering the sentences in affirmative and inquisitive intonation. Tempo of speech, pitch range, and pitch steepness differ between the genders" (Nesic et al.). As one illustration, women are more likely to speak faster, elongate the ends of words, and raise their pitch at the end of sentences.
Women and men are also different in how they neurologically process emotional prosody. In an fMRI study, men showed a stronger activation in more cortical areas than female subjects when processing the meaning or manner of an emotional phrase. In the manner task, men had more activation in the bilateral
Considerations
Most research on vocal expression of emotion has been conducted using synthetic speech or portrayals of emotion by professional actors. Little research has been done with spontaneous, "natural" speech samples. These artificial speech samples have been considered close to natural speech, but portrayals by actors in particular may be influenced by stereotypes of emotional vocal expression and may exhibit intensified characteristics of speech, skewing listeners' perceptions. Another consideration lies in listeners' individual perceptions. Studies typically take the average of responses, but few examine individual differences in great depth. Examining these differences may provide better insight into the vocal expression of emotions.[5]
References
- PMID 9527153.
- PMID 29615944.
- "The Social and Emotional Voice" (PDF). Archived from the original (PDF) on 3 February 2014. Retrieved 29 March 2012.
- S2CID 18785659.
- PMID 20437296.
- PMID 22087275.
- S2CID 143618738.
- PMID 2438386.
- S2CID 205555217.
- Hoekert, L. M. (2009). "Impaired recognition and expression of emotional prosody in schizophrenia: review and meta-analysis" (PDF). Beyond what is being said: emotional prosody.
- PMID 23914178.
- S2CID 45581597.