Lip reading
Lip reading, also known as speechreading, is a technique of understanding speech by visually interpreting the movements of the lips, face and tongue when normal sound is not available. Estimates of how much speech can be understood by lip reading alone vary, with some figures as low as 30%, because lip reading also relies on context, language knowledge, and any residual hearing.[1] Although lip reading is used most extensively by deaf and hard-of-hearing people, most people with normal hearing also process some speech information from sight of the moving mouth.[2]
Process
Although speech perception is considered to be an auditory skill, it is intrinsically multimodal, since producing speech requires the speaker to make movements of the lips, teeth and tongue which are often visible in face-to-face communication. Information from the lips and face supports aural comprehension[3] and most fluent listeners of a language are sensitive to seen speech actions (see McGurk effect). The extent to which people make use of seen speech actions varies with the visibility of the speech action and the knowledge and skill of the perceiver.
Phonemes and visemes
The phoneme is the smallest distinctive unit of sound in a language, but many phonemes that sound different are produced with mouth movements that look alike. A viseme is a group of phonemes that are visually indistinguishable on the lips: for example, /p/, /b/ and /m/ form a single viseme in English. Because a language has far fewer visemes than phonemes, much of the information in visible speech is ambiguous to the eye.[4]
Co-articulation
Visemes can be captured as still images, but speech unfolds in time. The smooth articulation of speech sounds in sequence can mean that mouth patterns may be 'shaped' by an adjacent phoneme: the 'th' sound in 'tooth' and in 'teeth' appears very different because of the vocalic context. This feature of dynamic speech-reading affects lip-reading 'beyond the viseme'.[5]
How can it 'work' with so few visemes?
While visemes offer a useful starting point for understanding lipreading, fine-grained spoken distinctions within a viseme class can nevertheless be perceived to some extent, and these can help support word identification.[6] Moreover, the statistical distribution of phonemes within the lexicon of a language is uneven. While there are clusters of words which are phonemically similar to each other ('lexical neighbors', such as spit/sip/sit/stick, etc.), others are unlike all other words: they are 'unique' in terms of the distribution of their phonemes ('umbrella' may be an example). Skilled users of the language bring this knowledge to bear when interpreting speech: it is generally harder to identify a heard word with many lexical neighbors than one with few. Applying this insight to seen speech, some words in the language can be unambiguously lip-read even when they contain few visemes, simply because no other words could possibly 'fit'.[7]
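The 'uniqueness' argument above can be illustrated with a short sketch. The viseme table and toy lexicon below are hypothetical simplifications (real viseme inventories differ between studies); the point is only that collapsing phonemes into visemes still leaves some words with no visual neighbors.

```python
from collections import Counter

# Hypothetical, highly simplified viseme table: each letter-phoneme is
# mapped to a coarse visual class. Real inventories are larger and vary
# between studies.
VISEME = {
    'p': 'P', 'b': 'P', 'm': 'P',                       # bilabials look alike
    'f': 'F', 'v': 'F',                                 # labiodentals look alike
    's': 'S', 'z': 'S', 't': 'S', 'd': 'S', 'n': 'S',   # coronals (very coarse)
    'k': 'K', 'g': 'K',
    'l': 'L', 'r': 'L',
    'i': 'I', 'e': 'I',
    'a': 'A', 'o': 'A', 'u': 'A',
}

def to_visemes(word):
    """Collapse a word's phoneme string into its viseme transcription."""
    return ''.join(VISEME.get(p, '?') for p in word)

def unique_by_sight(lexicon):
    """Return words whose viseme transcription matches no other word's."""
    counts = Counter(to_visemes(w) for w in lexicon)
    return [w for w in lexicon if counts[to_visemes(w)] == 1]

lexicon = ['pat', 'bat', 'mat', 'fan', 'van', 'lake']
# 'pat'/'bat'/'mat' collapse to one viseme string, as do 'fan'/'van';
# 'lake' has no visemic neighbor in this toy lexicon.
print(unique_by_sight(lexicon))  # prints ['lake']
```

Under this (assumed) mapping, 'pat', 'bat' and 'mat' are indistinguishable by sight, so only context could separate them, while 'lake' remains identifiable despite the small viseme inventory.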
Variation in readability and skill
Many factors affect the visibility of a speaking face, including illumination, movement of the head/camera, frame-rate of the moving image and distance from the viewer (see e.g.[8]). Head movement that accompanies normal speech can also improve lip-reading, independently of oral actions.[9] However, when lip-reading connected speech, the viewer's knowledge of the spoken language, familiarity with the speaker and style of speech, and the context of the lip-read material[10] are as important as the visibility of the speaker. While most hearing people are sensitive to seen speech, there is great variability in individual speechreading skill. Good lipreaders are often more accurate than poor lipreaders at identifying phonemes from visual speech.
A simple visemic measure of 'lipreadability' has been questioned by some researchers.
Lipreading and language learning in hearing infants and children
The first few months
Seeing the mouth plays a role in the very young infant's early sensitivity to speech and prepares them to become speakers at one to two years of age. In order to imitate, a baby must learn to shape their lips in accordance with the sounds they hear; seeing the speaker may help them to do this.[16] Newborns imitate adult mouth movements such as sticking out the tongue or opening the mouth, which could be a precursor to further imitation and later language learning.[17] Infants are disturbed when the audiovisual speech of a familiar speaker is desynchronized[18] and tend to show different looking patterns for familiar than for unfamiliar faces when these are matched to (recorded) voices.[19] Infants are sensitive to McGurk illusions months before they have learned to speak.[20][21] These studies and many more point to a role for vision in the development of sensitivity to (auditory) speech in the first half-year of life.
The next six months; a role in learning a native language
Until around six months of age, most hearing infants are sensitive to a wide range of speech gestures, including ones that can be seen on the mouth, which may or may not later be part of the phonology of their native language. In the second six months of life, however, the hearing infant shows perceptual narrowing for the phonetic structure of their own language and may lose the early sensitivity to mouth patterns that are not useful. The speech sounds /v/ and /b/, which are visemically distinctive in English but not in Castilian Spanish, are accurately distinguished by both Spanish-exposed and English-exposed babies up to around six months of age. However, older Spanish-exposed infants lose the ability to 'see' this distinction, while English-exposed infants retain it.[22] Such studies suggest that, rather than hearing and vision developing independently in infancy, multimodal processing is the rule, not the exception, in the (language) development of the infant brain.[23]
Early language production: one to two years
Given the many studies indicating a role for vision in the development of language in the pre-lingual infant, the effects of congenital blindness on language development are surprisingly small. Eighteen-month-olds learn new words more readily when they hear them, and do not learn them when they are shown the speech movements without sound.[24] However, children blind from birth can confuse /m/ and /n/ in their own early production of English words, a confusion rarely seen in sighted hearing children, since /m/ and /n/ are visibly distinctive but auditorily confusable.[25] The role of vision in children aged 1–2 years may be less critical to the production of their native language, since, by that age, they have attained the skills they need to identify and imitate speech sounds. However, hearing a non-native language can shift the child's attention to visual and auditory engagement by way of lipreading and listening in order to process, understand and produce speech.[26]
In childhood
Studies with pre-lingual infants and children use indirect, non-verbal measures to indicate sensitivity to seen speech. Explicit lip-reading can be reliably tested in hearing preschoolers by asking them to 'say aloud what I say silently'.[27] In school-age children, lipreading of familiar closed-set words such as number words can be readily elicited.[28] Individual differences in lip-reading skill, as tested by asking the child to 'speak the word that you lip-read', or by matching a lip-read utterance to a picture,[29] show a relationship between lip-reading skill and age.[30][31]
In hearing adults: lifespan considerations
While lip-reading silent speech poses a challenge for most hearing people, adding sight of the speaker to heard speech improves speech processing under many conditions. The mechanisms for this, and the precise ways in which lip-reading helps, are topics of current research.[32] Seeing the speaker helps at all levels of speech processing, from phonetic feature discrimination to the interpretation of pragmatic utterances.[33] The positive effects of adding vision to heard speech are greater in noisy than in quiet environments;[34] by making speech perception easier, seeing the speaker can free up cognitive resources, enabling deeper processing of speech content.
As hearing becomes less reliable in old age, people may tend to rely more on lip-reading, and are encouraged to do so. However, greater reliance on lip-reading may not always offset the effects of age-related hearing loss. Cognitive decline in aging may be preceded by and/or associated with measurable hearing loss.[35][36] Thus lipreading may not always be able to fully compensate for the combined age-related decrements in hearing and cognition.
In specific (hearing) populations
A number of studies report anomalies of lipreading in populations with distinctive developmental disorders.
Deafness
Debate has raged for hundreds of years over the role of lip-reading ('oralism') compared with other communication methods (most recently, total communication) in the education of deaf people. The extent to which one or other approach is beneficial depends on a range of factors, including level of hearing loss of the deaf person, age of hearing loss, parental involvement and parental language(s). Then there is a question concerning the aims of the deaf person and their community and carers. Is the aim of education to enhance communication generally, to develop sign language as a first language, or to develop skills in the spoken language of the hearing community? Researchers now focus on which aspects of language and communication may be best delivered by what means and in which contexts, given the hearing status of the child and her family, and their educational plans.[43] Bimodal bilingualism (proficiency in both speech and sign language) is one dominant current approach in language education for the deaf child.[44]
Deaf people are often better lip-readers than people with normal hearing.
In connection with lipreading and literacy development, children born deaf typically show delayed development of literacy skills[50] which can reflect difficulties in acquiring elements of the spoken language.[51] In particular, reliable phoneme-grapheme mapping may be more difficult for deaf children, who need to be skilled speech-readers in order to master this necessary step in literacy acquisition. Lip-reading skill is associated with literacy abilities in deaf adults and children[52][53] and training in lipreading may help to develop literacy skills.[54]
Teaching and training
The aim of teaching and training in lipreading is to develop awareness of the nature of lipreading, and to practice ways of improving the ability to perceive speech 'by eye'.
Trainers recognise that lipreading is an inexact art. Students are taught to watch the lips, tongue and jaw movements, to follow the stress and rhythm of language, to use their residual hearing (with or without hearing aids), to watch expression and body language, and to use their ability to reason and deduce. They are taught the lipreaders' alphabet: groups of sounds that look alike on the lips (visemes), such as p, b and m, or f and v. The aim is to get the gist, so as to have the confidence to join in conversation and avoid the damaging social isolation that often accompanies hearing loss. Lipreading classes are recommended for anyone who struggles to hear in noise, and they help with adjusting to hearing loss.
Tests
Most tests of lipreading were devised to measure individual differences in performing specific speech-processing tasks and to detect changes in performance following training. Lipreading tests have been used with relatively small groups in experimental settings, or as clinical indicators with individual patients and clients. Consequently, most lipreading tests to date have limited validity as markers of lipreading skill in the general population.[60]
Lipreading and lip-speaking by machine
Uses for machine lipreading could include automated lipreading of video-only records, automated lipreading of speakers with damaged vocal tracts, and speech processing in face-to-face video (e.g. from videophone data). Automated lipreading may also help in processing noisy or unfamiliar speech.
The brain
Following the discovery that silent speechreading can activate auditory regions of the brain in hearing people, research has focused on identifying the cortical networks that support the perception of visible speech and on how they overlap with those engaged by heard speech.
References
- PMID 21786870.
- PMID 18821117.
- PMID 5808871.
- PMID 7162162.
- PMID 26217249.
- PMID 20211120.
- PMID 21842332.
- PMID 15462626.
- PMID 25863923.
- PMID 26217249.
- PMID 9407662.
- ^ Feld J., Sommers M. (2011). There Goes the Neighborhood: Lipreading and the Structure of the Mental Lexicon. Speech Communication 53(2):220-228.
- PMID 24129010.
- PMID 19717657.
- ^ "HuffPost - Breaking News, U.S. and World News". HuffPost. Retrieved 2020-10-11.
- S2CID 30956329.
- ^ Dodd B. (1976). Lip reading in infants: attention to speech presented in- and out-of-synchrony. Cognitive Psychology 11(4):478-84.
- S2CID 1226796.
- PMID 15549685.
- PMID 9136265.
- PMID 19541648.
- S2CID 14289579.
- ^ Havy, M., Foroud, A., Fais, L., & Werker, J.F. (in press; online January 26, 2017). The role of auditory and visual speech in word-learning at 18 months and in adulthood. Child Development. (Pre-print version)
- ^ Mills, A.E. 1987 The development of phonology in the blind child. In B.Dodd & R.Campbell(Eds) Hearing by Eye: the psychology of lipreading, Hove UK, Lawrence Erlbaum Associates
- PMID 22307596.
- PMID 18608607.
- ^ Dodd B. (1987). The acquisition of lipreading skills by normally hearing children. In B. Dodd & R. Campbell (Eds), Hearing by Eye, Erlbaum, NJ, pp. 163-176.
- PMID 18829049.
- PMID 23275416.
- PMID 24129010.
- PMID 25890390.
- PMID 17827105.
- S2CID 5327755.
- PMID 25986155.
- PMID 17683453.
- PMID 21790542.
- PMID 24847297.
- S2CID 9125298.
- PMID 24904454.
- S2CID 34877573.
- ^ "Hands & Voices :: Articles".
- S2CID 146626144.
- PMID 10723205.
- PMID 15809542.
- PMID 28223951.
- ^ "Communication support for deaf people". 2015-11-24.
- ^ "Lipspeaker UK - Communication services for deaf & hard of hearing people".
- ^ "Reading and dyslexia in deaf children | Nuffield Foundation". 19 November 2019.
- PMID 17566067.
- S2CID 34877573.
- PMID 20570282.
- PMID 23275416.
- PMID 7120965.
- PMID 20724357.
- ^ "Lipreading Alphabet: Round Vowels". Archived from the original on 2014-06-23. Retrieved 2014-06-23.
- PMID 35316072.
- ^ "Campaigns and influencing".
- PMID 21786870.
- ^ "Home > Rachel-Walker > USC Dana and David Dornsife College of Letters, Arts and Sciences" (PDF).
- ^ "Rule-Based Visual Speech Synthesis". 1995. pp. 299–302.
- S2CID 17406145.
- ^ "Visual Speech Synthesis - UEA".
- ^ "Lip-reading computer can distinguish languages".
- ^ Archived at Ghostarchive and the Wayback Machine: "Video to Text: Lip reading and word spotting". YouTube.
- ^ Hickey, Shane (2016-04-24). "The innovators: Can computers be taught to lip-read?". The Guardian.
- ^ "Google's DeepMind AI can lip-read TV shows better than a pro".
- S2CID 15211759.
- ^ Luettin, Juergen; Thacker, Neil A.; Beet, Steve W. "Speaker Identification by Lipreading" (PDF). Applied Science and Engineering Laboratories.
- ^ http://www.planetbiometrics.com-article-details-i-2250[permanent dead link]
- PMID 9110978.
- PMID 25520611.
- PMID 17218482.
- PMID 11587893.
- PMID 23644583.
- S2CID 9035639.
- PMID 15647358.
- ^ Hall DA, Fussell C, Summerfield AQ. (2005). Reading fluent speech from talking faces: typical brain networks and individual differences. J. Cogn. Neurosci. 17(6):939-53.
- PMID 20853377.
- PMID 18249420.
Bibliography
- D. Stork and M. Henneke (Eds) (1996). Speechreading by Humans and Machines: Models, Systems and Applications. NATO ASI Series F: Computer and Systems Sciences, Vol. 150. Springer, Berlin, Germany
- G. Bailly, P. Perrier and E. Vatikiotis-Bateson (Eds) (2012). Audiovisual Speech Processing. Cambridge University Press, Cambridge, UK
- B. Dodd and R. Campbell (Eds) (1987). Hearing by Eye: The Psychology of Lip-reading. Lawrence Erlbaum Associates, Hillsdale, NJ, USA; R. Campbell, B. Dodd and D. Burnham (Eds) (1997). Hearing by Eye II. Psychology Press, Hove, UK
- D. W. Massaro (1987, reprinted 2014). Speech Perception by Ear and by Eye. Lawrence Erlbaum Associates, Hillsdale, NJ
Further reading
- Dan Nosowitz (18 Feb 2020). "What Is the Hardest Language in the World to Lipread?". Atlas Obscura.
- Laura Ringham (2012). "Why it's time to recognise the value of lipreading and managing hearing loss support (Action on Hearing Loss, full report)" (PDF).
See also
- Automated Lip Reading (ALR)
External links
- Scottish Sensory Centre 2005: workshop on lipreading
- Lipreading Classes in Scotland: the way forward. 2015 Report
- AVISA; International Speech Communication Association special interest group focussed on lip-reading and audiovisual speech
- Speechreading for information gathering: a survey of scientific sources