LipNet is a deep neural network for visual speech recognition. It was created by Yannis Assael, Brendan Shillingford, Shimon Whiteson, and Nando de Freitas, researchers at the University of Oxford. The technique, outlined in a paper published in November 2016, decodes text from the movement of a speaker's mouth. Traditional visual speech recognition approaches separated the problem into two stages: designing or learning visual features, and prediction. LipNet was the first end-to-end sentence-level lipreading model to learn spatiotemporal visual features and a sequence model simultaneously. Audio-visual speech recognition has enormous practical potential, with applications in improved hearing aids, medical settings such as improving the recovery and wellbeing of critically ill patients, and speech recognition in noisy environments.
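The end-to-end design pairs a spatiotemporal convolutional front-end with a recurrent sequence model and a connectionist temporal classification (CTC) loss, so the network maps raw video frames directly to character sequences without frame-level labels. Below is a minimal PyTorch sketch of that idea; the layer sizes, the pooling choices, and the spatial-averaging step are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class LipNetSketch(nn.Module):
    """LipNet-style model: 3D convolutions over (time, height, width),
    then a bidirectional GRU, trained end-to-end with a CTC loss.
    Hyperparameters here are illustrative, not the published ones."""

    def __init__(self, vocab_size: int = 28):  # e.g. 26 letters + space + CTC blank
        super().__init__()
        # Spatiotemporal feature extractor over the video volume
        self.frontend = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),  # pool spatially, keep time
            nn.Conv3d(32, 64, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
        )
        # Sequence model over per-frame features
        self.gru = nn.GRU(input_size=64, hidden_size=128,
                          num_layers=2, bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * 128, vocab_size)

    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (batch, channels, time, height, width)
        feats = self.frontend(video)        # (B, C, T, H, W)
        feats = feats.mean(dim=(3, 4))      # average spatially -> (B, C, T)
        feats = feats.transpose(1, 2)       # (B, T, C) for the GRU
        out, _ = self.gru(feats)            # (B, T, 2 * hidden)
        return self.classifier(out)         # per-frame character logits

# CTC aligns per-frame predictions with the target sentence, which is what
# lets the model train on sentence labels without frame-level alignment.
model = LipNetSketch()
video = torch.randn(2, 3, 75, 50, 100)              # 2 clips of 75 frames
logits = model(video)                                # (2, 75, 28)
log_probs = logits.log_softmax(-1).transpose(0, 1)   # CTCLoss expects (T, B, V)
targets = torch.randint(1, 28, (2, 20))              # dummy character targets
loss = nn.CTCLoss(blank=0)(log_probs, targets,
                           input_lengths=torch.full((2,), 75),
                           target_lengths=torch.full((2,), 20))
```

The key design point the sketch illustrates is that the convolutional features and the sequence model share one loss, so both are optimized jointly rather than in separate feature-design and prediction stages.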