An Encoder-decoder based analysis of Lip Reading Using Various Deep Learning Approaches for Bangla Word Level Corpus

Abstract
Lip reading is the decoding of speech from a speaker's lip movements. It is especially useful in environments where background noise makes hearing difficult. Its applications include the analysis of surveillance videos, silent dictation in public places, silent movie processing, hearing aids, and more. Human performance in lip reading is fairly poor, which indicates how difficult the task is. Hence, the focus has shifted to machine learning. Lip reading has not yet been applied to Bangla speech, and this project aims to address that gap. Here, various approaches such as MobileNet, Inception, custom models, and LSTM are explored for word-level prediction on our own Bangla dataset.
Department Name
Electrical and Computer Engineering
Publisher
North South University
Printed Thesis