Simulating audio-visual asynchrony (AVA) is one of the essential issues to be researched when the video signal is used alongside the audio signal in speech processing applications. AVA analysis deals with estimating the asynchrony between the audio and visual speech signals produced during the articulation of phonemes and allophones. Only a few works in the literature have addressed this specific problem, which indicates that more exploration is needed to tackle this open research issue. An audio-visual Malayalam speech database containing 50 phonemes and 106 allophones from five indigenous speakers has been created. The recorded visual information consists of the complete facial area captured from a frontal perspective. Time annotation of the audio and video signals is performed manually. The durations of the audio and video signals of every phoneme and allophone are estimated from the time-annotated audio-visual database, and asynchrony is then estimated as their difference. Asynchrony analysis was performed separately for phonemes and allophones to underline the coarticulation effect. Multimodal speech recognition has greater accuracy than audio-only speech recognition, especially in noisy environments. AVA plays a vital role in applications such as multimodal speech recognition and synthesis, automatic redubbing, etc.
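The asynchrony estimate described above reduces to a duration difference between time-annotated segments. A minimal sketch of that computation, with entirely hypothetical annotation values (the segment boundaries and the `asynchrony` helper are illustrative assumptions, not the authors' actual tooling):

```python
# Hypothetical sketch: each annotation gives (start, end) times in seconds
# for one phoneme's audio and visual realizations; asynchrony is taken as
# the difference between the two durations, as described in the abstract.

def duration(start, end):
    """Segment duration in seconds."""
    return end - start

def asynchrony(audio_seg, visual_seg):
    """Audio-visual asynchrony = audio duration minus visual duration."""
    return duration(*audio_seg) - duration(*visual_seg)

# Made-up example annotations for a single phoneme
audio = (0.120, 0.310)   # audio segment: 0.190 s
visual = (0.100, 0.335)  # visual segment: 0.235 s
print(round(asynchrony(audio, visual), 3))  # -0.045
```

A negative value here would mean the visual articulation outlasts the acoustic segment, which is one way the coarticulation effect mentioned above can surface in the annotations.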
Citation: Bibish Kumar, K. T., John, S., Muraleedharan, K. M., & Sunil Kumar, R. K. (2019). Audio-visual asynchrony in Malayalam phonemes and allophones. International Journal of Recent Technology and Engineering, 8(3), 8359–8362. https://doi.org/10.35940/ijrte.C6468.098319