A Study of Data Augmentation for ASR Robustness in Low Bit Rate Contact Center Recordings Including Packet Losses


Abstract

Client conversations in contact centers are nowadays routinely recorded for a number of reasons; in many cases, simply because current legislation requires it. However, even when recording is not required, conversations between customers and agents can be a valuable source of information about current or prospective clients, call center agents, market trends, etc. Analyzing these recordings provides an excellent opportunity to gain insight into the business and its possibilities. The current state of the art in Automatic Speech Recognition (ASR) allows this information to be effectively extracted and used. However, conversations are usually stored in highly compressed formats to save space and, because of the common use of Voice-over-IP (VoIP) in these systems, typically contain packet losses that produce short interruptions in the speech signal. These effects, and especially packet loss, have a negative impact on ASR performance. This article presents an extensive study of the impact of these effects on modern ASR systems and of the effectiveness of several data augmentation techniques in increasing their robustness. In addition, the well-known ITU-T G.711 Packet Loss Concealment (PLC) method is applied in combination with data augmentation techniques to analyze the improvement in ASR performance on signals affected by packet losses.
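As a rough illustration of the kind of degradation described in the abstract, the sketch below simulates packet loss on a mono signal by zeroing randomly chosen fixed-length packets, which is one simple way to produce augmented training audio with short interruptions. It is not the authors' procedure: the function name `simulate_packet_loss`, the 8 kHz sample rate, the 20 ms packet length, and the independent per-packet loss model are all assumptions made for this example.

```python
import random


def simulate_packet_loss(samples, sample_rate=8000, packet_ms=20,
                         loss_rate=0.1, seed=None):
    """Return a copy of `samples` with randomly chosen packets zeroed.

    Hypothetical helper: parameter names and defaults are illustrative
    assumptions, not values taken from the article.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be in [0, 1]")
    rng = random.Random(seed)
    # Number of samples carried by one packet (at least one sample).
    packet_len = max(1, sample_rate * packet_ms // 1000)
    out = list(samples)
    lost = []  # indices of the packets that were dropped
    for start in range(0, len(out), packet_len):
        # Each packet is dropped independently with probability loss_rate.
        if rng.random() < loss_rate:
            end = min(start + packet_len, len(out))
            out[start:end] = [0.0] * (end - start)
            lost.append(start // packet_len)
    return out, lost


if __name__ == "__main__":
    signal = [1.0] * 1600  # 200 ms of dummy audio at 8 kHz
    degraded, lost_packets = simulate_packet_loss(signal, loss_rate=0.2, seed=0)
    print("lost packet indices:", lost_packets)
```

Real VoIP losses tend to occur in bursts, so a more realistic variant might replace the independent draw with a two-state (good/bad) channel model. A concealment method would then fill the zeroed packets instead of leaving silence.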

Cite

Fernández-Gallego, M. P., & Toledano, D. T. (2022). A Study of Data Augmentation for ASR Robustness in Low Bit Rate Contact Center Recordings Including Packet Losses. Applied Sciences (Switzerland), 12(3). https://doi.org/10.3390/app12031580
