The margarita dialogue corpus: A data set for time-offset interactions and unstructured dialogue systems

Abstract

Time-Offset Interaction Applications (TOIAs) are systems that simulate face-to-face conversations between humans and digital human avatars recorded in the past. Developing a well-functioning TOIA involves several research areas: artificial intelligence, human-computer interaction, natural language processing, question answering, and dialogue systems. The first challenges are to define a sensible methodology for data collection and to create useful data sets for training the system to retrieve the best answer to a user's question. In this paper, we present three main contributions: a methodology for creating the knowledge base for a TOIA, a dialogue corpus, and baselines for single-turn answer retrieval. We develop the methodology using a two-step strategy. First, we let the avatar maker list question-answer pairs by intuition, guessing which questions a user may ask the avatar. Second, we record actual dialogues between random individuals and the avatar maker. We make the Margarita Dialogue Corpus available to the research community. This corpus comprises the knowledge base in text format, the video clips for each answer, and the annotated dialogues.
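
To make the single-turn answer retrieval task concrete, the following is a minimal sketch and not one of the paper's baselines. It assumes the knowledge base is a list of (question, answer) text pairs and ranks stored questions by bag-of-words cosine similarity to the user's question. The function names, the `KNOWLEDGE_BASE` entries, and the choice of similarity measure are all illustrative assumptions; none of them come from the Margarita Dialogue Corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_answer(user_question, knowledge_base):
    """Return (stored_question, answer, score) for the best-matching pair.

    Only the current user question is used (single-turn): no dialogue
    history is taken into account.
    """
    query = Counter(tokenize(user_question))
    scored = [
        (cosine(query, Counter(tokenize(question))), question, answer)
        for question, answer in knowledge_base
    ]
    score, question, answer = max(scored, key=lambda item: item[0])
    return question, answer, score


# Hypothetical knowledge base, invented for illustration only.
KNOWLEDGE_BASE = [
    ("Where are you from?", "I grew up in a small town by the sea."),
    ("What do you do for work?", "I work as a researcher."),
    ("What is your favorite food?", "I love homemade pasta."),
]

if __name__ == "__main__":
    question, answer, score = retrieve_answer(
        "Where do you come from?", KNOWLEDGE_BASE
    )
    print(f"matched: {question!r} (score={score:.2f})")
    print(f"answer:  {answer}")
```

In a TOIA, the retrieved answer would correspond to a pre-recorded video clip that the system plays back; a practical system would likely replace the word-overlap score with a learned sentence representation.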

Cite

Chierici, A. M., Habash, N., & Bicec, M. (2020). The margarita dialogue corpus: A data set for time-offset interactions and unstructured dialogue systems. In LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (pp. 476–484). European Language Resources Association (ELRA).
