MMChat: Multi-Modal Chat Dataset on Social Media

Zheng, Yinhe; Chen, Guanyi; Liu, Xin; Lin, Ke

doi:https://doi.org/10.48550/arXiv.2108.07154

(2021) Utrecht University Repository

(Preprint)

Abstract

Incorporating multi-modal contexts in conversation is an important step for developing more engaging dialogue systems. In this work, we explore this direction by introducing MMChat: a large-scale multi-modal dialogue corpus (32.4M raw dialogues and 120.84K filtered dialogues). Unlike previous corpora that are crowd-sourced or collected from fictitious movies, MMChat ...

Keywords: cs.CL, cs.CV

Publisher: arXiv
