HL Dataset: Grounding High-Level Linguistic Concepts in Vision

Cafagna, M; van Deemter, K; Gatt, A

DOI: https://doi.org/10.18653/v1/2023.inlg-main.21

(2023) Proceedings of the 16th International Natural Language Generation Conference (INLG'23)

(Part of book)

Abstract

Current captioning datasets focus on object-centric captions, describing the visible objects in the image, often ending up stating the obvious (for humans), e.g. “people eating food in a park”. Although these datasets are useful to evaluate the ability of Vision & Language models to recognize and describe visual content, they ...

Keywords: vision and language, natural language generation

Publisher: Association for Computational Linguistics

(Peer reviewed)
