Deriving Entity-Specific Embeddings From Multi-Entity Sequences

Written by
Connor Heaton, Prasenjit Mitra
Abstract

Underpinning much of the recent progress in deep learning is the transformer architecture, which takes as input a sequence of embeddings E and emits an updated sequence of embeddings E′. A special [CLS] embedding is often included in this sequence, serving as a description of the processed sequence and used as the basis for subsequent sequence-level tasks. The processed [CLS] embedding loses utility, however, when the model is presented with a multi-entity sequence and asked to perform an entity-specific task. When processing a multi-speaker dialogue, for example, the [CLS] embedding describes the entire dialogue, not any individual utterance or speaker. Existing methods for entity-specific prediction involve redundant computation or post-processing outside of the transformer. We present a novel methodology for deriving entity-specific embeddings from a multi-entity sequence completely within the transformer, with a loose definition of "entity" amenable to many problem spaces. To show the generic applicability of our method, we apply it to two widely different tasks, emotion recognition in conversation and player performance projection in baseball, and show that it achieves state-of-the-art (SOTA) results in both. Code can be found at github.com/c-heat16/EntitySpecificEmbeddings.

Organizational unit(s)
Forschungszentrum L3S
External organization(s)
Pennsylvania State University
Type
Article in conference proceedings
Pages
4675-4684
Number of pages
10
Publication date
2024
Publication status
Published
Peer-reviewed
Yes
ASJC Scopus subject areas
Theoretical Computer Science, Computational Theory and Mathematics, Computer Science Applications
Electronic version(s)
https://aclanthology.org/2024.lrec-main.418/ (Access: Open)