Deriving Entity-Specific Embeddings From Multi-Entity Sequences

authored by
Connor Heaton, Prasenjit Mitra
Abstract

Underpinning much of the recent progress in deep learning is the transformer architecture, which takes as input a sequence of embeddings E and emits an updated sequence of embeddings E′. A special [CLS] embedding is often included in this sequence, serving as a description of the sequence once processed and used as the basis for subsequent sequence-level tasks. The processed [CLS] embedding loses utility, however, when the model is presented with a multi-entity sequence and asked to perform an entity-specific task. When processing a multi-speaker dialogue, for example, the [CLS] embedding describes the entire dialogue, not any individual utterance/speaker. Existing methods toward entity-specific prediction involve redundant computation or post-processing outside of the transformer. We present a novel methodology for deriving entity-specific embeddings from a multi-entity sequence completely within the transformer, with a loose definition of entity amenable to many problem spaces. To show the generic applicability of our method, we apply it to widely different tasks: emotion recognition in conversation and player performance projection in baseball, and show that it can be used to achieve SOTA in both. Code can be found at github.com/c-heat16/EntitySpecificEmbeddings.
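The abstract contrasts the sequence-level [CLS] embedding with entity-specific representations obtained by post-processing outside the transformer. The sketch below is a minimal illustration of that contrast on the transformer's output E′; it is not the authors' in-transformer method, and the tensor shapes, the entity_ids convention, and the function names are assumptions made for illustration only.

# Illustrative sketch (not the authors' method): [CLS] vs. post-hoc entity pooling.
import torch

def cls_embedding(E_prime: torch.Tensor) -> torch.Tensor:
    # Sequence-level description: the processed [CLS] embedding at position 0.
    return E_prime[:, 0, :]                               # (batch, hidden)

def entity_embeddings(E_prime: torch.Tensor, entity_ids: torch.Tensor,
                      num_entities: int) -> torch.Tensor:
    # Entity-specific descriptions via mean pooling outside the transformer.
    # entity_ids: (batch, seq_len) map assigning each position to an entity
    # (e.g., a speaker in a dialogue); -1 marks positions with no entity.
    batch, seq_len, hidden = E_prime.shape
    out = torch.zeros(batch, num_entities, hidden)
    for e in range(num_entities):
        mask = (entity_ids == e).unsqueeze(-1).float()    # (batch, seq_len, 1)
        denom = mask.sum(dim=1).clamp(min=1.0)            # avoid divide-by-zero
        out[:, e, :] = (E_prime * mask).sum(dim=1) / denom
    return out                                            # (batch, num_entities, hidden)

# Example: a 2-speaker dialogue encoded into E' of shape (1, 6, 8).
E_prime = torch.randn(1, 6, 8)
entity_ids = torch.tensor([[-1, 0, 0, 1, 1, 0]])          # position 0 is [CLS]
seq_repr = cls_embedding(E_prime)                         # whole-dialogue description
per_speaker = entity_embeddings(E_prime, entity_ids, num_entities=2)

The paper's contribution is to obtain such entity-specific embeddings entirely within the transformer rather than through external pooling of this kind.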

Organisation(s)
L3S Research Centre
External Organisation(s)
Pennsylvania State University
Type
Conference contribution
Pages
4675-4684
No. of pages
10
Publication date
2024
Publication status
Published
Peer reviewed
Yes
ASJC Scopus subject areas
Theoretical Computer Science, Computational Theory and Mathematics, Computer Science Applications
Electronic version(s)
https://aclanthology.org/2024.lrec-main.418/ (Access: Open)