Deriving Entity-Specific Embeddings From Multi-Entity Sequences
- Authored by
- Connor Heaton, Prasenjit Mitra
- Abstract
Underpinning much of the recent progress in deep learning is the transformer architecture, which takes as input a sequence of embeddings E and emits an updated sequence of embeddings E′. A special [CLS] embedding is often included in this sequence; once processed, it serves as a summary of the whole sequence and as the basis for subsequent sequence-level tasks. The processed [CLS] embedding loses utility, however, when the model is presented with a multi-entity sequence and asked to perform an entity-specific task. When processing a multi-speaker dialogue, for example, the [CLS] embedding describes the entire dialogue, not any individual utterance or speaker. Existing methods for entity-specific prediction involve redundant computation or post-processing outside of the transformer. We present a novel methodology for deriving entity-specific embeddings from a multi-entity sequence completely within the transformer, with a loose definition of "entity" amenable to many problem spaces. To show the generic applicability of our method, we apply it to two widely different tasks, emotion recognition in conversation and player performance projection in baseball, and show that it can be used to achieve SOTA in both. Code can be found at github.com/c-heat16/EntitySpecificEmbeddings.
- Organisational unit(s)
- Forschungszentrum L3S
- External organisation(s)
- Pennsylvania State University
- Type
- Conference paper
- Pages
- 4675-4684
- Number of pages
- 10
- Publication date
- 2024
- Publication status
- Published
- Peer-reviewed
- Yes
- ASJC Scopus subject areas
- Theoretical Computer Science, Computational Theory and Mathematics, Computer Science Applications
- Electronic version(s)
- https://aclanthology.org/2024.lrec-main.418/ (Access: Open)