Towards Incremental Transformers: An Empirical Analysis of Transformer Models for Incremental NLU
Main Authors:
Format: Article
Language: English
Subjects:
Online Access: Order full text
Summary: Incremental processing allows interactive systems to respond based on partial
inputs, which is a desirable property e.g. in dialogue agents. The currently
popular Transformer architecture inherently processes sequences as a whole,
abstracting away the notion of time. Recent work attempts to apply Transformers
incrementally via restart-incrementality by repeatedly feeding, to an unchanged
model, increasingly longer input prefixes to produce partial outputs. However,
this approach is computationally costly and does not scale efficiently for long
sequences. In parallel, we witness efforts to make Transformers more efficient,
e.g. the Linear Transformer (LT) with a recurrence mechanism. In this work, we
examine the feasibility of LT for incremental NLU in English. Our results show
that the recurrent LT model has better incremental performance and faster
inference speed compared to the standard Transformer and LT with
restart-incrementality, at the cost of part of the non-incremental (full
sequence) quality. We show that the performance drop can be mitigated by
training the model to wait for right context before committing to an output and
that training with input prefixes is beneficial for delivering correct partial
outputs.
DOI: 10.48550/arxiv.2109.07364
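
To make the contrast in the abstract concrete, here is a minimal NumPy sketch of the two processing regimes it mentions: restart-incrementality, which re-runs an unchanged model on ever longer input prefixes, and the recurrent formulation of linear attention that the Linear Transformer builds on (Katharopoulos et al., 2020). The `model` callable, the function names, and the demo shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


# --- Restart-incrementality: re-run an unchanged model on growing prefixes ---
# `model` is a placeholder for any non-incremental sequence labeller that
# returns one output per input token; it is an assumption, not the paper's code.
def restart_incremental(model, tokens):
    """Feed increasingly longer prefixes and collect the partial outputs.

    Every step re-encodes the whole prefix from scratch, so total cost grows
    quadratically with sequence length, and earlier outputs may be revised.
    """
    partial_outputs = []
    for t in range(1, len(tokens) + 1):
        partial_outputs.append(model(tokens[:t]))  # full recomputation per prefix
    return partial_outputs


# --- Linear Transformer recurrence: constant-size state per new token --------
def elu_feature_map(x):
    """phi(x) = elu(x) + 1, the feature map used by the Linear Transformer."""
    return np.where(x > 0, x + 1.0, np.exp(x))


def recurrent_linear_attention(queries, keys, values, eps=1e-6):
    """Causal linear attention computed token by token.

    S accumulates phi(k_t) v_t^T and z accumulates phi(k_t); a new token only
    updates this state, so earlier tokens are never reprocessed.
    """
    d_k, d_v = keys.shape[1], values.shape[1]
    S = np.zeros((d_k, d_v))
    z = np.zeros(d_k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_q, phi_k = elu_feature_map(q), elu_feature_map(k)
        S += np.outer(phi_k, v)                        # state update
        z += phi_k
        outputs.append((phi_q @ S) / (phi_q @ z + eps))
    return np.stack(outputs)


if __name__ == "__main__":
    # Tiny demo with random projections standing in for learned Q/K/V.
    rng = np.random.default_rng(0)
    Q = rng.normal(size=(5, 4))
    K = rng.normal(size=(5, 4))
    V = rng.normal(size=(5, 3))
    print(recurrent_linear_attention(Q, K, V).shape)  # -> (5, 3)
```

The asymmetry in cost is the point: restart-incrementality re-encodes the whole prefix at every time step, while the recurrent formulation only updates a constant-size state per token, which is what lets the recurrent LT deliver partial outputs with faster incremental inference.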