Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Existing data-to-text generation efforts mainly focus on generating a
coherent text from non-linguistic input data, such as tables and
attribute-value pairs, but overlook that different application scenarios may
require texts of different styles. Inspired by this, we define a new task,
namely stylized data-to-text generation, whose aim is to generate coherent text
for the given non-linguistic data according to a specific style. This task is
non-trivial, due to three challenges: the logic of the generated text,
unstructured style reference, and biased training samples. To address these
challenges, we propose a novel stylized data-to-text generation model, named
StyleD2T, comprising three components: logic planning-enhanced data embedding,
mask-based style embedding, and unbiased stylized text generation. In the first
component, we introduce a graph-guided logic planner for attribute organization
to ensure the logic of generated text. In the second component, we devise
feature-level mask-based style embedding to extract the essential style signal
from the given unstructured style reference. In the third component, pseudo-triplet
augmentation is used to achieve unbiased text generation, and a
multi-condition-based confidence assignment function is designed to ensure the
quality of pseudo samples. Extensive experiments on a newly collected dataset
from Taobao have been conducted, and the results show the superiority of our
model over existing methods. |
---|---|
DOI: | 10.48550/arxiv.2305.03256 |