Transfer Learning with Synthetic Corpora for Spatial Role Labeling and Reasoning
Recent research shows synthetic data as a source of supervision helps pretrained language models (PLM) transfer learning to new target tasks/domains. However, this idea is less explored for spatial language. We provide two new data resources on multiple spatial language processing tasks. The first d...
Gespeichert in:
Hauptverfasser: | , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Recent research shows synthetic data as a source of supervision helps
pretrained language models (PLM) transfer learning to new target tasks/domains.
However, this idea is less explored for spatial language. We provide two new
data resources on multiple spatial language processing tasks. The first dataset
is synthesized for transfer learning on spatial question answering (SQA) and
spatial role labeling (SpRL). Compared to previous SQA datasets, we include a
larger variety of spatial relation types and spatial expressions. Our data
generation process is easily extendable with new spatial expression lexicons.
The second one is a real-world SQA dataset with human-generated questions built
on an existing corpus with SPRL annotations. This dataset can be used to
evaluate spatial language processing models in realistic situations. We show
pretraining with automatically generated data significantly improves the SOTA
results on several SQA and SPRL benchmarks, particularly when the training data
in the target domain is small. |
---|---|
DOI: | 10.48550/arxiv.2210.16952 |