Generalization Bounds of Multitask Learning From Perspective of Vector-Valued Function Learning
Published in: | IEEE Transactions on Neural Networks and Learning Systems, 2021-05, Vol. 32 (5), p. 1906-1919 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Abstract: | In this article, we study the generalization performance of multitask learning (MTL) by treating MTL as a learning process of vector-valued functions (VFs). We answer two theoretical questions, given a small training sample: 1) under what conditions does MTL perform better than single-task learning (STL)? and 2) under what conditions does MTL guarantee the consistency of all tasks during learning? In contrast to conventional task-summation-based MTL, the VF form enables us to examine the behavior of each task and the task-group relatedness in MTL. Specifically, task-group relatedness describes how the success (or failure) of some tasks affects the performance of the other tasks. By deriving specific deviation and symmetrization inequalities for VFs, we obtain a generalization bound for MTL in the form of an upper bound on the joint probability that at least one task has a large generalization gap. To answer the first question, we discuss how the synergic relatedness between task groups affects the generalization performance of MTL and show that MTL outperforms STL if almost any pair of complementary task groups is predominantly synergic. To answer the second question, we present a sufficient condition that guarantees the consistency of each task in MTL: the function class of each task should not have high complexity. In addition, our findings provide a strategy for examining whether a given task setting will enjoy the advantages of MTL. |
---|---|
ISSN: | 2162-237X, 2162-2388 |
DOI: | 10.1109/TNNLS.2020.2995428 |