Generalization Bounds of Multitask Learning From Perspective of Vector-Valued Function Learning

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2021-05, Vol. 32 (5), pp. 1906-1919
Authors: Zhang, Chao; Tao, Dacheng; Hu, Tao; Liu, Bingchen
Format: Article
Language: English
Description
Abstract: In this article, we study the generalization performance of multitask learning (MTL) by considering MTL as a learning process of vector-valued functions (VFs). We answer two theoretical questions, given a training sample of small size: 1) under what conditions does MTL perform better than single-task learning (STL)? and 2) under what conditions does MTL guarantee the consistency of all tasks during learning? In contrast to conventional task-summation-based MTL, the VF formulation enables us to track the behavior of each task and the task-group relatedness in MTL. Specifically, the task-group relatedness captures how the success (or failure) of some tasks affects the performance of the other tasks. By deriving specific deviation and symmetrization inequalities for VFs, we obtain a generalization bound for MTL that upper-bounds the joint probability that at least one task has a large generalization gap. To answer the first question, we discuss how the synergic relatedness between task groups affects the generalization performance of MTL and show that MTL outperforms STL if almost every pair of complementary task groups is predominantly synergic. Moreover, to answer the second question, we present a sufficient condition that guarantees the consistency of each task in MTL; it requires that the function class of each task not have high complexity. In addition, our findings provide a strategy for examining whether a given task setting will enjoy the advantages of MTL.
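To make the bounded quantity concrete, the following display is a minimal sketch in notation chosen here for illustration only (T tasks, per-task hypothesis class \mathcal{F}_t, expected risk R_t, empirical risk \widehat{R}_t, gap threshold \epsilon, per-task sample size n; these symbols are not taken from the paper). The generalization bound described in the abstract controls a joint probability of roughly this form:

\[
% Illustrative only: B(\cdot) stands in for the paper's complexity-dependent
% bound, which is not reproduced here.
\Pr\Bigl[\, \exists\, t \in \{1,\dots,T\} :\;
  \sup_{f_t \in \mathcal{F}_t} \bigl| R_t(f_t) - \widehat{R}_t(f_t) \bigr| > \epsilon \Bigr]
  \;\le\; B(\mathcal{F}_1,\dots,\mathcal{F}_T,\, n,\, \epsilon),
\]

where B(\cdot) is a complexity-dependent term that decreases with n. On this reading, the abstract's claim is that MTL is advantageous when this joint bound is tighter than the corresponding union of single-task bounds, which is where the synergic relatedness between complementary task groups enters.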
ISSN: 2162-237X, 2162-2388
DOI: 10.1109/TNNLS.2020.2995428