Query based hybrid learning models for adaptively adjusting locality

Bibliographic details
Main authors: Yuanchun Zhu, Guyue Mi, Ying Tan
Format: Conference proceedings
Language: English
Description
Summary: Local learning uses locality-adjusting mechanisms to estimate a local function for each query, while global learning tries to capture the global distribution characteristics of the entire training set. When the locality parameter fits the local characteristics of each individual region well, it may help local learning to improve performance. However, the true data distribution of a real-world problem cannot be obtained, so an optimal locality is hard to determine for each query. In addition, building an independent local model for each query is quite time-consuming. To solve these problems, we present strategies for estimating and tuning locality according to the local distribution. Based on local distribution estimation, global learning and local learning are combined to achieve a good compromise between capacity and locality. Multi-objective learning principles for the combination are also given. In the implementation, a single global model is first built on the entire training set based on the empirical minimization principle. For each query, we measure whether the global model fits the vicinity of the query well. When an uneven local distribution is found, the locality of the model is tuned and a specific local model is built on the local region. To investigate the performance of the hybrid models, we apply them to a typical learning problem, spam filtering, in which data are always found to be unevenly distributed. Experiments were conducted on five real-world corpora: PU1, PU2, PU3, PUA, and TREC07. The results show that the hybrid models achieve a better compromise between capacity and locality and outperform both global learning and local learning.
ISSN: 2161-4393, 2161-4407
DOI: 10.1109/IJCNN.2012.6252422
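
The query-time procedure outlined in the summary (build one global model, check how well it fits the vicinity of each query, and fall back to a local model when it does not) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the nearest-centroid global model, the k-nearest-neighbour vicinity, the majority-vote local model, and the names `centroid_fit`, `centroid_predict`, `hybrid_predict`, `k`, and `min_local_accuracy` are all hypothetical choices made for this sketch.

```python
from collections import Counter
import math


def centroid_fit(X, y):
    """Assumed global model: one centroid per class over the whole training set."""
    counts = Counter(y)
    sums = {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def centroid_predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


def hybrid_predict(X, y, centroids, query, k=3, min_local_accuracy=0.8):
    """Return (label, which_model) for one query.

    Assumption: the vicinity of the query is its k nearest training points,
    and the global model "fits" it if it classifies enough of them correctly.
    """
    nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], query))[:k]
    hits = sum(centroid_predict(centroids, X[i]) == y[i] for i in nearest)
    if hits / len(nearest) >= min_local_accuracy:
        return centroid_predict(centroids, query), "global"
    # Global model fits the vicinity poorly: use a local majority vote instead.
    votes = Counter(y[i] for i in nearest)
    return votes.most_common(1)[0][0], "local"
```

In this sketch the local model is only built for queries whose neighbourhood the global model misclassifies, which reflects the summary's point that building a separate local model for every query is costly.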