Effective, Efficient, and Robust Learning Algorithms for Ranking and Classification

Detailed description

Bibliographic details
Published in:	SIGIR forum 2024-08, Vol.58 (1), p.1-2
Main author:	Marcuzzi, Federico
Format:	Article
Language:	English
Online access:	Full text

Description
Summary:	Over the past decade, machine learning has gained significant traction and is now deployed across diverse domains, including information systems, finance, healthcare, cybersecurity, and autonomous driving. As machine learning finds applications in sensitive scenarios, the demand for models that remain accurate and robust during the operational phase has grown sharply. The quality of a machine learning model is profoundly shaped by the training data it relies upon and by the input data it encounters at the operational phase. The development of data-aware algorithms is therefore of paramount importance for achieving high-quality machine learning models. This thesis contributes to this objective by developing data-aware algorithms, emphasizing the importance of this awareness during both the training and operational phases of machine learning models. The research presented in this thesis focuses on two primary domains. The first domain is information retrieval, with a particular emphasis on enhancing both the efficiency of learning-to-rank algorithms and the effectiveness of the learned models in solving ranking tasks. The thesis includes three works in this domain: Marcuzzi et al. [2022] provides a novel algorithm to detect and remove consistent outlier documents from the training data. In Marcuzzi et al. [2023], we designed a new learning algorithm that handles the problem of gradient incoherencies affecting LambdaRank-based algorithms. Finally, in Lucchese et al. [2023], we designed a new sampling function for the Selective Gradient Boosting algorithm to exploit the most useful low-ranked non-relevant documents. The second domain is adversarial machine learning, which focuses on increasing the robustness of binary classifiers against adversarial inputs encountered at the operational phase.
Furthermore, the research in this domain focuses on providing certifiable models whose robustness against adversarial machine learning attacks can be assessed efficiently. In this regard, in Calzavara et al. [2021], we designed a novel robust learning algorithm to train ensembles of decision trees that are robust to evasion attacks, along with a polynomial robustness-certification algorithm that computes a robustness lower bound. Finally, in Calzavara et al. [2022], we provided a new evaluation metric named Resilience to better assess the security of machine learning models. Awarded by: Univ
ISSN:	0163-5840
DOI:	10.1145/3687273.3687297