Effective, Efficient, and Robust Learning Algorithms for Ranking and Classification

Over the past decade, machine learning has gained significant traction and is now deployed across diverse domains, including information systems, finance, healthcare, cybersecurity, autonomous driving, and more. As machine learning finds applications in various sensitive scenarios, the demand for mo...

Bibliographic details
Published in: SIGIR forum 2024-08, Vol.58 (1), p.1-2
Main author: Marcuzzi, Federico
Format: Article
Language: eng
Online access: Full text
container_end_page 2
container_issue 1
container_start_page 1
container_title SIGIR forum
container_volume 58
creator Marcuzzi, Federico
description Over the past decade, machine learning has gained significant traction and is now deployed across diverse domains, including information systems, finance, healthcare, cybersecurity, autonomous driving, and more. As machine learning finds applications in various sensitive scenarios, the demand for models that exhibit accuracy and robustness during the operational phase has grown exponentially. One crucial factor that profoundly shapes the quality of machine learning models revolves around the training data they rely upon and the input data encountered at the operational phase. Therefore, the development of data-aware algorithms is of paramount importance in achieving high-quality machine-learning models. This thesis contributes to this overarching objective by delving into the development of data-aware algorithms, emphasizing the importance of this awareness during both the training and operational phases of machine learning models. The research presented in this thesis focuses on two primary domains. The first domain is information retrieval, with a particular emphasis on enhancing both the efficiency of learning-to-rank algorithms and the effectiveness of the learned models in solving ranking tasks. The thesis includes three works in this domain: Marcuzzi et al. [2022] provides a novel algorithm to detect and remove consistent-outlier documents from the training data. In Marcuzzi et al. [2023], we designed a new learning algorithm that handles the problem of gradient incoherencies affecting LambdaRank-based algorithms. Finally, in Lucchese et al. [2023], we designed a new sampling function for the Selective Gradient Boosting algorithm to exploit the most useful low-ranked non-relevant documents. The second domain is adversarial machine learning, which focuses on increasing the robustness of binary classifiers against adversarial inputs encountered at the operational phase. Furthermore, the research in this domain focuses on providing certifiable models to efficiently assess robustness against adversarial machine learning attacks. In this regard, in Calzavara et al. [2021], we designed a novel robust learning algorithm to train ensembles of decision trees robust to evasion attacks, along with its polynomial robustness-certification algorithm designed to compute a robustness lower bound. Finally, in Calzavara et al. [2022], we provided a new evaluation metric named Resilience to better assess the security of machine learning models. Awarded by: Univ
doi_str_mv 10.1145/3687273.3687297
format Article
fullrecord <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3687273_3687297</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3687297</sourcerecordid><originalsourceid>FETCH-LOGICAL-a597-a1803b35f7d08ebf755588948d491e362b011029b878790976f37376cd301a873</originalsourceid><addsrcrecordid>eNo9kMtOwzAQRb0AiVJYI7HyBzTtTBzH9rKKCkWKhFS6j5zELoY8kB2Q-HsSGljd0cw9sziE3CGsERO-YakUsWDr31TigiwAUxZxmcAVuQ7hDQAlcrUgLztrTTW4L7Oi4-gqZ7phRXVX00NffoaB5kb7znUnum1OvXfDaxuo7T096O59Wk_VrNEhuJHWg-u7G3JpdRPM7ZxLcnzYHbN9lD8_PmXbPNJciUijBFYybkUN0pRWcM6lVImsE4WGpXEJiBCrUgopFCiRWiaYSKuaAWop2JJszm8r34fgjS0-vGu1_y4QiklDMWsoZg0jcX8mdNX-l_-OP3nzWKo</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Effective, Efficient, and Robust Learning Algorithms for Ranking and Classification</title><source>ACM Digital Library Complete</source><creator>Marcuzzi, Federico</creator><creatorcontrib>Marcuzzi, Federico</creatorcontrib><description>Over the past decade, machine learning has gained significant traction and is now deployed across diverse domains, including information systems, finance, healthcare, cybersecurity, autonomous driving, and more. As machine learning finds applications in various sensitive scenarios, the demand for models that exhibit accuracy and robustness during the operational phase has grown exponentially. One crucial factor that profoundly shapes the quality of machine learning models revolves around the training data they rely upon and the input data encountered at the operational phase. Therefore, the development of data-aware algorithms is of paramount importance in achieving high-quality machine-learning models. 
This thesis contributes to this overarching objective by delving into the development of data-aware algorithms, emphasizing the importance of this awareness during both the training and operational phases of machine learning models. The research presented in this thesis focuses on two primary domains. The first domain is information retrieval, with a particular emphasis on enhancing both the efficiency of learning-to-rank algorithms and the effectiveness of the learned models in solving ranking tasks. The thesis includes three works in this domain: Marcuzzi et al. [2022] provides a novel algorithm to detect and remove consistent-outlier documents from the training data. In Marcuzzi et al. [2023], we designed a new learning algorithm that handles the problem of gradient incoherencies affecting LambdaRank-based algorithms. Finally, in Lucchese et al. [2023], we designed a new sampling function for the Selective Gradient Boosting algorithm to exploit the most useful low-ranked non-relevant documents. The second domain is adversarial machine learning, which focuses on increasing the robustness of binary classifiers against adversarial inputs encountered at the operational phase. Furthermore, the research in this domain focuses on providing certifiable models to efficiently assess robustness against adversarial machine learning attacks. In this regard, in Calzavara et al. [2021], we designed a novel robust learning algorithm to train ensembles of decision trees robust to evasion attacks, along with its polynomial robustness-certification algorithm designed to compute a robustness lower bound. Finally, in Calzavara et al. [2022], we provided a new evaluation metric named Resilience to better assess the security of machine learning models. Awarded by: Università Ca' Foscari di Venezia, Venice, Italy on 19 April 2024. Supervised by: Claudio Lucchese.
Available at: https://federicomarcuzzi.github.io/resources/thesis_phd.pdf.</description><identifier>ISSN: 0163-5840</identifier><identifier>DOI: 10.1145/3687273.3687297</identifier><language>eng</language><publisher>New York, NY, USA: ACM</publisher><ispartof>SIGIR forum, 2024-08, Vol.58 (1), p.1-2</ispartof><rights>Copyright is held by the owner/author(s)</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3687273.3687297$$EPDF$$P50$$Gacm$$H</linktopdf><link.rule.ids>314,780,784,2282,27924,27925,40196,76100</link.rule.ids></links><search><creatorcontrib>Marcuzzi, Federico</creatorcontrib><title>Effective, Efficient, and Robust Learning Algorithms for Ranking and Classification</title><title>SIGIR forum</title><addtitle>ACM SIGIR</addtitle><description>Over the past decade, machine learning has gained significant traction and is now deployed across diverse domains, including information systems, finance, healthcare, cybersecurity, autonomous driving, and more. As machine learning finds applications in various sensitive scenarios, the demand for models that exhibit accuracy and robustness during the operational phase has grown exponentially. One crucial factor that profoundly shapes the quality of machine learning models revolves around the training data they rely upon and the input data encountered at the operational phase. Therefore, the development of data-aware algorithms is of paramount importance in achieving high-quality machine-learning models. This thesis contributes to this overarching objective by delving into the development of data-aware algorithms, emphasizing the importance of this awareness during both the training and operational phases of machine learning models. 
The research presented in this thesis focuses on two primary domains. The first domain is information retrieval, with a particular emphasis on enhancing both the efficiency of learning-to-rank algorithms and the effectiveness of the learned models in solving ranking tasks. The thesis includes three works in this domain: Marcuzzi et al. [2022] provides a novel algorithm to detect and remove consistent-outlier documents from the training data. In Marcuzzi et al. [2023], we designed a new learning algorithm that handles the problem of gradient incoherencies affecting LambdaRank-based algorithms. Finally, in Lucchese et al. [2023], we designed a new sampling function for the Selective Gradient Boosting algorithm to exploit the most useful low-ranked non-relevant documents. The second domain is adversarial machine learning, which focuses on increasing the robustness of binary classifiers against adversarial inputs encountered at the operational phase. Furthermore, the research in this domain focuses on providing certifiable models to efficiently assess robustness against adversarial machine learning attacks. In this regard, in Calzavara et al. [2021], we designed a novel robust learning algorithm to train ensembles of decision trees robust to evasion attacks, along with its polynomial robustness-certification algorithm designed to compute a robustness lower bound. Finally, in Calzavara et al. [2022], we provided a new evaluation metric named Resilience to better assess the security of machine learning models. Awarded by: Università Ca' Foscari di Venezia, Venice, Italy on 19 April 2024. Supervised by: Claudio Lucchese.
Available at: https://federicomarcuzzi.github.io/resources/thesis_phd.pdf.</description><issn>0163-5840</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo9kMtOwzAQRb0AiVJYI7HyBzTtTBzH9rKKCkWKhFS6j5zELoY8kB2Q-HsSGljd0cw9sziE3CGsERO-YakUsWDr31TigiwAUxZxmcAVuQ7hDQAlcrUgLztrTTW4L7Oi4-gqZ7phRXVX00NffoaB5kb7znUnum1OvXfDaxuo7T096O59Wk_VrNEhuJHWg-u7G3JpdRPM7ZxLcnzYHbN9lD8_PmXbPNJciUijBFYybkUN0pRWcM6lVImsE4WGpXEJiBCrUgopFCiRWiaYSKuaAWop2JJszm8r34fgjS0-vGu1_y4QiklDMWsoZg0jcX8mdNX-l_-OP3nzWKo</recordid><startdate>20240807</startdate><enddate>20240807</enddate><creator>Marcuzzi, Federico</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20240807</creationdate><title>Effective, Efficient, and Robust Learning Algorithms for Ranking and Classification</title><author>Marcuzzi, Federico</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a597-a1803b35f7d08ebf755588948d491e362b011029b878790976f37376cd301a873</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><toplevel>online_resources</toplevel><creatorcontrib>Marcuzzi, Federico</creatorcontrib><collection>CrossRef</collection><jtitle>SIGIR forum</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Marcuzzi, Federico</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Effective, Efficient, and Robust Learning Algorithms for Ranking and Classification</atitle><jtitle>SIGIR forum</jtitle><stitle>ACM SIGIR</stitle><date>2024-08-07</date><risdate>2024</risdate><volume>58</volume><issue>1</issue><spage>1</spage><epage>2</epage><pages>1-2</pages><issn>0163-5840</issn><abstract>Over the past decade, machine learning has gained significant traction and is now deployed across diverse domains, 
including information systems, finance, healthcare, cybersecurity, autonomous driving, and more. As machine learning finds applications in various sensitive scenarios, the demand for models that exhibit accuracy and robustness during the operational phase has grown exponentially. One crucial factor that profoundly shapes the quality of machine learning models revolves around the training data they rely upon and the input data encountered at the operational phase. Therefore, the development of data-aware algorithms is of paramount importance in achieving high-quality machine-learning models. This thesis contributes to this overarching objective by delving into the development of data-aware algorithms, emphasizing the importance of this awareness during both the training and operational phases of machine learning models. The research presented in this thesis focuses on two primary domains. The first domain is information retrieval, with a particular emphasis on enhancing both the efficiency of learning-to-rank algorithms and the effectiveness of the learned models in solving ranking tasks. The thesis includes three works in this domain: Marcuzzi et al. [2022] provides a novel algorithm to detect and remove consistent-outlier documents from the training data. In Marcuzzi et al. [2023], we designed a new learning algorithm that handles the problem of gradient incoherencies affecting LambdaRank-based algorithms. Finally, in Lucchese et al. [2023], we designed a new sampling function for the Selective Gradient Boosting algorithm to exploit the most useful low-ranked non-relevant documents. The second domain is adversarial machine learning, which focuses on increasing the robustness of binary classifiers against adversarial inputs encountered at the operational phase. Furthermore, the research in this domain focuses on providing certifiable models to efficiently assess robustness against adversarial machine learning attacks. In this regard, in Calzavara et al.
[2021], we designed a novel robust learning algorithm to train ensembles of decision trees robust to evasion attacks, along with its polynomial robustness-certification algorithm designed to compute a robustness lower bound. Finally, in Calzavara et al. [2022], we provided a new evaluation metric named Resilience to better assess the security of machine learning models. Awarded by: Università Ca' Foscari di Venezia, Venice, Italy on 19 April 2024. Supervised by: Claudio Lucchese. Available at: https://federicomarcuzzi.github.io/resources/thesis_phd.pdf.</abstract><cop>New York, NY, USA</cop><pub>ACM</pub><doi>10.1145/3687273.3687297</doi><tpages>2</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0163-5840
ispartof SIGIR forum, 2024-08, Vol.58 (1), p.1-2
issn 0163-5840
language eng
recordid cdi_crossref_primary_10_1145_3687273_3687297
source ACM Digital Library Complete
title Effective, Efficient, and Robust Learning Algorithms for Ranking and Classification
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T21%3A14%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Effective,%20Efficient,%20and%20Robust%20Learning%20Algorithms%20for%20Ranking%20and%20Classification&rft.jtitle=SIGIR%20forum&rft.au=Marcuzzi,%20Federico&rft.date=2024-08-07&rft.volume=58&rft.issue=1&rft.spage=1&rft.epage=2&rft.pages=1-2&rft.issn=0163-5840&rft_id=info:doi/10.1145/3687273.3687297&rft_dat=%3Cacm_cross%3E3687297%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true