A Federated Distributionally Robust Support Vector Machine with Mixture of Wasserstein Balls Ambiguity Set for Distributed Fault Diagnosis


Detailed description

Bibliographic details
Main authors: Ibrahim, Michael, Rozas, Heraldo, Gebraeel, Nagi, Xie, Weijun
Format: Article
Language: eng
Subjects:
Online access: Order full text
description The training of classification models for fault diagnosis tasks using geographically dispersed data is a crucial task for original equipment manufacturers (OEMs) seeking to provide long-term service contracts (LTSCs) to their customers. Due to privacy and bandwidth constraints, such models must be trained in a federated fashion. Moreover, due to harsh industrial settings, the data often suffer from feature and label uncertainty. Therefore, we study the problem of training a distributionally robust (DR) support vector machine (SVM) in a federated fashion over a network comprising a central server and $G$ clients, without sharing data. We consider the setting where the local data of each client $g$ is sampled from a unique true distribution $\mathbb{P}_g$, and the clients can only communicate with the central server. We propose a novel Mixture of Wasserstein Balls (MoWB) ambiguity set that relies on local Wasserstein balls centered at the empirical distribution of the data at each client. We study theoretical aspects of the proposed ambiguity set, deriving its out-of-sample performance guarantees and demonstrating that it naturally allows for the separability of the DR problem. Subsequently, we propose two distributed optimization algorithms for training the global federated DR-SVM (FDR-SVM): i) a subgradient method-based algorithm, and ii) an alternating direction method of multipliers (ADMM)-based algorithm. We derive the optimization problems to be solved by each client and provide closed-form expressions for the computations performed by the central server during each iteration of both algorithms. Finally, we thoroughly examine the performance of the proposed algorithms in a series of numerical experiments utilizing both simulation data and popular real-world datasets.
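The abstract mentions a subgradient method-based algorithm in which clients share only local updates with the central server. As a rough illustration of that communication pattern only, here is a minimal sketch of a federated subgradient step for a plain (non-robust) linear SVM. This is not the paper's FDR-SVM algorithm: the MoWB ambiguity set, the robust reformulation, and the server's closed-form updates are omitted, and all function and variable names here are hypothetical.

```python
# Sketch: federated subgradient training of a linear SVM.
# Clients compute subgradients of the regularized hinge loss on
# their own data; the server averages them. No raw data is shared.
import numpy as np

def local_subgradient(w, X, y, lam):
    """Subgradient of (1/n) * sum(hinge) + (lam/2)*||w||^2 on one client."""
    margins = y * (X @ w)
    mask = margins < 1.0                      # points violating the margin
    g = np.zeros_like(w)
    if mask.any():
        g = -(y[mask, None] * X[mask]).sum(axis=0) / len(y)
    return g + lam * w

def federated_subgradient(clients, dim, lam=0.1, steps=200):
    """Server loop: average client subgradients, take a diminishing step."""
    w = np.zeros(dim)
    for t in range(1, steps + 1):
        grads = [local_subgradient(w, X, y, lam) for X, y in clients]
        w = w - (1.0 / t) * np.mean(grads, axis=0)
    return w
```

In the paper's setting each client would instead solve a local problem derived from its own Wasserstein ball; the sketch above only shows the generic client-compute / server-aggregate structure.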
doi_str_mv 10.48550/arxiv.2410.03877
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2410.03877
language eng
recordid cdi_arxiv_primary_2410_03877
source arXiv.org
subjects Computer Science - Learning
Statistics - Machine Learning
title A Federated Distributionally Robust Support Vector Machine with Mixture of Wasserstein Balls Ambiguity Set for Distributed Fault Diagnosis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T07%3A08%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Federated%20Distributionally%20Robust%20Support%20Vector%20Machine%20with%20Mixture%20of%20Wasserstein%20Balls%20Ambiguity%20Set%20for%20Distributed%20Fault%20Diagnosis&rft.au=Ibrahim,%20Michael&rft.date=2024-10-04&rft_id=info:doi/10.48550/arxiv.2410.03877&rft_dat=%3Carxiv_GOX%3E2410_03877%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true