Obey validity limits of data-driven models
Data-driven models are becoming increasingly popular in engineering, on their own or in combination with mechanistic models. Commonly, the trained models are subsequently used in model-based optimization of design and/or operation of processes. Thus, it is critical to ensure that data-driven models...
Published in: | arXiv.org 2020-10 |
---|---|
Main authors: | Schweidtmann, Artur M; Weber, Jana M; Wende, Christian; Netze, Linus; Mitsos, Alexander |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Schweidtmann, Artur M; Weber, Jana M; Wende, Christian; Netze, Linus; Mitsos, Alexander |
description | Data-driven models are becoming increasingly popular in engineering, on their own or in combination with mechanistic models. Commonly, the trained models are subsequently used in model-based optimization of design and/or operation of processes. Thus, it is critical to ensure that data-driven models are not evaluated outside their validity domain during process optimization. We propose a method to learn this validity domain and encode it as constraints in process optimization. We first perform a topological data analysis using persistent homology identifying potential holes or separated clusters in the training data. In case clusters or holes are identified, we train a one-class classifier, i.e., a one-class support vector machine, on the training data domain and encode it as constraints in the subsequent process optimization. Otherwise, we construct the convex hull of the data and encode it as constraints. We finally perform deterministic global process optimization with the data-driven models subject to their respective validity constraints. To ensure computational tractability, we develop a reduced-space formulation for trained one-class support vector machines and show that our formulation outperforms common full-space formulations by a factor of over 3,000, making it a viable tool for engineering applications. The method is ready-to-use and available open-source as part of our MeLOn toolbox (https://git.rwth-aachen.de/avt.svt/public/MeLOn). |
doi_str_mv | 10.48550/arxiv.2010.03405 |
format | Article |
date | 2020-10-07 |
publisher | Ithaca: Cornell University Library, arXiv.org |
rights | 2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
backlink | https://doi.org/10.48550/arXiv.2010.03405 (View paper in arXiv) |
backlink | https://doi.org/10.1007/s11081-021-09608-0 (View published paper; access to full text may be restricted) |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2020-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_arxiv_primary_2010_03405 |
source | arXiv.org; Free E- Journals |
subjects | Clusters; Computational geometry; Convexity; Data analysis; Design optimization; Domains; Homology; Mathematics - Optimization and Control; Optimization; Support vector machines; Training; Validity |
title | Obey validity limits of data-driven models |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T20%3A44%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Obey%20validity%20limits%20of%20data-driven%20models&rft.jtitle=arXiv.org&rft.au=Schweidtmann,%20Artur%20M&rft.date=2020-10-07&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2010.03405&rft_dat=%3Cproquest_arxiv%3E2449345334%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2449345334&rft_id=info:pmid/&rfr_iscdi=true |
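
The description above says that a trained one-class support vector machine is encoded as a constraint so that the optimizer cannot leave the validity domain of the data-driven model. The following is a minimal sketch of that idea only, assuming the common RBF-kernel decision function f(x) = sum_i alpha_i * exp(-gamma * ||x - s_i||^2) - rho with the constraint f(x) >= 0. The support vectors, coefficients, surrogate objective, and the brute-force grid search are all hypothetical stand-ins chosen for illustration; this is not the MeLOn API, not the authors' reduced-space formulation, and not a deterministic global optimizer.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def ocsvm_decision(x, support_vectors, alphas, rho, gamma):
    """One-class SVM decision value; x counts as inside the domain when >= 0."""
    weighted = sum(a * rbf_kernel(x, sv, gamma)
                   for a, sv in zip(alphas, support_vectors))
    return weighted - rho

# Hypothetical "trained" parameters (assumption): two separated data clusters.
SUPPORT_VECTORS = [(0.0, 0.0), (4.0, 4.0)]
ALPHAS = [0.5, 0.5]
RHO = 0.25
GAMMA = 1.0

def is_valid(x):
    """Validity constraint: decision value must be non-negative."""
    return ocsvm_decision(x, SUPPORT_VECTORS, ALPHAS, RHO, GAMMA) >= 0.0

def surrogate(x):
    """Hypothetical data-driven model whose unconstrained minimum lies
    between the two clusters, i.e. where no training data exists."""
    return (x[0] - 2.0) ** 2 + (x[1] - 2.0) ** 2

def grid(lo, hi, n):
    """n evenly spaced values from lo to hi (inclusive)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def minimize_on_grid(objective, feasible=None, lo=-1.0, hi=5.0, n=61):
    """Brute-force search over a 2D grid; skips points rejected by `feasible`."""
    best_point, best_value = None, float("inf")
    for x1 in grid(lo, hi, n):
        for x2 in grid(lo, hi, n):
            point = (x1, x2)
            if feasible is not None and not feasible(point):
                continue
            value = objective(point)
            if value < best_value:
                best_point, best_value = point, value
    return best_point, best_value

unconstrained_point, unconstrained_value = minimize_on_grid(surrogate)
constrained_point, constrained_value = minimize_on_grid(surrogate, is_valid)

if __name__ == "__main__":
    print("unconstrained:", unconstrained_point, unconstrained_value)
    print("constrained:  ", constrained_point, constrained_value)
```

In this toy setup the unconstrained optimum falls in the gap between the two clusters, where the surrogate has seen no data, while the constrained search is pushed back to the edge of one cluster. A convex hull would not exclude that gap, which is the situation in which the description calls for a one-class classifier instead.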