Pulling back symmetric Riemannian geometry for data analysis

Data sets tend to live in low-dimensional non-linear subspaces. Ideal data analysis tools for such data sets should therefore account for such non-linear geometry. The symmetric Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich mathematical structure t...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2024-03
1. Verfasser: Diepeveen, Willem
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Diepeveen, Willem
description Data sets tend to live in low-dimensional non-linear subspaces. Ideal data analysis tools for such data sets should therefore account for such non-linear geometry. The symmetric Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich mathematical structure to account for a wide range of non-linear geometries that has been shown to be able to capture the data geometry through empirical evidence from classical non-linear embedding. Second, many standard data analysis tools initially developed for data in Euclidean space can also be generalised efficiently to data on a symmetric Riemannian manifold. A conceptual challenge comes from the lack of guidelines for constructing a symmetric Riemannian structure on the data space itself and the lack of guidelines for modifying successful algorithms on symmetric Riemannian manifolds for data analysis to this setting. This work considers these challenges in the setting of pullback Riemannian geometry through a diffeomorphism. The first part of the paper characterises diffeomorphisms that result in proper, stable and efficient data analysis. The second part then uses these best practices to guide construction of such diffeomorphisms through deep learning. As a proof of concept, different types of pullback geometries -- among which the proposed construction -- are tested on several data analysis tasks and on several toy data sets. The numerical experiments confirm the predictions from theory, i.e., that the diffeomorphisms generating the pullback geometry need to map the data manifold into a geodesic subspace of the pulled back Riemannian manifold while preserving local isometry around the data manifold for proper, stable and efficient data analysis, and that pulling back positive curvature can be problematic in terms of stability.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2955958287</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2955958287</sourcerecordid><originalsourceid>FETCH-proquest_journals_29559582873</originalsourceid><addsrcrecordid>eNqNysEKgkAQgOElCJLyHQY6CzbbpkK3KDpGdJfRVllbd2tHD759BT1Apx9-vpmIUMpNkm8RFyJm7tI0xV2GSslI7C-jtca1UFH9AJ76Xg_B1HA1uifnDDlotf_OCRof4E4DATmyExteiXlDlnX861KsT8fb4Zw8g3-Nmoey82P4YC6xUKpQOeaZ_E-9AWxKOBU</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2955958287</pqid></control><display><type>article</type><title>Pulling back symmetric Riemannian geometry for data analysis</title><source>Free E- Journals</source><creator>Diepeveen, Willem</creator><creatorcontrib>Diepeveen, Willem</creatorcontrib><description>Data sets tend to live in low-dimensional non-linear subspaces. Ideal data analysis tools for such data sets should therefore account for such non-linear geometry. The symmetric Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich mathematical structure to account for a wide range of non-linear geometries that has been shown to be able to capture the data geometry through empirical evidence from classical non-linear embedding. Second, many standard data analysis tools initially developed for data in Euclidean space can also be generalised efficiently to data on a symmetric Riemannian manifold. A conceptual challenge comes from the lack of guidelines for constructing a symmetric Riemannian structure on the data space itself and the lack of guidelines for modifying successful algorithms on symmetric Riemannian manifolds for data analysis to this setting. This work considers these challenges in the setting of pullback Riemannian geometry through a diffeomorphism. The first part of the paper characterises diffeomorphisms that result in proper, stable and efficient data analysis. The second part then uses these best practices to guide construction of such diffeomorphisms through deep learning. As a proof of concept, different types of pullback geometries -- among which the proposed construction -- are tested on several data analysis tasks and on several toy data sets. The numerical experiments confirm the predictions from theory, i.e., that the diffeomorphisms generating the pullback geometry need to map the data manifold into a geodesic subspace of the pulled back Riemannian manifold while preserving local isometry around the data manifold for proper, stable and efficient data analysis, and that pulling back positive curvature can be problematic in terms of stability.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Best practice ; Data analysis ; Datasets ; Differential geometry ; Empirical analysis ; Euclidean geometry ; Geometry ; Guidelines ; Isomorphism ; Manifolds (mathematics) ; Riemann manifold ; Standard data ; Subspaces ; Topological manifolds</subject><ispartof>arXiv.org, 2024-03</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Diepeveen, Willem</creatorcontrib><title>Pulling back symmetric Riemannian geometry for data analysis</title><title>arXiv.org</title><description>Data sets tend to live in low-dimensional non-linear subspaces. Ideal data analysis tools for such data sets should therefore account for such non-linear geometry. The symmetric Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich mathematical structure to account for a wide range of non-linear geometries that has been shown to be able to capture the data geometry through empirical evidence from classical non-linear embedding. Second, many standard data analysis tools initially developed for data in Euclidean space can also be generalised efficiently to data on a symmetric Riemannian manifold. A conceptual challenge comes from the lack of guidelines for constructing a symmetric Riemannian structure on the data space itself and the lack of guidelines for modifying successful algorithms on symmetric Riemannian manifolds for data analysis to this setting. This work considers these challenges in the setting of pullback Riemannian geometry through a diffeomorphism. The first part of the paper characterises diffeomorphisms that result in proper, stable and efficient data analysis. The second part then uses these best practices to guide construction of such diffeomorphisms through deep learning. As a proof of concept, different types of pullback geometries -- among which the proposed construction -- are tested on several data analysis tasks and on several toy data sets. The numerical experiments confirm the predictions from theory, i.e., that the diffeomorphisms generating the pullback geometry need to map the data manifold into a geodesic subspace of the pulled back Riemannian manifold while preserving local isometry around the data manifold for proper, stable and efficient data analysis, and that pulling back positive curvature can be problematic in terms of stability.</description><subject>Algorithms</subject><subject>Best practice</subject><subject>Data analysis</subject><subject>Datasets</subject><subject>Differential geometry</subject><subject>Empirical analysis</subject><subject>Euclidean geometry</subject><subject>Geometry</subject><subject>Guidelines</subject><subject>Isomorphism</subject><subject>Manifolds (mathematics)</subject><subject>Riemann manifold</subject><subject>Standard data</subject><subject>Subspaces</subject><subject>Topological manifolds</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNqNysEKgkAQgOElCJLyHQY6CzbbpkK3KDpGdJfRVllbd2tHD759BT1Apx9-vpmIUMpNkm8RFyJm7tI0xV2GSslI7C-jtca1UFH9AJ76Xg_B1HA1uifnDDlotf_OCRof4E4DATmyExteiXlDlnX861KsT8fb4Zw8g3-Nmoey82P4YC6xUKpQOeaZ_E-9AWxKOBU</recordid><startdate>20240311</startdate><enddate>20240311</enddate><creator>Diepeveen, Willem</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240311</creationdate><title>Pulling back symmetric Riemannian geometry for data analysis</title><author>Diepeveen, Willem</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_29559582873</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Best practice</topic><topic>Data analysis</topic><topic>Datasets</topic><topic>Differential geometry</topic><topic>Empirical analysis</topic><topic>Euclidean geometry</topic><topic>Geometry</topic><topic>Guidelines</topic><topic>Isomorphism</topic><topic>Manifolds (mathematics)</topic><topic>Riemann manifold</topic><topic>Standard data</topic><topic>Subspaces</topic><topic>Topological manifolds</topic><toplevel>online_resources</toplevel><creatorcontrib>Diepeveen, Willem</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Diepeveen, Willem</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Pulling back symmetric Riemannian geometry for data analysis</atitle><jtitle>arXiv.org</jtitle><date>2024-03-11</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Data sets tend to live in low-dimensional non-linear subspaces. Ideal data analysis tools for such data sets should therefore account for such non-linear geometry. The symmetric Riemannian geometry setting can be suitable for a variety of reasons. First, it comes with a rich mathematical structure to account for a wide range of non-linear geometries that has been shown to be able to capture the data geometry through empirical evidence from classical non-linear embedding. Second, many standard data analysis tools initially developed for data in Euclidean space can also be generalised efficiently to data on a symmetric Riemannian manifold. A conceptual challenge comes from the lack of guidelines for constructing a symmetric Riemannian structure on the data space itself and the lack of guidelines for modifying successful algorithms on symmetric Riemannian manifolds for data analysis to this setting. This work considers these challenges in the setting of pullback Riemannian geometry through a diffeomorphism. The first part of the paper characterises diffeomorphisms that result in proper, stable and efficient data analysis. The second part then uses these best practices to guide construction of such diffeomorphisms through deep learning. As a proof of concept, different types of pullback geometries -- among which the proposed construction -- are tested on several data analysis tasks and on several toy data sets. The numerical experiments confirm the predictions from theory, i.e., that the diffeomorphisms generating the pullback geometry need to map the data manifold into a geodesic subspace of the pulled back Riemannian manifold while preserving local isometry around the data manifold for proper, stable and efficient data analysis, and that pulling back positive curvature can be problematic in terms of stability.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-03
issn 2331-8422
language eng
recordid cdi_proquest_journals_2955958287
source Free E- Journals
subjects Algorithms
Best practice
Data analysis
Datasets
Differential geometry
Empirical analysis
Euclidean geometry
Geometry
Guidelines
Isomorphism
Manifolds (mathematics)
Riemann manifold
Standard data
Subspaces
Topological manifolds
title Pulling back symmetric Riemannian geometry for data analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T02%3A26%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Pulling%20back%20symmetric%20Riemannian%20geometry%20for%20data%20analysis&rft.jtitle=arXiv.org&rft.au=Diepeveen,%20Willem&rft.date=2024-03-11&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2955958287%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2955958287&rft_id=info:pmid/&rfr_iscdi=true