Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion

Our world is not static and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (metaverses) and robotics....

Bibliographic Details
Main authors:	Guzov, Vladimir; Chibane, Julian; Marin, Riccardo; He, Yannan; Saracoglu, Yunus; Sattler, Torsten; Pons-Moll, Gerard
Format:	Article
Language:	eng
Subjects:	Computer Science - Computer Vision and Pattern Recognition

creator	Guzov, Vladimir; Chibane, Julian; Marin, Riccardo; He, Yannan; Saracoglu, Yunus; Sattler, Torsten; Pons-Moll, Gerard
description	Our world is not static, and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (metaverses) and robotics. For such emerging applications to be widely adopted, the sensor setup used to capture the interactions needs to be inexpensive and easy to use for non-expert users. That is, interactions should be captured and modeled by simple egocentric sensors, such as a combination of cameras and IMU sensors, without relying on any external cameras or object trackers. Yet, to the best of our knowledge, no existing work tackles the challenging problem of modeling human-scene interactions with such an egocentric sensor setup. This paper closes this gap in the literature by developing a novel approach that combines visual localization of humans in the scene with contact-based reasoning about human-scene interactions from IMU data. Interestingly, we show that even without visual observations of the interactions, human-scene contacts and interactions can be realistically predicted from human pose sequences. Our method, iReplica (Interaction Replica), is an essential first step towards the egocentric capture of human interactions and the modeling of dynamic scenes, which is required for future AR/VR applications in immersive virtual universes and for training machines to behave like humans. Our code, data, and model are available on our project page at http://virtualhumans.mpi-inf.mpg.de/ireplica/
doi_str_mv	10.48550/arxiv.2205.02830
format	Article
date	2022-05-05
rights	http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa	free_for_read
collection	arXiv Computer Science; arXiv.org
linktorsrc	https://arxiv.org/abs/2205.02830
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2205.02830
language	eng
recordid	cdi_arxiv_primary_2205_02830
source	arXiv.org
subjects	Computer Science - Computer Vision and Pattern Recognition
title	Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T21%3A27%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Interaction%20Replica:%20Tracking%20Human-Object%20Interaction%20and%20Scene%20Changes%20From%20Human%20Motion&rft.au=Guzov,%20Vladimir&rft.date=2022-05-05&rft_id=info:doi/10.48550/arxiv.2205.02830&rft_dat=%3Carxiv_GOX%3E2205_02830%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true