Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). 740 participants from 13 cities worldwide p...
Format: | Article |
---|---|
Language: | eng |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Grauman, Kristen ; Westbury, Andrew ; Torresani, Lorenzo ; Kitani, Kris ; Malik, Jitendra ; Afouras, Triantafyllos ; Ashutosh, Kumar ; Baiyya, Vijay ; Bansal, Siddhant ; Boote, Bikram ; Byrne, Eugene ; Chavis, Zach ; Chen, Joya ; Cheng, Feng ; Chu, Fu-Jen ; Crane, Sean ; Dasgupta, Avijit ; Dong, Jing ; Escobar, Maria ; Forigua, Cristhian ; Gebreselasie, Abrham ; Haresh, Sanjay ; Huang, Jing ; Islam, Md Mohaiminul ; Jain, Suyog ; Khirodkar, Rawal ; Kukreja, Devansh ; Liang, Kevin J ; Liu, Jia-Wei ; Majumder, Sagnik ; Mao, Yongsen ; Martin, Miguel ; Mavroudi, Effrosyni ; Nagarajan, Tushar ; Ragusa, Francesco ; Ramakrishnan, Santhosh Kumar ; Seminara, Luigi ; Somayazulu, Arjun ; Song, Yale ; Su, Shan ; Xue, Zihui ; Zhang, Edward ; Zhang, Jinxu ; Castillo, Angela ; Chen, Changan ; Fu, Xinzhu ; Furuta, Ryosuke ; Gonzalez, Cristina ; Gupta, Prince ; Hu, Jiabo ; Huang, Yifei ; Huang, Yiming ; Khoo, Weslie ; Kumar, Anush ; Kuo, Robert ; Lakhavani, Sach ; Liu, Miao ; Luo, Mi ; Luo, Zhengyi ; Meredith, Brighid ; Miller, Austin ; Oguntola, Oluwatumininu ; Pan, Xiaqing ; Peng, Penny ; Pramanick, Shraman ; Ramazanova, Merey ; Ryan, Fiona ; Shan, Wei ; Somasundaram, Kiran ; Song, Chenan ; Southerland, Audrey ; Tateno, Masatoshi ; Wang, Huiyu ; Wang, Yuchen ; Yagi, Takuma ; Yan, Mingfei ; Yang, Xitong ; Yu, Zecheng ; Zha, Shengxin Cindy ; Zhao, Chen ; Zhao, Ziwei ; Zhu, Zhifan ; Zhuo, Jeff ; Arbelaez, Pablo ; Bertasius, Gedas ; Crandall, David ; Damen, Dima ; Engel, Jakob ; Farinella, Giovanni Maria ; Furnari, Antonino ; Ghanem, Bernard ; Hoffman, Judy ; Jawahar, C. V ; Newcombe, Richard ; Park, Hyun Soo ; Sato, Yoichi ; Savva, Manolis ; Shi, Jianbo ; Shou, Mike Zheng ; Wray, Michael |
description | We present Ego-Exo4D, a diverse, large-scale multimodal multiview video
dataset and benchmark challenge. Ego-Exo4D centers around
simultaneously-captured egocentric and exocentric video of skilled human
activities (e.g., sports, music, dance, bike repair). 740 participants from 13
cities worldwide performed these activities in 123 different natural scene
contexts, yielding long-form captures from 1 to 42 minutes each and 1,286 hours
of video combined. The multimodal nature of the dataset is unprecedented: the
video is accompanied by multichannel audio, eye gaze, 3D point clouds, camera
poses, IMU, and multiple paired language descriptions -- including a novel
"expert commentary" done by coaches and teachers and tailored to the
skilled-activity domain. To push the frontier of first-person video
understanding of skilled human activity, we also present a suite of benchmark
tasks and their annotations, including fine-grained activity understanding,
proficiency estimation, cross-view translation, and 3D hand/body pose. All
resources are open sourced to fuel new research in the community. Project page:
http://ego-exo4d-data.org/ |
doi_str_mv | 10.48550/arxiv.2311.18259 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2311_18259</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2311_18259</sourcerecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives</title><source>arXiv.org</source><identifier>DOI: 10.48550/arxiv.2311.18259</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2023-11</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa></display><links><linktorsrc>$$Uhttps://arxiv.org/abs/2311.18259$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2311.18259$$DView paper in arXiv$$Hfree_for_read</backlink></links><addata><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives</atitle><date>2023-11-30</date><risdate>2023</risdate><doi>10.48550/arxiv.2311.18259</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2311.18259 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2311_18259 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition |
title | Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T06%3A15%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Ego-Exo4D:%20Understanding%20Skilled%20Human%20Activity%20from%20First-%20and%20Third-Person%20Perspectives&rft.au=Grauman,%20Kristen&rft.date=2023-11-30&rft_id=info:doi/10.48550/arxiv.2311.18259&rft_dat=%3Carxiv_GOX%3E2311_18259%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |