DeePoint: Visual Pointing Recognition and Direction Estimation
In this paper, we realize automatic visual recognition and direction estimation of pointing. We introduce the first neural pointing understanding method based on two key contributions. The first is the introduction of a first-of-its-kind large-scale dataset for pointing recognition and direction estimation, which we refer to as the DP Dataset. DP Dataset consists of more than 2 million frames of 33 people pointing in various styles annotated for each frame with pointing timings and 3D directions. The second is DeePoint, a novel deep network model for joint recognition and 3D direction estimation of pointing. DeePoint is a Transformer-based network which fully leverages the spatio-temporal coordination of the body parts, not just the hands. Through extensive experiments, we demonstrate the accuracy and efficiency of DeePoint. We believe DP Dataset and DeePoint will serve as a sound foundation for visual human intention understanding.
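The abstract describes a Transformer that jointly recognizes pointing and regresses a 3D direction by attending over body-part tokens across time. The sketch below is only a minimal illustration of that idea, assuming 2D keypoints as input; every class name, shape, and hyperparameter here is our own assumption, not taken from the paper or its code.

```python
# Hypothetical sketch of a DeePoint-style model: a Transformer over
# spatio-temporal body-joint tokens with two heads (pointing recognition
# and 3D direction). Names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointingTransformer(nn.Module):
    """Joint pointing recognition and 3D direction estimation.

    Input:  2D keypoints of shape (B, T, J, 2) -- T frames, J body joints.
    Output: a pointing/not-pointing logit and a unit 3D direction per clip.
    """
    def __init__(self, num_joints=17, max_frames=64, dim=128, depth=4, heads=4):
        super().__init__()
        self.embed = nn.Linear(2, dim)  # lift each 2D joint to a token
        # learned positional embeddings over joints and over time
        self.joint_pos = nn.Parameter(torch.zeros(1, 1, num_joints, dim))
        self.time_pos = nn.Parameter(torch.zeros(1, max_frames, 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.cls_head = nn.Linear(dim, 1)  # pointing / not pointing
        self.dir_head = nn.Linear(dim, 3)  # 3D pointing direction

    def forward(self, kps):
        B, T, J, _ = kps.shape
        x = self.embed(kps) + self.joint_pos + self.time_pos[:, :T]
        x = x.reshape(B, T * J, -1)      # one token per joint per frame
        x = self.encoder(x).mean(dim=1)  # pool over all space-time tokens
        logit = self.cls_head(x).squeeze(-1)
        direction = F.normalize(self.dir_head(x), dim=-1)
        return logit, direction

# toy usage: 2 clips of 32 frames with 17 joints each
logit, direction = PointingTransformer()(torch.randn(2, 32, 17, 2))
```

Letting every joint token attend to every other joint across frames is one plausible reading of "fully leverages the spatio-temporal coordination of the body parts"; the actual DeePoint architecture may differ in tokenization, pooling, and heads.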
Saved in:

| Published in: | arXiv.org 2023-09 |
|---|---|
| Main authors: | Nakamura, Shu; Kawanishi, Yasutomo; Nobuhara, Shohei; Nishino, Ko |
| Format: | Article |
| Language: | eng |
| Subjects: | Body parts; Datasets; Recognition |
| Online access: | Full text |
| Field | Value |
|---|---|
| container_title | arXiv.org |
| creator | Nakamura, Shu; Kawanishi, Yasutomo; Nobuhara, Shohei; Nishino, Ko |
| format | Article |
| fulltext | fulltext |
| identifier | EISSN: 2331-8422 |
| ispartof | arXiv.org, 2023-09 |
| issn | 2331-8422 |
| language | eng |
| recordid | cdi_proquest_journals_2802175014 |
| source | Free E-Journals |
| subjects | Body parts; Datasets; Recognition |
| title | DeePoint: Visual Pointing Recognition and Direction Estimation |
| url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T13%3A33%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=DeePoint:%20Visual%20Pointing%20Recognition%20and%20Direction%20Estimation&rft.jtitle=arXiv.org&rft.au=Nakamura,%20Shu&rft.date=2023-09-11&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2802175014%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2802175014&rft_id=info:pmid/&rfr_iscdi=true |