Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines
Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. H...
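Published in: | Machine vision and applications 2025-01, Vol.36 (1), p.19, Article 19 |
---|---|
Main authors: | Sturm, Fabian; Trat, Martin; Sathiyababu, Rahul; Allipilli, Harshitha; Menz, Benjamin; Hergenroether, Elke; Siegel, Melanie |
Format: | Article |
Language: | eng |
Subjects: | Activity recognition; Assembly lines; Corporate learning; Deep learning; Hand (anatomy); Labels; Machine learning; Moving object recognition; Regression models; Representations; Robustness; Self-supervised learning |
Online access: | Full text |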
container_end_page | |
---|---|
container_issue | 1 |
container_start_page | 19 |
container_title | Machine vision and applications |
container_volume | 36 |
creator | Sturm, Fabian; Trat, Martin; Sathiyababu, Rahul; Allipilli, Harshitha; Menz, Benjamin; Hergenroether, Elke; Siegel, Melanie |
description | Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. However, deep-learning-based hand action recognition methods are very label-intensive, and not all industrial companies can afford the associated labeling costs. This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. The work outlines which well-known real-world datasets are best suited for representation learning of such hand actions in a regression task, and to what extent they improve the subsequent supervised classification task. This fine-tuning is supplemented by concept drift detection, which makes the resulting models deployed in production more robust against concept drift and future changes in assembly movements. |
doi_str_mv | 10.1007/s00138-024-01638-9 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3143455575</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3143455575</sourcerecordid><originalsourceid>FETCH-LOGICAL-c156t-3a7599c28f234095076e2b04115c632771b5954adfe4e0f88af59272b62356983</originalsourceid><addsrcrecordid>eNotkEtLAzEUhYMoWKt_wFXAdfTmNZkspfiCggt1HTIzSZsyzdRkRqi_3rQVLpyzOOdc-BC6pXBPAdRDBqC8JsAEAVoVp8_QjArOCFWVPkcz0MXXoNklusp5AwBCKTFDvx-u9yRPO5d-QnYdTm6XXHZxtGMYIu6dTTHEFfZDwmlopjxiH6Ijq2SLdHg9bW3Eaxs7bNtjJbl2WMVw9OFwXSmlYHtsc3bbpt_jvlTzNbrwts_u5l_n6Ov56XPxSpbvL2-LxyVpqaxGwq2SWres9owL0BJU5VgDglLZVpwpRRuppbCdd8KBr2vrpWaKNRXjstI1n6O70-4uDd-Ty6PZDFOK5aXhBZGQUipZUuyUatOQc3Le7FLY2rQ3FMyBsTkxNoWxOTI2mv8BtY1wqQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3143455575</pqid></control><display><type>article</type><title>Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines</title><source>Springer Nature - Complete Springer Journals</source><creator>Sturm, Fabian ; Trat, Martin ; Sathiyababu, Rahul ; Allipilli, Harshitha ; Menz, Benjamin ; Hergenroether, Elke ; Siegel, Melanie</creator><creatorcontrib>Sturm, Fabian ; Trat, Martin ; Sathiyababu, Rahul ; Allipilli, Harshitha ; Menz, Benjamin ; Hergenroether, Elke ; Siegel, Melanie</creatorcontrib><description>Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. However, these deep-learning-based hand action recognition methods are very label intensive, which cannot be offered by all industrial companies due to the associated costs. 
This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. Well-known real-world datasets best suited for representation learning of such hand actions in a regression tasks are outlined and to what extent they optimize the subsequent supervised trained classification task. This subsequent fine-tuning is supplemented by concept drift detection, which makes the resulting productively employed models more robust against concept drift and future changing assembly movements.</description><identifier>ISSN: 0932-8092</identifier><identifier>EISSN: 1432-1769</identifier><identifier>DOI: 10.1007/s00138-024-01638-9</identifier><language>eng</language><publisher>New York: Springer Nature B.V</publisher><subject>Activity recognition ; Assembly lines ; Corporate learning ; Deep learning ; Hand (anatomy) ; Labels ; Machine learning ; Moving object recognition ; Regression models ; Representations ; Robustness ; Self-supervised learning</subject><ispartof>Machine vision and applications, 2025-01, Vol.36 (1), p.19, Article 19</ispartof><rights>Copyright Springer Nature B.V. 
Jan 2025</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c156t-3a7599c28f234095076e2b04115c632771b5954adfe4e0f88af59272b62356983</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,777,781,27905,27906</link.rule.ids></links><search><creatorcontrib>Sturm, Fabian</creatorcontrib><creatorcontrib>Trat, Martin</creatorcontrib><creatorcontrib>Sathiyababu, Rahul</creatorcontrib><creatorcontrib>Allipilli, Harshitha</creatorcontrib><creatorcontrib>Menz, Benjamin</creatorcontrib><creatorcontrib>Hergenroether, Elke</creatorcontrib><creatorcontrib>Siegel, Melanie</creatorcontrib><title>Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines</title><title>Machine vision and applications</title><description>Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. However, these deep-learning-based hand action recognition methods are very label intensive, which cannot be offered by all industrial companies due to the associated costs. This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. 
Well-known real-world datasets best suited for representation learning of such hand actions in a regression tasks are outlined and to what extent they optimize the subsequent supervised trained classification task. This subsequent fine-tuning is supplemented by concept drift detection, which makes the resulting productively employed models more robust against concept drift and future changing assembly movements.</description><subject>Activity recognition</subject><subject>Assembly lines</subject><subject>Corporate learning</subject><subject>Deep learning</subject><subject>Hand (anatomy)</subject><subject>Labels</subject><subject>Machine learning</subject><subject>Moving object recognition</subject><subject>Regression models</subject><subject>Representations</subject><subject>Robustness</subject><subject>Self-supervised learning</subject><issn>0932-8092</issn><issn>1432-1769</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNotkEtLAzEUhYMoWKt_wFXAdfTmNZkspfiCggt1HTIzSZsyzdRkRqi_3rQVLpyzOOdc-BC6pXBPAdRDBqC8JsAEAVoVp8_QjArOCFWVPkcz0MXXoNklusp5AwBCKTFDvx-u9yRPO5d-QnYdTm6XXHZxtGMYIu6dTTHEFfZDwmlopjxiH6Ijq2SLdHg9bW3Eaxs7bNtjJbl2WMVw9OFwXSmlYHtsc3bbpt_jvlTzNbrwts_u5l_n6Ov56XPxSpbvL2-LxyVpqaxGwq2SWres9owL0BJU5VgDglLZVpwpRRuppbCdd8KBr2vrpWaKNRXjstI1n6O70-4uDd-Ty6PZDFOK5aXhBZGQUipZUuyUatOQc3Le7FLY2rQ3FMyBsTkxNoWxOTI2mv8BtY1wqQ</recordid><startdate>202501</startdate><enddate>202501</enddate><creator>Sturm, Fabian</creator><creator>Trat, Martin</creator><creator>Sathiyababu, Rahul</creator><creator>Allipilli, Harshitha</creator><creator>Menz, Benjamin</creator><creator>Hergenroether, Elke</creator><creator>Siegel, Melanie</creator><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>202501</creationdate><title>Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly 
lines</title><author>Sturm, Fabian ; Trat, Martin ; Sathiyababu, Rahul ; Allipilli, Harshitha ; Menz, Benjamin ; Hergenroether, Elke ; Siegel, Melanie</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c156t-3a7599c28f234095076e2b04115c632771b5954adfe4e0f88af59272b62356983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Activity recognition</topic><topic>Assembly lines</topic><topic>Corporate learning</topic><topic>Deep learning</topic><topic>Hand (anatomy)</topic><topic>Labels</topic><topic>Machine learning</topic><topic>Moving object recognition</topic><topic>Regression models</topic><topic>Representations</topic><topic>Robustness</topic><topic>Self-supervised learning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sturm, Fabian</creatorcontrib><creatorcontrib>Trat, Martin</creatorcontrib><creatorcontrib>Sathiyababu, Rahul</creatorcontrib><creatorcontrib>Allipilli, Harshitha</creatorcontrib><creatorcontrib>Menz, Benjamin</creatorcontrib><creatorcontrib>Hergenroether, Elke</creatorcontrib><creatorcontrib>Siegel, Melanie</creatorcontrib><collection>CrossRef</collection><jtitle>Machine vision and applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sturm, Fabian</au><au>Trat, Martin</au><au>Sathiyababu, Rahul</au><au>Allipilli, Harshitha</au><au>Menz, Benjamin</au><au>Hergenroether, Elke</au><au>Siegel, Melanie</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines</atitle><jtitle>Machine vision and 
applications</jtitle><date>2025-01</date><risdate>2025</risdate><volume>36</volume><issue>1</issue><spage>19</spage><pages>19-</pages><artnum>19</artnum><issn>0932-8092</issn><eissn>1432-1769</eissn><abstract>Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. However, these deep-learning-based hand action recognition methods are very label intensive, which cannot be offered by all industrial companies due to the associated costs. This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. Well-known real-world datasets best suited for representation learning of such hand actions in a regression tasks are outlined and to what extent they optimize the subsequent supervised trained classification task. This subsequent fine-tuning is supplemented by concept drift detection, which makes the resulting productively employed models more robust against concept drift and future changing assembly movements.</abstract><cop>New York</cop><pub>Springer Nature B.V</pub><doi>10.1007/s00138-024-01638-9</doi></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0932-8092 |
ispartof | Machine vision and applications, 2025-01, Vol.36 (1), p.19, Article 19 |
issn | 0932-8092; 1432-1769 |
language | eng |
recordid | cdi_proquest_journals_3143455575 |
source | Springer Nature - Complete Springer Journals |
subjects | Activity recognition; Assembly lines; Corporate learning; Deep learning; Hand (anatomy); Labels; Machine learning; Moving object recognition; Regression models; Representations; Robustness; Self-supervised learning |
title | Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T10%3A17%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Self-supervised%20representation%20learning%20for%20robust%20fine-grained%20human%20hand%20action%20recognition%20in%20industrial%20assembly%20lines&rft.jtitle=Machine%20vision%20and%20applications&rft.au=Sturm,%20Fabian&rft.date=2025-01&rft.volume=36&rft.issue=1&rft.spage=19&rft.pages=19-&rft.artnum=19&rft.issn=0932-8092&rft.eissn=1432-1769&rft_id=info:doi/10.1007/s00138-024-01638-9&rft_dat=%3Cproquest_cross%3E3143455575%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3143455575&rft_id=info:pmid/&rfr_iscdi=true |