Accuracy and Performance Comparison of Video Action Recognition Approaches

Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparison of accuracy and computational performance results remains clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and infere...

Bibliographic details
Published in: arXiv.org 2020-08
Main authors: Hutchinson, Matthew, Samsi, Siddharth, Arcand, William, Bestor, David, Bergeron, Bill, Byun, Chansup, Houle, Micheal, Hubbell, Matthew, Jones, Micheal, Kepner, Jeremy, Kirby, Andrew, Michaleas, Peter, Milechin, Lauren, Mullen, Julie, Prout, Andrew, Rosa, Antonio, Reuther, Albert, Yee, Charles, Gadepally, Vijay
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Hutchinson, Matthew
Samsi, Siddharth
Arcand, William
Bestor, David
Bergeron, Bill
Byun, Chansup
Houle, Micheal
Hubbell, Matthew
Jones, Micheal
Kepner, Jeremy
Kirby, Andrew
Michaleas, Peter
Milechin, Lauren
Mullen, Julie
Prout, Andrew
Rosa, Antonio
Reuther, Albert
Yee, Charles
Gadepally, Vijay
description Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparison of accuracy and computational performance results remains clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. Additionally, we compare computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.
doi_str_mv 10.48550/arxiv.2008.09037
format Article
fullrecord <record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_2008_09037</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2435940455</sourcerecordid><originalsourceid>FETCH-LOGICAL-a525-513e26985190abe9c766e1124f112fe140375b03b27fe28a27aabfbe0ac599273</originalsourceid><addsrcrecordid>eNotj01Lw0AQhhdBsNT-AE8GPKfOzu4m2WMofiIoUryGyXZWU2w2blqx_9619TIzL7wMzyPEhYS5royBa4o_3fccAao5WFDliZigUjKvNOKZmI3jGgCwKNEYNRGPtXO7SG6fUb_KXjj6EDfUO84WYTNQ7MbQZ8Fnb92KQ1a7bZfyK7vw3neHux6GGMh98HguTj19jjz731OxvL1ZLu7zp-e7h0X9lJNBkxupGAtbGWmBWrauLAqWErVPw7PUCdm0oFosPWNFWBK1vmUgZ6zFUk3F5fHtQbQZYrehuG_-hJuDcGpcHRuJ7GvH47ZZh13sE1ODWhmrQSf3X5JnV9s</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2435940455</pqid></control><display><type>article</type><title>Accuracy and Performance Comparison of Video Action Recognition Approaches</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Hutchinson, Matthew ; Samsi, Siddharth ; Arcand, William ; Bestor, David ; Bergeron, Bill ; Byun, Chansup ; Houle, Micheal ; Hubbell, Matthew ; Jones, Micheal ; Kepner, Jeremy ; Kirby, Andrew ; Michaleas, Peter ; Milechin, Lauren ; Mullen, Julie ; Prout, Andrew ; Rosa, Antonio ; Reuther, Albert ; Yee, Charles ; Gadepally, Vijay</creator><creatorcontrib>Hutchinson, Matthew ; Samsi, Siddharth ; Arcand, William ; Bestor, David ; Bergeron, Bill ; Byun, Chansup ; Houle, Micheal ; Hubbell, Matthew ; Jones, Micheal ; Kepner, Jeremy ; Kirby, Andrew ; Michaleas, Peter ; Milechin, Lauren ; Mullen, Julie ; Prout, Andrew ; Rosa, Antonio ; Reuther, Albert ; Yee, Charles ; Gadepally, Vijay</creatorcontrib><description>Over the past few years, there has been significant interest in video action recognition systems and models. 
However, direct comparison of accuracy and computational performance results remain clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. Additionally, we compare computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2008.09037</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Accuracy ; Algorithms ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Learning ; Computer Science - Performance ; Model accuracy ; Recognition ; Training</subject><ispartof>arXiv.org, 2020-08</ispartof><rights>2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,784,885,27925</link.rule.ids><backlink>$$Uhttps://doi.org/10.48550/arXiv.2008.09037$$DView paper in arXiv$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.1109/HPEC43674.2020.9286249$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink></links><search><creatorcontrib>Hutchinson, Matthew</creatorcontrib><creatorcontrib>Samsi, Siddharth</creatorcontrib><creatorcontrib>Arcand, William</creatorcontrib><creatorcontrib>Bestor, David</creatorcontrib><creatorcontrib>Bergeron, Bill</creatorcontrib><creatorcontrib>Byun, Chansup</creatorcontrib><creatorcontrib>Houle, Micheal</creatorcontrib><creatorcontrib>Hubbell, Matthew</creatorcontrib><creatorcontrib>Jones, Micheal</creatorcontrib><creatorcontrib>Kepner, Jeremy</creatorcontrib><creatorcontrib>Kirby, Andrew</creatorcontrib><creatorcontrib>Michaleas, Peter</creatorcontrib><creatorcontrib>Milechin, Lauren</creatorcontrib><creatorcontrib>Mullen, Julie</creatorcontrib><creatorcontrib>Prout, Andrew</creatorcontrib><creatorcontrib>Rosa, Antonio</creatorcontrib><creatorcontrib>Reuther, Albert</creatorcontrib><creatorcontrib>Yee, Charles</creatorcontrib><creatorcontrib>Gadepally, Vijay</creatorcontrib><title>Accuracy and Performance Comparison of Video Action Recognition Approaches</title><title>arXiv.org</title><description>Over the past few years, there has been significant interest in video action recognition systems and models. 
However, direct comparison of accuracy and computational performance results remain clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. Additionally, we compare computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Learning</subject><subject>Computer Science - Performance</subject><subject>Model accuracy</subject><subject>Recognition</subject><subject>Training</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GOX</sourceid><recordid>eNotj01Lw0AQhhdBsNT-AE8GPKfOzu4m2WMofiIoUryGyXZWU2w2blqx_9619TIzL7wMzyPEhYS5royBa4o_3fccAao5WFDliZigUjKvNOKZmI3jGgCwKNEYNRGPtXO7SG6fUb_KXjj6EDfUO84WYTNQ7MbQZ8Fnb92KQ1a7bZfyK7vw3neHux6GGMh98HguTj19jjz731OxvL1ZLu7zp-e7h0X9lJNBkxupGAtbGWmBWrauLAqWErVPw7PUCdm0oFosPWNFWBK1vmUgZ6zFUk3F5fHtQbQZYrehuG_-hJuDcGpcHRuJ7GvH47ZZh13sE1ODWhmrQSf3X5JnV9s</recordid><startdate>20200820</startdate><enddate>20200820</enddate><creator>Hutchinson, Matthew</creator><creator>Samsi, Siddharth</creator><creator>Arcand, William</creator><creator>Bestor, David</creator><creator>Bergeron, 
Bill</creator><creator>Byun, Chansup</creator><creator>Houle, Micheal</creator><creator>Hubbell, Matthew</creator><creator>Jones, Micheal</creator><creator>Kepner, Jeremy</creator><creator>Kirby, Andrew</creator><creator>Michaleas, Peter</creator><creator>Milechin, Lauren</creator><creator>Mullen, Julie</creator><creator>Prout, Andrew</creator><creator>Rosa, Antonio</creator><creator>Reuther, Albert</creator><creator>Yee, Charles</creator><creator>Gadepally, Vijay</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20200820</creationdate><title>Accuracy and Performance Comparison of Video Action Recognition Approaches</title><author>Hutchinson, Matthew ; Samsi, Siddharth ; Arcand, William ; Bestor, David ; Bergeron, Bill ; Byun, Chansup ; Houle, Micheal ; Hubbell, Matthew ; Jones, Micheal ; Kepner, Jeremy ; Kirby, Andrew ; Michaleas, Peter ; Milechin, Lauren ; Mullen, Julie ; Prout, Andrew ; Rosa, Antonio ; Reuther, Albert ; Yee, Charles ; Gadepally, Vijay</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a525-513e26985190abe9c766e1124f112fe140375b03b27fe28a27aabfbe0ac599273</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Learning</topic><topic>Computer Science - Performance</topic><topic>Model 
accuracy</topic><topic>Recognition</topic><topic>Training</topic><toplevel>online_resources</toplevel><creatorcontrib>Hutchinson, Matthew</creatorcontrib><creatorcontrib>Samsi, Siddharth</creatorcontrib><creatorcontrib>Arcand, William</creatorcontrib><creatorcontrib>Bestor, David</creatorcontrib><creatorcontrib>Bergeron, Bill</creatorcontrib><creatorcontrib>Byun, Chansup</creatorcontrib><creatorcontrib>Houle, Micheal</creatorcontrib><creatorcontrib>Hubbell, Matthew</creatorcontrib><creatorcontrib>Jones, Micheal</creatorcontrib><creatorcontrib>Kepner, Jeremy</creatorcontrib><creatorcontrib>Kirby, Andrew</creatorcontrib><creatorcontrib>Michaleas, Peter</creatorcontrib><creatorcontrib>Milechin, Lauren</creatorcontrib><creatorcontrib>Mullen, Julie</creatorcontrib><creatorcontrib>Prout, Andrew</creatorcontrib><creatorcontrib>Rosa, Antonio</creatorcontrib><creatorcontrib>Reuther, Albert</creatorcontrib><creatorcontrib>Yee, Charles</creatorcontrib><creatorcontrib>Gadepally, Vijay</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central 
China</collection><collection>Engineering collection</collection><collection>arXiv Computer Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hutchinson, Matthew</au><au>Samsi, Siddharth</au><au>Arcand, William</au><au>Bestor, David</au><au>Bergeron, Bill</au><au>Byun, Chansup</au><au>Houle, Micheal</au><au>Hubbell, Matthew</au><au>Jones, Micheal</au><au>Kepner, Jeremy</au><au>Kirby, Andrew</au><au>Michaleas, Peter</au><au>Milechin, Lauren</au><au>Mullen, Julie</au><au>Prout, Andrew</au><au>Rosa, Antonio</au><au>Reuther, Albert</au><au>Yee, Charles</au><au>Gadepally, Vijay</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Accuracy and Performance Comparison of Video Action Recognition Approaches</atitle><jtitle>arXiv.org</jtitle><date>2020-08-20</date><risdate>2020</risdate><eissn>2331-8422</eissn><abstract>Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparison of accuracy and computational performance results remain clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. 
Additionally, we compare computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2008.09037</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2020-08
issn 2331-8422
language eng
recordid cdi_arxiv_primary_2008_09037
source arXiv.org; Free E- Journals
subjects Accuracy
Algorithms
Computer Science - Computer Vision and Pattern Recognition
Computer Science - Learning
Computer Science - Performance
Model accuracy
Recognition
Training
title Accuracy and Performance Comparison of Video Action Recognition Approaches
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T11%3A58%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Accuracy%20and%20Performance%20Comparison%20of%20Video%20Action%20Recognition%20Approaches&rft.jtitle=arXiv.org&rft.au=Hutchinson,%20Matthew&rft.date=2020-08-20&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2008.09037&rft_dat=%3Cproquest_arxiv%3E2435940455%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2435940455&rft_id=info:pmid/&rfr_iscdi=true