Parallel Grid Pooling for Data Augmentation
Convolutional neural network (CNN) architectures utilize downsampling layers, which restrict the subsequent layers to learn spatially invariant features while reducing computational costs. However, such a downsampling operation makes it impossible to use the full spectrum of input features. Motivate...
Main authors: | Takeki, Akito ; Ikami, Daiki ; Irie, Go ; Aizawa, Kiyoharu |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Takeki, Akito ; Ikami, Daiki ; Irie, Go ; Aizawa, Kiyoharu |
description | Convolutional neural network (CNN) architectures utilize downsampling layers,
which restrict the subsequent layers to learn spatially invariant features
while reducing computational costs. However, such a downsampling operation
makes it impossible to use the full spectrum of input features. Motivated by
this observation, we propose a novel layer called parallel grid pooling (PGP)
which is applicable to various CNN models. PGP performs downsampling without
discarding any intermediate feature. It works as data augmentation and is
complementary to commonly used data augmentation techniques. Furthermore, we
demonstrate that a dilated convolution can naturally be represented using PGP
operations, which suggests that the dilated convolution can also be regarded as
a type of data augmentation technique. Experimental results based on popular
image classification benchmarks demonstrate the effectiveness of the proposed
method. Code is available at: https://github.com/akitotakeki |
doi_str_mv | 10.48550/arxiv.1803.11370 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1803_11370</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1803_11370</sourcerecordid><originalsourceid>FETCH-LOGICAL-a670-313f0772a7d8a1fe589ebeeb880a8bc5a46dfac853d9dc86896ab42b894538723</originalsourceid><addsrcrecordid>eNotzrsKwjAUgOEsDqI-gJPZpTVpmuZ0FO8g6OBeTppEArGVWEXfXrxM__bzETLmLM1BSjbD-PSPlAMTKedCsT6ZHjFiCDbQTfSGHts2-OZMXRvpEjuk8_v5YpsOO982Q9JzGG529O-AnNar02Kb7A-b3WK-T7BQLBFcOKZUhsoAcmcllFZbqwEYgq4l5oVxWIMUpjQ1FFAWqPNMQ5lLASoTAzL5bb_a6hr9BeOr-qirr1q8AYqSO8k</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Parallel Grid Pooling for Data Augmentation</title><source>arXiv.org</source><creator>Takeki, Akito ; Ikami, Daiki ; Irie, Go ; Aizawa, Kiyoharu</creator><creatorcontrib>Takeki, Akito ; Ikami, Daiki ; Irie, Go ; Aizawa, Kiyoharu</creatorcontrib><description>Convolutional neural network (CNN) architectures utilize downsampling layers,
which restrict the subsequent layers to learn spatially invariant features
while reducing computational costs. However, such a downsampling operation
makes it impossible to use the full spectrum of input features. Motivated by
this observation, we propose a novel layer called parallel grid pooling (PGP)
which is applicable to various CNN models. PGP performs downsampling without
discarding any intermediate feature. It works as data augmentation and is
complementary to commonly used data augmentation techniques. Furthermore, we
demonstrate that a dilated convolution can naturally be represented using PGP
operations, which suggests that the dilated convolution can also be regarded as
a type of data augmentation technique. Experimental results based on popular
image classification benchmarks demonstrate the effectiveness of the proposed
method. Code is available at: https://github.com/akitotakeki</description><identifier>DOI: 10.48550/arxiv.1803.11370</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2018-03</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1803.11370$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1803.11370$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Takeki, Akito</creatorcontrib><creatorcontrib>Ikami, Daiki</creatorcontrib><creatorcontrib>Irie, Go</creatorcontrib><creatorcontrib>Aizawa, Kiyoharu</creatorcontrib><title>Parallel Grid Pooling for Data Augmentation</title><description>Convolutional neural network (CNN) architectures utilize downsampling layers,
which restrict the subsequent layers to learn spatially invariant features
while reducing computational costs. However, such a downsampling operation
makes it impossible to use the full spectrum of input features. Motivated by
this observation, we propose a novel layer called parallel grid pooling (PGP)
which is applicable to various CNN models. PGP performs downsampling without
discarding any intermediate feature. It works as data augmentation and is
complementary to commonly used data augmentation techniques. Furthermore, we
demonstrate that a dilated convolution can naturally be represented using PGP
operations, which suggests that the dilated convolution can also be regarded as
a type of data augmentation technique. Experimental results based on popular
image classification benchmarks demonstrate the effectiveness of the proposed
method. Code is available at: https://github.com/akitotakeki</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotzrsKwjAUgOEsDqI-gJPZpTVpmuZ0FO8g6OBeTppEArGVWEXfXrxM__bzETLmLM1BSjbD-PSPlAMTKedCsT6ZHjFiCDbQTfSGHts2-OZMXRvpEjuk8_v5YpsOO982Q9JzGG529O-AnNar02Kb7A-b3WK-T7BQLBFcOKZUhsoAcmcllFZbqwEYgq4l5oVxWIMUpjQ1FFAWqPNMQ5lLASoTAzL5bb_a6hr9BeOr-qirr1q8AYqSO8k</recordid><startdate>20180330</startdate><enddate>20180330</enddate><creator>Takeki, Akito</creator><creator>Ikami, Daiki</creator><creator>Irie, Go</creator><creator>Aizawa, Kiyoharu</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20180330</creationdate><title>Parallel Grid Pooling for Data Augmentation</title><author>Takeki, Akito ; Ikami, Daiki ; Irie, Go ; Aizawa, Kiyoharu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a670-313f0772a7d8a1fe589ebeeb880a8bc5a46dfac853d9dc86896ab42b894538723</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Takeki, Akito</creatorcontrib><creatorcontrib>Ikami, Daiki</creatorcontrib><creatorcontrib>Irie, Go</creatorcontrib><creatorcontrib>Aizawa, Kiyoharu</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Takeki, Akito</au><au>Ikami, Daiki</au><au>Irie, Go</au><au>Aizawa, Kiyoharu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Parallel Grid Pooling for Data 
Augmentation</atitle><date>2018-03-30</date><risdate>2018</risdate><abstract>Convolutional neural network (CNN) architectures utilize downsampling layers,
which restrict the subsequent layers to learn spatially invariant features
while reducing computational costs. However, such a downsampling operation
makes it impossible to use the full spectrum of input features. Motivated by
this observation, we propose a novel layer called parallel grid pooling (PGP)
which is applicable to various CNN models. PGP performs downsampling without
discarding any intermediate feature. It works as data augmentation and is
complementary to commonly used data augmentation techniques. Furthermore, we
demonstrate that a dilated convolution can naturally be represented using PGP
operations, which suggests that the dilated convolution can also be regarded as
a type of data augmentation technique. Experimental results based on popular
image classification benchmarks demonstrate the effectiveness of the proposed
method. Code is available at: https://github.com/akitotakeki</abstract><doi>10.48550/arxiv.1803.11370</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.1803.11370 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_1803_11370 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | Parallel Grid Pooling for Data Augmentation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T13%3A32%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Parallel%20Grid%20Pooling%20for%20Data%20Augmentation&rft.au=Takeki,%20Akito&rft.date=2018-03-30&rft_id=info:doi/10.48550/arxiv.1803.11370&rft_dat=%3Carxiv_GOX%3E1803_11370%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
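
The description above states that PGP downsamples without discarding any intermediate feature and that a dilated convolution can be represented using PGP operations. The following is a minimal NumPy sketch of that idea for a single-channel 2D map, not the authors' implementation (their code is linked in the description). The function names `parallel_grid_pool`, `inverse_parallel_grid_pool`, and `conv2d_same` are hypothetical and chosen only for illustration, and the sketch assumes zero padding and an odd kernel size.

```python
import numpy as np


def parallel_grid_pool(x, s):
    """Split an (H, W) map into s*s shifted, strided sub-maps.

    Every input element lands in exactly one sub-map, so nothing is discarded.
    (Illustrative helper; the name is an assumption, not from the paper's code.)
    """
    return [x[i::s, j::s] for i in range(s) for j in range(s)]


def inverse_parallel_grid_pool(parts, s):
    """Interleave the s*s sub-maps back into a single map."""
    h = sum(parts[i * s].shape[0] for i in range(s))
    w = sum(parts[j].shape[1] for j in range(s))
    out = np.empty((h, w), dtype=parts[0].dtype)
    for i in range(s):
        for j in range(s):
            out[i::s, j::s] = parts[i * s + j]
    return out


def conv2d_same(x, k, dilation=1):
    """Zero-padded cross-correlation with an odd-sized kernel and a dilation."""
    kh, kw = k.shape
    ph, pw = dilation * (kh // 2), dilation * (kw // 2)
    xp = np.pad(x, ((ph, ph), (pw, pw)))  # constant zero padding
    out = np.zeros_like(x, dtype=float)
    for a in range(kh):
        for b in range(kw):
            out += k[a, b] * xp[a * dilation:a * dilation + x.shape[0],
                                b * dilation:b * dilation + x.shape[1]]
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((7, 6))
k = rng.standard_normal((3, 3))

# Dilated convolution applied directly to the full-resolution map.
direct = conv2d_same(x, k, dilation=2)

# Same result via PGP -> ordinary convolution with shared weights -> inverse PGP.
parts = parallel_grid_pool(x, 2)
via_pgp = inverse_parallel_grid_pool([conv2d_same(p, k) for p in parts], 2)

print(np.allclose(direct, via_pgp))
```

In this sketch the stride of the grid pooling equals the dilation rate, and the same kernel `k` is applied to every sub-map; under those assumptions the two computation paths agree element by element.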