Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata

We propose a novel method for predicting image labels by fusing image content descriptors with the social media context of each image. An image uploaded to a social media site such as Flickr often has meaningful, associated information, such as comments and other images the user has uploaded, that i...

Detailed description

Bibliographic details
Published in:	arXiv.org 2018-01
Main authors:	Long, Chengjiang; Collins, Roddy; Swears, Eran; Hoogs, Anthony
Format:	Article
Language:	eng
Subjects:	Artificial neural networks; Digital media; Labels; Mathematical models; Metadata; Neural networks; Optimization; Pixels; Predictions; Recurrent neural networks; Social networks
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Long, Chengjiang; Collins, Roddy; Swears, Eran; Hoogs, Anthony
description	We propose a novel method for predicting image labels by fusing image content descriptors with the social media context of each image. An image uploaded to a social media site such as Flickr often has meaningful associated information, such as comments and other images the user has uploaded, that is complementary to pixel content and helpful in predicting labels. Prediction challenges such as ImageNet \cite{imagenet_cvpr09} and MSCOCO \cite{LinMBHPRDZ:ECCV14} use only pixels, while other methods make predictions purely from social media context \cite{McAuleyECCV12}. Our method is based on a novel fully connected Conditional Random Field (CRF) framework in which each node is an image. The framework consists of two deep Convolutional Neural Networks (CNNs) and one Recurrent Neural Network (RNN) that model both textual and visual node/image information. The edge weights of the CRF graph represent textual similarity and link-based metadata such as user sets and image groups. We model the CRF as an RNN for both learning and inference, and incorporate a weighted ranking loss and a cross-entropy loss into the CRF parameter optimization to handle training data imbalance. Our proposed approach is evaluated on the MIR-9K dataset and experimentally outperforms current state-of-the-art approaches.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2071293445</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2071293445</sourcerecordid><originalsourceid>FETCH-proquest_journals_20712934453</originalsourceid><addsrcrecordid>eNqNi0ELgjAYQEcQJOV_-KCzoJtmnS1JqA4VdOggS79MW5ttk-jf5yHo2ukd3nsD4lDGAm8eUjoirjGN7_t0FtMoYg45LxFb2GGnuehhX0rfDWQS0k6INyRKSiwslpDsU0iVhuzBK4QNv6CoZQWn2t7goIr6t8MWLS-55RMyvHJh0P1yTKbp6pisvVarZ4fG5o3qtOxVTv04oAsWhhH7r_oAjplBZw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2071293445</pqid></control><display><type>article</type><title>Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata</title><source>Free E- Journals</source><creator>Long, Chengjiang ; Collins, Roddy ; Swears, Eran ; Hoogs, Anthony</creator><creatorcontrib>Long, Chengjiang ; Collins, Roddy ; Swears, Eran ; Hoogs, Anthony</creatorcontrib><description>We propose a novel method for predicting image labels by fusing image content descriptors with the social media context of each image. An image uploaded to a social media site such as Flickr often has meaningful, associated information, such as comments and other images the user has uploaded, that is complementary to pixel content and helpful in predicting labels. Prediction challenges such as ImageNet~\cite{imagenet_cvpr09} and MSCOCO~\cite{LinMBHPRDZ:ECCV14} use only pixels, while other methods make predictions purely from social media context \cite{McAuleyECCV12}. Our method is based on a novel fully connected Conditional Random Field (CRF) framework, where each node is an image, and consists of two deep Convolutional Neural Networks (CNN) and one Recurrent Neural Network (RNN) that model both textual and visual node/image information. 
The edge weights of the CRF graph represent textual similarity and link-based metadata such as user sets and image groups. We model the CRF as an RNN for both learning and inference, and incorporate the weighted ranking loss and cross entropy loss into the CRF parameter optimization to handle the training data imbalance issue. Our proposed approach is evaluated on the MIR-9K dataset and experimentally outperforms current state-of-the-art approaches.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Artificial neural networks ; Digital media ; Labels ; Mathematical models ; Metadata ; Neural networks ; Optimization ; Pixels ; Predictions ; Recurrent neural networks ; Social networks</subject><ispartof>arXiv.org, 2018-01</ispartof><rights>2018. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Long, Chengjiang</creatorcontrib><creatorcontrib>Collins, Roddy</creatorcontrib><creatorcontrib>Swears, Eran</creatorcontrib><creatorcontrib>Hoogs, Anthony</creatorcontrib><title>Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata</title><title>arXiv.org</title><description>We propose a novel method for predicting image labels by fusing image content descriptors with the social media context of each image. 
An image uploaded to a social media site such as Flickr often has meaningful, associated information, such as comments and other images the user has uploaded, that is complementary to pixel content and helpful in predicting labels. Prediction challenges such as ImageNet~\cite{imagenet_cvpr09} and MSCOCO~\cite{LinMBHPRDZ:ECCV14} use only pixels, while other methods make predictions purely from social media context \cite{McAuleyECCV12}. Our method is based on a novel fully connected Conditional Random Field (CRF) framework, where each node is an image, and consists of two deep Convolutional Neural Networks (CNN) and one Recurrent Neural Network (RNN) that model both textual and visual node/image information. The edge weights of the CRF graph represent textual similarity and link-based metadata such as user sets and image groups. We model the CRF as an RNN for both learning and inference, and incorporate the weighted ranking loss and cross entropy loss into the CRF parameter optimization to handle the training data imbalance issue. 
Our proposed approach is evaluated on the MIR-9K dataset and experimentally outperforms current state-of-the-art approaches.</description><subject>Artificial neural networks</subject><subject>Digital media</subject><subject>Labels</subject><subject>Mathematical models</subject><subject>Metadata</subject><subject>Neural networks</subject><subject>Optimization</subject><subject>Pixels</subject><subject>Predictions</subject><subject>Recurrent neural networks</subject><subject>Social networks</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNi0ELgjAYQEcQJOV_-KCzoJtmnS1JqA4VdOggS79MW5ttk-jf5yHo2ukd3nsD4lDGAm8eUjoirjGN7_t0FtMoYg45LxFb2GGnuehhX0rfDWQS0k6INyRKSiwslpDsU0iVhuzBK4QNv6CoZQWn2t7goIr6t8MWLS-55RMyvHJh0P1yTKbp6pisvVarZ4fG5o3qtOxVTv04oAsWhhH7r_oAjplBZw</recordid><startdate>20180127</startdate><enddate>20180127</enddate><creator>Long, Chengjiang</creator><creator>Collins, Roddy</creator><creator>Swears, Eran</creator><creator>Hoogs, Anthony</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20180127</creationdate><title>Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata</title><author>Long, Chengjiang ; Collins, Roddy ; Swears, Eran ; Hoogs, 
Anthony</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_20712934453</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Artificial neural networks</topic><topic>Digital media</topic><topic>Labels</topic><topic>Mathematical models</topic><topic>Metadata</topic><topic>Neural networks</topic><topic>Optimization</topic><topic>Pixels</topic><topic>Predictions</topic><topic>Recurrent neural networks</topic><topic>Social networks</topic><toplevel>online_resources</toplevel><creatorcontrib>Long, Chengjiang</creatorcontrib><creatorcontrib>Collins, Roddy</creatorcontrib><creatorcontrib>Swears, Eran</creatorcontrib><creatorcontrib>Hoogs, Anthony</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Database (Proquest)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Long, Chengjiang</au><au>Collins, Roddy</au><au>Swears, Eran</au><au>Hoogs, 
Anthony</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata</atitle><jtitle>arXiv.org</jtitle><date>2018-01-27</date><risdate>2018</risdate><eissn>2331-8422</eissn><abstract>We propose a novel method for predicting image labels by fusing image content descriptors with the social media context of each image. An image uploaded to a social media site such as Flickr often has meaningful, associated information, such as comments and other images the user has uploaded, that is complementary to pixel content and helpful in predicting labels. Prediction challenges such as ImageNet~\cite{imagenet_cvpr09} and MSCOCO~\cite{LinMBHPRDZ:ECCV14} use only pixels, while other methods make predictions purely from social media context \cite{McAuleyECCV12}. Our method is based on a novel fully connected Conditional Random Field (CRF) framework, where each node is an image, and consists of two deep Convolutional Neural Networks (CNN) and one Recurrent Neural Network (RNN) that model both textual and visual node/image information. The edge weights of the CRF graph represent textual similarity and link-based metadata such as user sets and image groups. We model the CRF as an RNN for both learning and inference, and incorporate the weighted ranking loss and cross entropy loss into the CRF parameter optimization to handle the training data imbalance issue. Our proposed approach is evaluated on the MIR-9K dataset and experimentally outperforms current state-of-the-art approaches.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2018-01
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2071293445
source	Free E- Journals
subjects	Artificial neural networks; Digital media; Labels; Mathematical models; Metadata; Neural networks; Optimization; Pixels; Predictions; Recurrent neural networks; Social networks
title	Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T20%3A48%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Deep%20Neural%20Networks%20In%20Fully%20Connected%20CRF%20For%20Image%20Labeling%20With%20Social%20Network%20Metadata&rft.jtitle=arXiv.org&rft.au=Long,%20Chengjiang&rft.date=2018-01-27&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2071293445%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2071293445&rft_id=info:pmid/&rfr_iscdi=true