Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net

Models applied to real-time response tasks, such as click-through rate (CTR) prediction, require both high accuracy and a strict response time. Top-performing deep models of great depth and complexity are therefore not well suited to these applications, given the limits on inference time. To further improve a neural network's performance under time and computational constraints, we propose an approach that exploits a cumbersome net to help train a lightweight net for prediction. We call the whole process rocket launching: the cumbersome booster net guides the learning of the target light net throughout the entire training process. We analyze different loss functions aimed at pushing the light net to behave similarly to the booster net, and adopt the loss with the best performance in our experiments. We also use a technique called gradient block to further improve the performance of the light net and the booster net. Experiments on benchmark datasets and real-life industrial advertisement data show that our light model achieves performance previously attainable only with more complex models.
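
Below is a minimal sketch of the co-training scheme the abstract describes, written in PyTorch under stated assumptions: the layer sizes, the names RocketPair and rocket_loss, and the squared-difference hint loss are illustrative choices, not the authors' exact architecture or objective; the detach() call stands in for the gradient block, i.e. the hint term updates only the light net.

import torch.nn as nn
import torch.nn.functional as F

class RocketPair(nn.Module):
    """Booster net and light net trained jointly; only the light net is served."""
    def __init__(self, in_dim, hidden, num_classes):
        super().__init__()
        self.shared = nn.Linear(in_dim, hidden)            # low-level layer shared by both nets
        self.light_head = nn.Linear(hidden, num_classes)   # small head kept for online inference
        self.booster_head = nn.Sequential(                 # larger head used only during training
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes))

    def forward(self, x):
        h = F.relu(self.shared(x))
        return self.light_head(h), self.booster_head(h)

def rocket_loss(light_logits, booster_logits, targets, hint_weight=1.0):
    ce_light = F.cross_entropy(light_logits, targets)      # light net fits the labels
    ce_booster = F.cross_entropy(booster_logits, targets)  # booster net fits the labels
    # Hint term: push the light net's logits toward the booster's.
    # detach() blocks the gradient so the hint loss never alters the booster net.
    hint = F.mse_loss(light_logits, booster_logits.detach())
    return ce_light + ce_booster + hint_weight * hint

At serving time only the shared layer and the light head are evaluated, so inference cost stays at the light net's level while training benefits from the booster's guidance.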


Bibliographic Details
Main Authors: Zhou, Guorui; Fan, Ying; Cui, Runpeng; Bian, Weijie; Zhu, Xiaoqiang; Gai, Kun
Format: Article
Language: English
Published: 2017-08-14
Subjects: Computer Science - Learning; Statistics - Machine Learning
DOI: 10.48550/arxiv.1708.04106
Source: arXiv.org
Online Access: https://arxiv.org/abs/1708.04106