Parallelized User Clicks Recognition from Massive HTTP Data Based on Dependency Graph Model

With increasingly complex website structure and continuously advancing web technologies, accurate user clicks recognition from massive HTTP data, which is critical for web usage mining, becomes more difficult. In this paper, we propose a dependency graph model to describe the relationships between web re...

Detailed description

Bibliographic details
Published in: China communications, 2014-12, Vol.11 (12), p.13-25
Main authors: Fang, Cheng; Liu, Jun; Lei, Zhenming
Format: Article
Language: eng
container_end_page 25
container_issue 12
container_start_page 13
container_title China communications
container_volume 11
creator Fang, Cheng
Liu, Jun
Lei, Zhenming
description With increasingly complex website structure and continuously advancing web technologies, accurate user clicks recognition from massive HTTP data, which is critical for web usage mining, becomes more difficult. In this paper, we propose a dependency graph model to describe the relationships between web requests. Based on this model, we design and implement a heuristic parallel algorithm to distinguish user clicks with the assistance of cloud computing technology. We evaluate the proposed algorithm with real massive data. The size of the dataset collected from a mobile core network is 228.7 GB. It covers more than three million users. The experiment results demonstrate that the proposed algorithm can achieve higher accuracy than previous methods.
doi_str_mv 10.1109/CC.2014.7019836
format Article
fullrecord <record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_chongqing_primary_663363138</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><cqvip_id>663363138</cqvip_id><ieee_id>7019836</ieee_id><sourcerecordid>10_1109_CC_2014_7019836</sourcerecordid><originalsourceid>FETCH-LOGICAL-c288t-b8b647af27791011fec672039bae5dd399c5fcfcd1fe2182ac807c8e9eadf8e3</originalsourceid><addsrcrecordid>eNpFkL9PAjEYhjtoIkFmB5fG_aC9Hv0x6qFgApGYc3K4lPYrVI87bIkJ_vWWQPRbvuF93nd4ELqhZEgpUaOyHOaEFkNBqJKMX6Ae5YJl46IQV2gQ4wdJJzlnPO-h96UOummg8T9g8VuEgMvGm8-IX8F069bvfddiF7otXugY_TfgWVUt8UTvNX7QMZVSPoEdtBZac8DToHcbvOgsNNfo0ukmwuD8-6h6eqzKWTZ_mT6X9_PM5FLus5Vc8UJolwuhKKHUgeEiJ0ytNIytZUqZsTPO2JTkVObaSCKMBAXaOgmsj0anWRO6GAO4ehf8VodDTUl9NFKXZX00Up-NpMbtqeEB4I_-T-_Oe5uuXX_5dv2HJGkJoEyyX1W9ajo</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Parallelized User Clicks Recognition from Massive HTTP Data Based on Dependency Graph Model</title><source>IEEE Electronic Library (IEL)</source><creator>Fang, Cheng ; Liu, Jun ; Lei, Zhenming</creator><creatorcontrib>Fang, Cheng ; Liu, Jun ; Lei, Zhenming</creatorcontrib><description>With increasingly complex website structure and continuously advancing web technologies,accurate user clicks recognition from massive HTTP data,which is critical for web usage mining,becomes more difficult.In this paper,we propose a dependency graph model to describe the relationships between web requests.Based on this model,we design and implement a heuristic parallel algorithm to distinguish user clicks with the assistance of cloud computing technology.We evaluate the proposed algorithm with real massive data.The size of the dataset collected from a mobile core network is 228.7GB.It covers more than three million users.The experiment results demonstrate that the proposed algorithm can achieve higher accuracy than previous 
methods.</description><identifier>ISSN: 1673-5447</identifier><identifier>DOI: 10.1109/CC.2014.7019836</identifier><identifier>CODEN: CCHOBE</identifier><language>eng</language><publisher>China Institute of Communications</publisher><subject>Algorithm design and analysis ; Big data ; cloud computing ; Computational modeling ; Data mining ; Data models ; Data preprocessing ; graph model ; HTTP ; Internet ; massive data ; Parallel algorithms ; web usage mining ; Web使用挖掘 ; 图模型 ; 并行算法 ; 用户 ; 移动核心网络 ; 网站结构 ; 网络技术</subject><ispartof>China communications, 2014-12, Vol.11 (12), p.13-25</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c288t-b8b647af27791011fec672039bae5dd399c5fcfcd1fe2182ac807c8e9eadf8e3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://image.cqvip.com/vip1000/qk/89450X/89450X.jpg</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7019836$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7019836$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Fang, Cheng</creatorcontrib><creatorcontrib>Liu, Jun</creatorcontrib><creatorcontrib>Lei, Zhenming</creatorcontrib><title>Parallelized User Clicks Recognition from Massive HTTP Data Based on Dependency Graph Model</title><title>China communications</title><addtitle>ChinaComm</addtitle><addtitle>China Communications</addtitle><description>With increasingly complex website structure and continuously advancing web technologies,accurate user clicks recognition from massive HTTP data,which is critical for web usage mining,becomes more difficult.In this paper,we propose a dependency graph model to describe the relationships between web requests.Based on this model,we design and implement a heuristic 
parallel algorithm to distinguish user clicks with the assistance of cloud computing technology.We evaluate the proposed algorithm with real massive data.The size of the dataset collected from a mobile core network is 228.7GB.It covers more than three million users.The experiment results demonstrate that the proposed algorithm can achieve higher accuracy than previous methods.</description><subject>Algorithm design and analysis</subject><subject>Big data</subject><subject>cloud computing</subject><subject>Computational modeling</subject><subject>Data mining</subject><subject>Data models</subject><subject>Data preprocessing</subject><subject>graph model</subject><subject>HTTP</subject><subject>Internet</subject><subject>massive data</subject><subject>Parallel algorithms</subject><subject>web usage mining</subject><subject>Web使用挖掘</subject><subject>图模型</subject><subject>并行算法</subject><subject>用户</subject><subject>移动核心网络</subject><subject>网站结构</subject><subject>网络技术</subject><issn>1673-5447</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpFkL9PAjEYhjtoIkFmB5fG_aC9Hv0x6qFgApGYc3K4lPYrVI87bIkJ_vWWQPRbvuF93nd4ELqhZEgpUaOyHOaEFkNBqJKMX6Ae5YJl46IQV2gQ4wdJJzlnPO-h96UOummg8T9g8VuEgMvGm8-IX8F069bvfddiF7otXugY_TfgWVUt8UTvNX7QMZVSPoEdtBZac8DToHcbvOgsNNfo0ukmwuD8-6h6eqzKWTZ_mT6X9_PM5FLus5Vc8UJolwuhKKHUgeEiJ0ytNIytZUqZsTPO2JTkVObaSCKMBAXaOgmsj0anWRO6GAO4ehf8VodDTUl9NFKXZX00Up-NpMbtqeEB4I_-T-_Oe5uuXX_5dv2HJGkJoEyyX1W9ajo</recordid><startdate>20141201</startdate><enddate>20141201</enddate><creator>Fang, Cheng</creator><creator>Liu, Jun</creator><creator>Lei, Zhenming</creator><general>China Institute of 
Communications</general><scope>2RA</scope><scope>92L</scope><scope>CQIGP</scope><scope>W92</scope><scope>~WA</scope><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20141201</creationdate><title>Parallelized User Clicks Recognition from Massive HTTP Data Based on Dependency Graph Model</title><author>Fang, Cheng ; Liu, Jun ; Lei, Zhenming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c288t-b8b647af27791011fec672039bae5dd399c5fcfcd1fe2182ac807c8e9eadf8e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Algorithm design and analysis</topic><topic>Big data</topic><topic>cloud computing</topic><topic>Computational modeling</topic><topic>Data mining</topic><topic>Data models</topic><topic>Data preprocessing</topic><topic>graph model</topic><topic>HTTP</topic><topic>Internet</topic><topic>massive data</topic><topic>Parallel algorithms</topic><topic>web usage mining</topic><topic>Web使用挖掘</topic><topic>图模型</topic><topic>并行算法</topic><topic>用户</topic><topic>移动核心网络</topic><topic>网站结构</topic><topic>网络技术</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fang, Cheng</creatorcontrib><creatorcontrib>Liu, Jun</creatorcontrib><creatorcontrib>Lei, Zhenming</creatorcontrib><collection>中文科技期刊数据库</collection><collection>中文科技期刊数据库-CALIS站点</collection><collection>中文科技期刊数据库-7.0平台</collection><collection>中文科技期刊数据库-工程技术</collection><collection>中文科技期刊数据库- 镜像站点</collection><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>China communications</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Fang, Cheng</au><au>Liu, Jun</au><au>Lei, Zhenming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Parallelized User Clicks Recognition from Massive HTTP Data Based on Dependency Graph Model</atitle><jtitle>China communications</jtitle><stitle>ChinaComm</stitle><addtitle>China Communications</addtitle><date>2014-12-01</date><risdate>2014</risdate><volume>11</volume><issue>12</issue><spage>13</spage><epage>25</epage><pages>13-25</pages><issn>1673-5447</issn><coden>CCHOBE</coden><abstract>With increasingly complex website structure and continuously advancing web technologies,accurate user clicks recognition from massive HTTP data,which is critical for web usage mining,becomes more difficult.In this paper,we propose a dependency graph model to describe the relationships between web requests.Based on this model,we design and implement a heuristic parallel algorithm to distinguish user clicks with the assistance of cloud computing technology.We evaluate the proposed algorithm with real massive data.The size of the dataset collected from a mobile core network is 228.7GB.It covers more than three million users.The experiment results demonstrate that the proposed algorithm can achieve higher accuracy than previous methods.</abstract><pub>China Institute of Communications</pub><doi>10.1109/CC.2014.7019836</doi><tpages>13</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1673-5447
ispartof China communications, 2014-12, Vol.11 (12), p.13-25
issn 1673-5447
language eng
recordid cdi_chongqing_primary_663363138
source IEEE Electronic Library (IEL)
subjects Algorithm design and analysis
Big data
cloud computing
Computational modeling
Data mining
Data models
Data preprocessing
graph model
HTTP
Internet
massive data
Parallel algorithms
web usage mining
Web使用挖掘
图模型
并行算法
用户
移动核心网络
网站结构
网络技术
title Parallelized User Clicks Recognition from Massive HTTP Data Based on Dependency Graph Model
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T09%3A25%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Parallelized%20User%20Clicks%20Recognition%20from%20Massive%20HTTP%20Data%20Based%20on%20Dependency%20Graph%20Model&rft.jtitle=China%20communications&rft.au=Fang,%20Cheng&rft.date=2014-12-01&rft.volume=11&rft.issue=12&rft.spage=13&rft.epage=25&rft.pages=13-25&rft.issn=1673-5447&rft.coden=CCHOBE&rft_id=info:doi/10.1109/CC.2014.7019836&rft_dat=%3Ccrossref_RIE%3E10_1109_CC_2014_7019836%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_cqvip_id=663363138&rft_ieee_id=7019836&rfr_iscdi=true