Automated classification of HTML forms on ecommerce web sites
Purpose: Most ecommerce web sites use HTML forms for user authentication, new user registration, newsletter subscription, and searching for products and services. The purpose of this paper is to present a method for automated classification of HTML forms, which is important for search engine applicat...
Published in: | Online information review 2007-08, Vol.31 (4), p.451-466 |
---|---|
Main authors: | Ru, Yanbo; Horowitz, Ellis |
Format: | Article |
Language: | eng |
Subjects: | Classification; Electronic commerce; Learning; Worldwide web |
Online access: | Full text |
container_end_page | 466 |
---|---|
container_issue | 4 |
container_start_page | 451 |
container_title | Online information review |
container_volume | 31 |
creator | Ru, Yanbo; Horowitz, Ellis |
description | Purpose: Most ecommerce web sites use HTML forms for user authentication, new user registration, newsletter subscription, and searching for products and services. This paper presents a method for automated classification of HTML forms, which matters for search engine applications such as Yahoo Shopping and Google's Froogle, where it can improve the quality of the index and the accuracy of search results. Design/methodology/approach: Describes a technique for classifying HTML forms based on their features. Develops algorithms for automatic feature generation from HTML forms and a neural network to classify them. Findings: The authors tested their classifier on an ecommerce data set and a randomly retrieved data set and achieved accuracies of 94.7 and 93.9 per cent respectively. Experimental results show that the classifier is effective and efficient on both test beds, suggesting that it is a promising general-purpose method. Originality/value: The paper is of value to those involved with information management and ecommerce. |
doi_str_mv | 10.1108/14684520710780412 |
format | Article |
publisher | Emerald Group Publishing Limited |
date | 2007-08-14 |
peer_reviewed | true |
fulltext | fulltext |
identifier | ISSN: 1468-4527 |
ispartof | Online information review, 2007-08, Vol.31 (4), p.451-466 |
issn | 1468-4527 |
language | eng |
recordid | cdi_istex_primary_ark_67375_4W2_07T241V0_V |
source | Emerald Journals; Standard: Emerald eJournal Premier Collection |
subjects | Classification; Electronic commerce; Learning; Worldwide web |
title | Automated classification of HTML forms on ecommerce web sites |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T18%3A31%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-istex&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automated%20classification%20of%20HTML%20forms%20on%20ecommerce%20web%20sites&rft.jtitle=Online%20information%20review&rft.au=Ru,%20Yanbo&rft.date=2007-08-14&rft.volume=31&rft.issue=4&rft.spage=451&rft.epage=466&rft.pages=451-466&rft.issn=1468-4527&rft_id=info:doi/10.1108/14684520710780412&rft_dat=%3Cistex%3Eark_67375_4W2_07T241V0_V%3C/istex%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
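
The abstract describes generating features from HTML forms automatically and passing them to a neural network. The record does not list the paper's actual features, so the following is only a minimal sketch of what a feature-generation step could look like, using Python's standard `html.parser`. The feature names in `FEATURES`, the class `FormFeatureExtractor`, and the helpers `extract_form_features` and `to_vector` are all assumptions made for illustration, not the authors' method.

```python
from html.parser import HTMLParser

# Hypothetical feature set: counts of form controls by kind.
# The paper's real features are not given in this record.
FEATURES = ("text", "password", "email", "checkbox",
            "radio", "submit", "select", "textarea")


class FormFeatureExtractor(HTMLParser):
    """Collect one feature dict per closed <form> element."""

    def __init__(self):
        super().__init__()
        self.forms = []       # finished feature dicts, one per form
        self._current = None  # features of the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self._current = dict.fromkeys(FEATURES, 0)
            return
        if self._current is None:
            return  # ignore controls that sit outside any form
        if tag == "input":
            # An <input> without a type attribute behaves as a text field.
            kind = (dict(attrs).get("type") or "text").lower()
            if kind in self._current:
                self._current[kind] += 1
        elif tag in ("select", "textarea"):
            self._current[tag] += 1

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def extract_form_features(html):
    """Return a list of feature dicts, one per closed form in `html`."""
    parser = FormFeatureExtractor()
    parser.feed(html)
    parser.close()
    return parser.forms


def to_vector(features):
    """Order the counts consistently so a classifier can consume them."""
    return [features[name] for name in FEATURES]


if __name__ == "__main__":
    login = ('<form action="/login">'
             '<input type="text" name="user">'
             '<input type="password" name="pw">'
             '<input type="submit" value="Sign in">'
             '</form>')
    for form in extract_form_features(login):
        print(to_vector(form))
```

A vector such as this could then be fed to any classifier. A form with a password field and few other controls would plausibly indicate a login form, but which features actually separate the classes is an empirical question that the paper, not this sketch, addresses.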