Automated classification of HTML forms on ecommerce web sites

Purpose: Most ecommerce web sites use HTML forms for user authentication, new user registration, newsletter subscription, and searching for products and services. The purpose of this paper is to present a method for automated classification of HTML forms, which is important for search engine applicat...

Bibliographic details
Published in: Online information review 2007-08, Vol.31 (4), p.451-466
Main authors: Ru, Yanbo; Horowitz, Ellis
Format: Article
Language: eng
Online access: Full text
container_end_page 466
container_issue 4
container_start_page 451
container_title Online information review
container_volume 31
creator Ru, Yanbo
Horowitz, Ellis
description Purpose: Most ecommerce web sites use HTML forms for user authentication, new user registration, newsletter subscription, and searching for products and services. The purpose of this paper is to present a method for automated classification of HTML forms, which is important for search engine applications, e.g. Yahoo Shopping and Google's Froogle, as they can be used to improve the quality of the index and accuracy of search results. Design/methodology/approach: Describes a technique for classifying HTML forms based on their features. Develops algorithms for automatic feature generation of HTML forms and a neural network to classify them. Findings: The authors tested their classifier on an ecommerce data set and a randomly retrieved data set and achieved accuracy of 94.7 and 93.9 per cent respectively. Experimental results show that the classifier is effective and efficient on both test beds, suggesting that it is a promising general purpose method. Originality/value: The paper is of value to those involved with information management and ecommerce.
doi_str_mv 10.1108/14684520710780412
format Article
publisher Emerald Group Publishing Limited
creationdate 2007-08-14
toplevel peer_reviewed; online_resources
fulltext fulltext
identifier ISSN: 1468-4527
ispartof Online information review, 2007-08, Vol.31 (4), p.451-466
issn 1468-4527
language eng
recordid cdi_istex_primary_ark_67375_4W2_07T241V0_V
source Emerald Journals; Standard: Emerald eJournal Premier Collection
subjects Classification
Electronic commerce
Learning
Worldwide web
title Automated classification of HTML forms on ecommerce web sites
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T18%3A31%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-istex&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automated%20classification%20of%20HTML%20forms%20on%20ecommerce%20web%20sites&rft.jtitle=Online%20information%20review&rft.au=Ru,%20Yanbo&rft.date=2007-08-14&rft.volume=31&rft.issue=4&rft.spage=451&rft.epage=466&rft.pages=451-466&rft.issn=1468-4527&rft_id=info:doi/10.1108/14684520710780412&rft_dat=%3Cistex%3Eark_67375_4W2_07T241V0_V%3C/istex%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
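
The description mentions automatic feature generation for HTML forms followed by a neural network classifier, but this record does not list the features the authors used. The Python sketch below is only an illustration of the general idea under stated assumptions: the feature set (counts of common form controls) and the names `FEATURES`, `FormFeatureParser`, and `form_feature_vectors` are hypothetical and are not taken from the paper. It turns each form on a page into a numeric vector that a classifier could consume; the classifier itself is not shown.

```python
from html.parser import HTMLParser

# Hypothetical feature names; the record does not list the paper's actual features.
FEATURES = ("text", "password", "checkbox", "radio", "submit", "select", "textarea")


class FormFeatureParser(HTMLParser):
    """Collects one feature-count dict per <form> element."""

    def __init__(self):
        super().__init__()
        self.forms = []       # finished feature dicts, one per form
        self._current = None  # counts for the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self._current = dict.fromkeys(FEATURES, 0)
            return
        if self._current is None:
            return  # ignore controls outside any form
        if tag == "input":
            # An input without a type attribute defaults to a text field.
            kind = (dict(attrs).get("type") or "text").lower()
            if kind in self._current:
                self._current[kind] += 1
        elif tag in ("select", "textarea"):
            self._current[tag] += 1

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def form_feature_vectors(html):
    """Return a list of count vectors (ordered as FEATURES), one per form."""
    parser = FormFeatureParser()
    parser.feed(html)
    parser.close()
    return [[form[name] for name in FEATURES] for form in parser.forms]


login_page = """
<form action="/login">
  <input type="text" name="user">
  <input type="password" name="pw">
  <input type="submit" value="Sign in">
</form>
"""
print(form_feature_vectors(login_page))  # [[1, 1, 0, 0, 1, 0, 0]]
```

A login form tends to produce a vector with one password field and few other controls, while a search form is typically a single text field plus a submit button. Differences of this kind are what a trained classifier could learn to separate, though the real system may rely on quite different signals.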