Archive knowledge discovery by proxy cache

Bibliographic Details
Published in:	Internet research 2004-01, Vol.14 (1), p.34-47
Main Authors:	Yu, Hsiang-Fu, Chen, Yi-Ming, Tseng, Li-Ming
Format:	Article
Language:	English
Subjects:	Archives; Archives & records; Cache; Computer Networks; Digital archives; Freeware; Information Retrieval; Information Sources; Information systems; Internet; Links; Proxy cache servers; Robots; Search engines; Searching; Servers; Software; Software packages; Studies; Worldwide web
Online Access:	Full text

Description
Abstract:	An archive is a file containing several related files. Many Internet resources, such as freeware, shareware and trial software, are often packaged into archives for easy installation and transfer. Additionally, thousands of users search for archives and download them from different sources every day. This paper extends previous research on archive downloading via proxy cache to support archive searching. Internet proxy cache servers are used to gather a significant number of Web pages, detect those that contain archive links, and then use the obtained data to search archives by description or filename. Two schemes, iterative and backtracking, are proposed to obtain Web pages with archive links. The experimental results indicate that both schemes achieve about the same precision; however, the backtracking scheme reduces the number of checked pages by a factor of 26. Finally, a real system was implemented to demonstrate the proposed approaches.
ISSN:	1066-2243, 2054-5657
DOI:	10.1108/10662240410516309