Adapting content repositories for crawling and serving

Bibliographic details
Main authors: ILES BRANDON PLAYER, ANDERSON ERIC JON, FELTON JOHN, OPALINSKI PAWEL
Format: Patent
Language: English
Description
Summary: A system for searching files stored in a closed file source that is not accessible via a web crawler obtains file identifiers for files stored in the file source and creates a unique URL for each of the identifiers. Each URL may be based on a file identifier and a domain portion of a URL associated with the system. The system may provide the unique URLs to a search engine. The system may respond to a crawl request from the search engine for a particular URL by converting the URL back into a file identifier, obtaining the contents of the file, creating an HTTP response from the contents of the file, and returning the response to the search engine. The system may respond to a request for a seed URL with a plurality of URLs as links in a single HTTP response.
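
Illustrative sketch: the round trip described in the summary (file identifier to URL, URL back to file identifier, and a seed page listing all URLs as links) could look roughly like the following Python. This is not the patented implementation. The domain `files.example.com`, the `/doc/` and `/seed` paths, the in-memory `store` dictionary, and all function names are assumptions made for illustration only.

```python
from html import escape
from urllib.parse import quote, unquote, urlsplit

# Hypothetical domain associated with the adapter system.
BASE = "https://files.example.com"
DOC_PREFIX = "/doc/"   # assumed path prefix for per-file URLs
SEED_PATH = "/seed"    # assumed path of the seed URL


def id_to_url(file_id: str) -> str:
    """Build a unique URL from a file identifier and the system's domain."""
    # Percent-encode everything, including "/", so the mapping is reversible.
    return f"{BASE}{DOC_PREFIX}{quote(file_id, safe='')}"


def url_to_id(url: str) -> str:
    """Convert a crawlable URL back into the original file identifier."""
    path = urlsplit(url).path
    if not path.startswith(DOC_PREFIX):
        raise ValueError(f"not a document URL: {url}")
    return unquote(path[len(DOC_PREFIX):])


def seed_page(file_ids) -> str:
    """Render every file URL as a link in a single HTML document."""
    links = "\n".join(
        f'<a href="{escape(id_to_url(fid))}">{escape(fid)}</a>'
        for fid in file_ids
    )
    return f"<html><body>\n{links}\n</body></html>"


def handle_crawl(url: str, store: dict):
    """Answer a crawl request with a (status, headers, body) triple.

    `store` stands in for the closed file source: it maps file
    identifiers to file contents as bytes.
    """
    if urlsplit(url).path == SEED_PATH:
        body = seed_page(sorted(store)).encode("utf-8")
        return 200, {"Content-Type": "text/html; charset=utf-8"}, body
    try:
        content = store[url_to_id(url)]
    except (ValueError, KeyError):
        return 404, {"Content-Type": "text/plain"}, b"not found"
    return 200, {"Content-Type": "application/octet-stream"}, content
```

In this sketch a search engine would be given `BASE + SEED_PATH` as its starting point, follow the links on the seed page, and receive each file's contents through `handle_crawl`. Percent-encoding the whole identifier is one simple way to keep the mapping reversible; a real system might use a different encoding.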