State extrapolation for automated and semi-automated crawling architecture

Bibliographic details
Main authors: Sarpasayanam, Vasanthakumar; Desineni, Kalyan; Sankaranarasimhan, Manikandan
Format: Patent
Language: English
Description
Abstract: A system for automated acquisition of content from an application includes a link extraction controller that receives an identification of a target state of the application directly reachable from an intermediate state, together with a specification of the user interface element of the intermediate state that a user actuates to arrive at the target state. After navigating to the intermediate state in an executing instance of the application and extracting a tree of user interface widgets, the link extraction controller identifies widget sub-trees that have at least a threshold level of commonality with a reference widget sub-tree that includes the specified user interface element. The link extraction controller adds the states reachable by user actuation of the identified widget sub-trees, including the target state, to a state list. A scraper module extracts text and metadata from each of the states in the state list for storage in a data store.
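The sub-tree matching step described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the Widget class, the target_state attribute (a stand-in for actually actuating the widget in a running application), the helper names, and in particular the commonality metric, which here is a multiset Jaccard similarity over root-to-node widget class paths. The abstract only requires "at least a threshold level of commonality" and does not say how it is measured.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Widget:
    """Hypothetical UI widget node: a class name plus child widgets."""
    kind: str
    children: List["Widget"] = field(default_factory=list)
    # Assumption: the state reached by actuating this widget is already known.
    target_state: Optional[str] = None


def shape(widget: Widget, prefix: str = "") -> Counter:
    """Multiset of root-to-node class paths, used as a structural fingerprint."""
    path = f"{prefix}/{widget.kind}"
    counts = Counter([path])
    for child in widget.children:
        counts += shape(child, path)
    return counts


def commonality(a: Widget, b: Widget) -> float:
    """Multiset Jaccard similarity of two fingerprints (an assumed metric)."""
    sa, sb = shape(a), shape(b)
    union = sum((sa | sb).values())
    return sum((sa & sb).values()) / union if union else 0.0


def iter_subtrees(root: Widget) -> Iterator[Widget]:
    """Yield every sub-tree of the widget tree in pre-order."""
    yield root
    for child in root.children:
        yield from iter_subtrees(child)


def similar_subtrees(root: Widget, reference: Widget, threshold: float) -> List[Widget]:
    """Sub-trees whose commonality with the reference meets the threshold."""
    return [w for w in iter_subtrees(root) if commonality(w, reference) >= threshold]


def make_row(state: str, *extra: Widget) -> Widget:
    """Build a list row: an image, a text label, and optional extra widgets."""
    return Widget("Row", [Widget("Image"), Widget("Text"), *extra], target_state=state)


# The reference sub-tree contains the element the user actuated to reach "detail/1".
reference = make_row("detail/1")
screen = Widget("List", [
    Widget("Header", [Widget("Text")]),
    reference,
    make_row("detail/2"),
    make_row("detail/3", Widget("Badge")),  # one extra child, still similar enough
])

state_list = [w.target_state for w in similar_subtrees(screen, reference, threshold=0.7)]
print(state_list)  # ['detail/1', 'detail/2', 'detail/3']
```

With a threshold of 0.7, the row carrying an extra badge still matches (three of its four class paths are shared with the reference, giving 0.75), while the header and the individual leaf widgets do not. A scraper would then visit each entry in state_list; that step is omitted here because the abstract gives no detail on it.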