Method and system for processing and learning rules for extracting information from incoming web pages
Main Authors: | , , , , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | An example of a method includes determining features of a first type for a first web page of a plurality of web pages. The method also includes electronically determining a plurality of rules for an attribute of the first web page, wherein the plurality of rules is determined based on the features of the first type. The method also includes electronically identifying a first rule, from the plurality of rules, that satisfies a first predefined criterion. The first predefined criterion includes at least one of a first threshold for a precision parameter, a second threshold for a support parameter, a third threshold for a distance parameter, and a fourth threshold for a recall parameter. The method further includes storing the first rule to enable extraction of a value of the attribute from a second web page. |
---|
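
The rule-selection step in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Rule`, `Criterion`, `identify_rule`, `learn_rule`, and `rule_store` are hypothetical, as are the XPath-like rule expressions and all scores. The summary does not say in which direction each threshold applies, so the sketch assumes that precision, support, and recall must reach a minimum and that distance must stay below a maximum. A threshold left unset is ignored, which reflects the "at least one of" wording.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Rule:
    """A candidate extraction rule with its quality scores (all hypothetical)."""
    expression: str   # assumed to be an XPath-like locator
    precision: float
    support: int
    distance: float
    recall: float


@dataclass(frozen=True)
class Criterion:
    """Thresholds for the predefined criterion; None means 'not used'."""
    min_precision: Optional[float] = None
    min_support: Optional[int] = None
    max_distance: Optional[float] = None  # assumption: lower distance is better
    min_recall: Optional[float] = None

    def is_satisfied_by(self, rule: Rule) -> bool:
        # Each check passes automatically when its threshold is unset.
        return all([
            self.min_precision is None or rule.precision >= self.min_precision,
            self.min_support is None or rule.support >= self.min_support,
            self.max_distance is None or rule.distance <= self.max_distance,
            self.min_recall is None or rule.recall >= self.min_recall,
        ])


def identify_rule(rules: Iterable[Rule], criterion: Criterion) -> Optional[Rule]:
    """Return the first candidate rule that satisfies the criterion, if any."""
    for rule in rules:
        if criterion.is_satisfied_by(rule):
            return rule
    return None


# Stored rules, keyed by attribute name, for later use on other pages.
rule_store: Dict[str, Rule] = {}


def learn_rule(attribute: str, candidates: Iterable[Rule],
               criterion: Criterion) -> Optional[Rule]:
    """Identify a qualifying rule for the attribute and store it."""
    rule = identify_rule(candidates, criterion)
    if rule is not None:
        rule_store[attribute] = rule
    return rule


# Usage with made-up candidates for a hypothetical "price" attribute.
candidates = [
    Rule("//span[@class='price']", precision=0.80, support=10, distance=1.0, recall=0.9),
    Rule("//div[@id='price']", precision=0.95, support=7, distance=2.0, recall=0.6),
]
selected = learn_rule("price", candidates, Criterion(min_precision=0.9, min_support=5))
```

Here only the second candidate meets both thresholds, so it is the one stored under `"price"`. Applying the stored rule to a second web page would need an HTML parser and a rule language, neither of which the summary specifies.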