COMMODITY INFORMATION EXTRACTION RULE GENERATING METHOD, APPARATUS AND PROGRAM
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Summary: | PROBLEM TO BE SOLVED: To provide a commodity information extraction rule generating method and apparatus capable of generating commodity information extraction rules at low maintenance cost, without requiring manual creation of rules or learning data. SOLUTION: An inter-page common point identifying section 25 identifies inter-page common points, that is, points at which an identical character string appears at the same position in at least a predetermined proportion of the commodity detailed information pages included in a commodity detailed information page group. A commodity attribute value extraction point determination section 27 compares, for each commodity attribute, the inter-page common points identified in and around a commodity attribute value extraction candidate with the commodity attribute characteristic of that attribute, to determine whether the candidate is the extraction point for the attribute value of that commodity attribute. A commodity information extraction rule generating section 28 generates, for each commodity attribute, a pair consisting of the determined attribute value extraction point and the commodity attribute name as a commodity information extraction rule. |
---|
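
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it makes several assumptions that the record does not state: each page is an ordered list of hypothetical `(location, text)` pairs with XPath-like locations, the "periphery" of a candidate is only the point immediately before it, and a commodity attribute characteristic is a set of label keywords plus a value pattern. The names `find_common_points`, `generate_rules`, `attributes` and `min_rate` are invented for illustration.

```python
import re
from collections import Counter


def find_common_points(pages, min_rate=0.8):
    """Return {location: text} for locations where one identical string
    appears in at least min_rate of the pages (inter-page common points)."""
    common = {}
    locations = {loc for page in pages for loc, _ in page}
    for loc in locations:
        counts = Counter(text for page in pages for l, text in page if l == loc)
        text, n = counts.most_common(1)[0]
        if n / len(pages) >= min_rate:
            common[loc] = text
    return common


def generate_rules(pages, attributes, min_rate=0.8):
    """Return {attribute name: extraction location}.

    Assumption: all pages share the layout of the first page, and the
    label of a value is the common point directly preceding it.
    """
    common = find_common_points(pages, min_rate)
    order = [loc for loc, _ in pages[0]]
    rules = {}
    for i, loc in enumerate(order):
        if i == 0 or loc in common:
            continue  # fixed template text is not a value candidate
        label = common.get(order[i - 1])
        if label is None:
            continue  # no common point in the assumed periphery
        values = [text for page in pages for l, text in page if l == loc]
        for name, (keywords, pattern) in attributes.items():
            if name in rules:
                continue
            label_ok = any(k in label.lower() for k in keywords)
            values_ok = all(re.fullmatch(pattern, v) for v in values)
            if label_ok and values_ok:
                rules[name] = loc
    return rules


# Hypothetical page group: three detail pages built from one template.
pages = [
    [("/table/tr[1]/th", "Price"), ("/table/tr[1]/td", "$10.00"),
     ("/table/tr[2]/th", "Maker"), ("/table/tr[2]/td", "Acme")],
    [("/table/tr[1]/th", "Price"), ("/table/tr[1]/td", "$25.50"),
     ("/table/tr[2]/th", "Maker"), ("/table/tr[2]/td", "Globex")],
    [("/table/tr[1]/th", "Price"), ("/table/tr[1]/td", "$7.99"),
     ("/table/tr[2]/th", "Maker"), ("/table/tr[2]/td", "Initech")],
]

# Hypothetical attribute characteristics: (label keywords, value pattern).
attributes = {
    "price": (("price",), r"\$\d+(\.\d{2})?"),
    "manufacturer": (("maker", "manufacturer"), r".+"),
}

rules = generate_rules(pages, attributes)
print(rules)
```

In this sketch the resulting rule is simply a mapping from attribute name to location. The point of the design is that no hand-written rule or labelled training example is supplied per site; only the attribute characteristics are given, and these are assumed to be reusable across sites.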