COMMODITY INFORMATION EXTRACTION RULE GENERATING METHOD, APPARATUS AND PROGRAM

Bibliographic details
Main authors: UCHIYAMA MASASHI, SHIOBARA TOSHIKO, TANAKA AKIMICHI, IIMURA YUKAKO
Format: Patent
Language: English
Description
Summary: PROBLEM TO BE SOLVED: To provide a commodity information extraction rule generating method and apparatus capable of generating a commodity information extraction rule at low maintenance cost, without requiring manual creation of rules or learning data. SOLUTION: An inter-page common points identifying section 25 identifies inter-page common points, that is, points at which an identical character string appears at the same location in at least a predetermined proportion of the commodity detailed information pages included in a commodity detailed information page group. For each commodity attribute, a commodity attribute value extraction point determination section 27 compares the inter-page common points identified in and around a commodity attribute value extraction candidate with the commodity attribute characteristic of that attribute, to determine whether the candidate is the extraction point of the attribute value of that commodity attribute. A commodity information extraction rule generating section 28 generates, for each commodity attribute, a pair consisting of the determined attribute value extraction point and the commodity attribute name as a commodity information extraction rule.
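
The three sections named in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation; several details are assumptions made only for illustration: each page is modeled as an ordered list of (location path, text) pairs, the "periphery" of a candidate is taken to be the immediately preceding location, the "commodity attribute characteristic" is reduced to a set of label strings per attribute, and the names `find_common_points`, `generate_rules`, `attribute_labels` and `min_rate` are hypothetical.

```python
from collections import Counter


def find_common_points(pages, min_rate=0.8):
    """Return {path: text} for locations where one identical string
    appears in at least min_rate of the pages (assumes min_rate > 0.5)."""
    counts = Counter()
    for page in pages:
        # dict() collapses repeated paths so each page votes once per path.
        for path, text in dict(page).items():
            counts[(path, text)] += 1
    return {
        path: text
        for (path, text), n in counts.items()
        if n / len(pages) >= min_rate
    }


def generate_rules(pages, attribute_labels, min_rate=0.8):
    """Return {attribute name: extraction path}.

    A location is treated as an extraction candidate when its text varies
    across pages; it becomes an extraction point when the preceding
    location is an inter-page common point whose text matches one of the
    labels assumed to characterize the attribute.
    """
    common = find_common_points(pages, min_rate)
    rules = {}
    for page in pages:
        for i in range(1, len(page)):
            path, _ = page[i]
            prev_path, prev_text = page[i - 1]
            if path in common:
                continue  # fixed text shared across pages, not a value
            if common.get(prev_path) != prev_text:
                continue  # neighbor is not an inter-page common point
            label = prev_text.strip().rstrip(":")
            for attribute, labels in attribute_labels.items():
                if label in labels:
                    rules.setdefault(attribute, path)
    return rules


# Hypothetical page group: three detail pages sharing one table layout.
pages = [
    [("/table/tr[1]/th", "Price"), ("/table/tr[1]/td", "1,200 yen"),
     ("/table/tr[2]/th", "Maker"), ("/table/tr[2]/td", "Acme")],
    [("/table/tr[1]/th", "Price"), ("/table/tr[1]/td", "980 yen"),
     ("/table/tr[2]/th", "Maker"), ("/table/tr[2]/td", "Globex")],
    [("/table/tr[1]/th", "Price"), ("/table/tr[1]/td", "3,000 yen"),
     ("/table/tr[2]/th", "Maker"), ("/table/tr[2]/td", "Initech")],
]
attribute_labels = {
    "price": {"Price"},
    "manufacturer": {"Maker", "Manufacturer"},
}
rules = generate_rules(pages, attribute_labels)
print(rules)
```

In this toy page group the header cells are identical on every page and so count as inter-page common points, while the data cells vary and become extraction candidates. Each generated rule pairs an attribute name with the location of its value, mirroring the pair described for section 28.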