IDENTIFYING PRODUCT REFERENCES IN USER-GENERATED CONTENT
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Summary: | Systems and methods are disclosed for extracting products referenced in a document. The document is analyzed to identify a product type that it references, and attributes are extracted from the document. A set of candidate products corresponding to the extracted attributes is identified. A score is calculated for the candidate products, and the candidates are then selected or filtered based on the score, whitelist rules, and blacklist rules to identify one or more inferred products referenced by the document. The whitelist and blacklist rules may take as inputs a domain, a user identifier, and keywords included in the document. A set of sufficient attributes may be identified for each product type, and selection of a candidate product may be based at least in part on the document including all of the attributes in that set. |
---|
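The pipeline described in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the catalog, the keyword lists, the threshold, the scoring formula (fraction of a product's attributes found in the document), and every name below are hypothetical assumptions chosen only to make the steps concrete. In particular, how whitelist rules interact with the sufficient-attribute check is not stated in the summary; here a whitelisted user is assumed to bypass that check.

```python
import re
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    product_type: str
    attributes: dict  # attribute kind -> value, e.g. {"brand": "acme"}


@dataclass
class Document:
    text: str
    domain: str
    user_id: str


# All data below is hypothetical and exists only for illustration.
CATALOG = [
    Product("Acme X100", "camera", {"brand": "acme", "model": "x100", "color": "black"}),
    Product("Acme X200", "camera", {"brand": "acme", "model": "x200", "color": "silver"}),
]
TYPE_KEYWORDS = {"camera": {"camera", "lens", "megapixel"}}
SUFFICIENT_ATTRIBUTES = {"camera": {"brand", "model"}}
BLACKLIST_DOMAINS = {"spam.example"}
BLACKLIST_KEYWORDS = {"giveaway"}
WHITELIST_USERS = {"trusted_reviewer"}
SCORE_THRESHOLD = 0.5  # assumed cut-off, not taken from the source


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def identify_product_type(tokens):
    """Return the first product type whose keywords appear in the tokens."""
    for product_type, keywords in TYPE_KEYWORDS.items():
        if tokens & keywords:
            return product_type
    return None


def extract_attributes(tokens, product_type):
    """Collect known attribute values of this type that the document mentions."""
    found = {}
    for product in CATALOG:
        if product.product_type != product_type:
            continue
        for kind, value in product.attributes.items():
            if value in tokens:
                found.setdefault(kind, set()).add(value)
    return found


def infer_products(doc):
    """Return (name, score) pairs for products the document likely references."""
    tokens = tokenize(doc.text)
    # Blacklist rules use the domain and the document's keywords.
    if doc.domain in BLACKLIST_DOMAINS or tokens & BLACKLIST_KEYWORDS:
        return []
    product_type = identify_product_type(tokens)
    if product_type is None:
        return []
    extracted = extract_attributes(tokens, product_type)
    whitelisted = doc.user_id in WHITELIST_USERS  # whitelist rule uses the user id
    results = []
    for product in CATALOG:
        if product.product_type != product_type:
            continue
        matched = {k for k, v in product.attributes.items()
                   if v in extracted.get(k, set())}
        if not matched:
            continue  # shares no extracted attribute, so not a candidate
        score = len(matched) / len(product.attributes)
        has_sufficient = SUFFICIENT_ATTRIBUTES[product_type] <= matched
        if score >= SCORE_THRESHOLD and (has_sufficient or whitelisted):
            results.append((product.name, score))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `infer_products(Document("I love my Acme X100 camera", "forum.example", "u1"))` runs every stage in order: type detection, attribute extraction, candidate scoring, and rule-based filtering. Keeping the rules as plain sets keeps the sketch short; a real system would likely express them as configurable predicates over the domain, user identifier, and keywords.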