Prompt learning-based semi-structured webpage attribute value extraction method and system
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Summary: | The invention discloses a method and system for extracting attribute values from semi-structured webpages based on prompt learning, and relates to the Internet field. First, a DOM-tree-view prompt for each variable node is retrieved using a DOM tree simplification algorithm. Then a task template containing a task description is designed, from which the template-view prompt information is extracted. Finally, a pre-trained language model with an encoder-decoder structure is introduced: with the "prompt" as the core operation, the characteristics of the domain data and of the target task are analyzed together, prompt information is designed for the two views, and the dual-view prompt information is filled into the template and fused to obtain the target object. Through prompt learning, the pre-trained language model is jointly guided to learn the task at both the semantic level and the task level, so that effective combination of the pre-tra |
---|
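The summary names two prompt views (a DOM-tree view and a task-template view) that are fused by filling a template. The following is only a minimal sketch of that idea, not the patented method: the simplification rule (drop `html`/`body`, keep the last few ancestor tags), the template wording, and all names such as `PathCollector`, `dom_view_prompt`, and `build_prompt` are assumptions made for illustration. The record does not describe the actual simplification algorithm or template, and the encoder-decoder model that would consume the prompt is left out.

```python
from html.parser import HTMLParser

# Tags that never receive a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

# Hypothetical task template (template view); the wording is an assumption.
TEMPLATE = (
    "Task: extract the value of the attribute '{attribute}'. "
    "DOM path: {dom}. Node text: {text}. Value:"
)


class PathCollector(HTMLParser):
    """Record the chain of ancestor tags for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.nodes = []   # list of (ancestor_path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append((list(self.stack), text))


def dom_view_prompt(path, keep=3):
    """Assumed simplification: drop page-level wrappers, keep the last `keep` tags."""
    kept = [tag for tag in path if tag not in {"html", "body"}]
    return " > ".join(kept[-keep:])


def build_prompt(attribute, path, text):
    """Fuse the DOM view and the template view by filling the template."""
    return TEMPLATE.format(attribute=attribute, dom=dom_view_prompt(path), text=text)


def node_prompts(html, attribute):
    """Build one fused prompt per text node of the page."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return [build_prompt(attribute, path, text) for path, text in collector.nodes]
```

In a full pipeline, each fused prompt string would presumably be passed to the pre-trained encoder-decoder model, which would generate the attribute value; that step depends on details the truncated summary does not give.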