SEMI-STRUCTURED WEBPAGE ATTRIBUTE VALUE EXTRACTION METHOD BASED ON PROMPT LEARNING, AND ELECTRONIC DEVICE AND STORAGE MEDIUM

Bibliographic Details
Main Authors:	LI, Baoke; CAO, Yanan; LIU, Yanbing; FENG, Jiali; YUAN, Fangfang; LU, Yuhai; CAO, Cong
Format:	Patent
Language:	Chinese; English; French
Subjects:	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Summary:	The present invention relates to the field of the Internet. Disclosed are a semi-structured webpage attribute value extraction method based on prompt learning, an electronic device, and a storage medium. The method comprises: first, searching for a DOM-tree-perspective prompt of a variable node according to a DOM tree simplification algorithm; then, designing a task template that includes a task description, so as to obtain template-perspective prompt information; and finally, introducing a pre-trained language model based on an encoder-decoder structure, and using the "prompt" as a core operation to comprehensively analyze the characteristics of the domain data and of the target task. Prompt information is designed from two perspectives and fused by means of template filling; by means of prompt learning, the pre-trained language model is jointly guided at a semantic level and a task level to perform task learning, and thus the effective combinati
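
The fusion of the two prompt perspectives by template filling can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `TextNodeCollector`, `simplify_path`, `TEMPLATE` and `build_prompts` are hypothetical, and keeping only the last few tags of a node's path is a simple stand-in for the DOM tree simplification algorithm, whose details the record does not give. The resulting strings would then be passed to an encoder-decoder language model, which is not shown here.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TextNodeCollector(HTMLParser):
    """Collect every non-empty text node together with its tag path."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, root first
        self.nodes = []   # list of (path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append(("/".join(self.stack), text))


def simplify_path(path, keep=3):
    """Stand-in simplification (assumption): keep the last `keep` tags."""
    return "/".join(path.split("/")[-keep:])


# Hypothetical task template: a task description plus two slots to fill.
TEMPLATE = (
    "Task: extract the value of the attribute '{attribute}'. "
    "DOM context: {dom_prompt}. Node text: {text}. Answer:"
)


def build_prompts(html, attribute):
    """Fill the template once per text node of the page."""
    parser = TextNodeCollector()
    parser.feed(html)
    parser.close()
    return [
        TEMPLATE.format(attribute=attribute,
                        dom_prompt=simplify_path(path),
                        text=text)
        for path, text in parser.nodes
    ]


if __name__ == "__main__":
    page = ("<html><body><div><h1>Inception</h1>"
            "<span>Director: Nolan</span></div></body></html>")
    for prompt in build_prompts(page, "director"):
        print(prompt)
```

Each prompt carries the task description (template perspective) and a shortened tag path for the candidate node (DOM-tree perspective), so a single input string presents both kinds of information to the model.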