Robust cross-domain text retrieval method under noise label

Bibliographic details
Main authors:	HU PENG, PENG XI, SUN YUAN, FENG YANGLIN, PENG DEZHONG
Format:	Patent
Language:	Chinese; English
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Summary:	The invention discloses a cross-domain text retrieval method that is robust to noisy labels, and belongs to the technical field of intelligent text retrieval. The method comprises the following steps: acquiring the data to be retrieved; building a deep cross-domain text retrieval model; and using that model to retrieve over the data to be retrieved, obtaining a retrieval result and completing the cross-domain text retrieval. The invention addresses a significant problem with existing deep learning methods for cross-domain text: during training they cannot distinguish the noisy labels that text data unavoidably carries from clean labels, so they eventually overfit to the noisy labels and the resulting cross-domain text retrieval performance drops sharply.