Cross-modal video retrieval method and system based on multi-granularity knowledge distillation

Bibliographic details
Main authors:	TIAN KAIBIN, LI XIRONG, ZHAO RUIXIANG
Format:	Patent
Language:	chi ; eng
Subjects:	CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS

Description
Abstract:	The invention relates to a cross-modal video retrieval method and system based on multi-granularity knowledge distillation. The method comprises: determining a text to be queried; and inputting the text to be queried into a student model into which the videos have been input in advance, which outputs a number of corresponding videos. The student model is trained by a teacher model using a multi-granularity teaching training algorithm. The invention narrows the gap in retrieval accuracy between the student model and the teacher model while retaining the student model's relatively low computation and storage overhead, and can be widely applied in the field of cross-modal video retrieval.
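
The retrieval step in the abstract (a text query goes in, a list of corresponding videos comes out) can be illustrated with a minimal sketch. The abstract does not say how the student model encodes or scores text and video, so everything here is an assumption for illustration only: the fixed-length embedding vectors, the cosine-similarity scoring, and the hypothetical names `cosine`, `retrieve`, `video_index`, and `top_k`. The distillation training itself is not shown, since the record gives no detail about it.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed scoring function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, video_index, top_k=2):
    """Rank pre-encoded videos by similarity to the query and return the top_k ids."""
    scored = [(cosine(query_vec, vec), vid) for vid, vec in video_index.items()]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [vid for _, vid in scored[:top_k]]


# Hypothetical embeddings: in a real system these would come from the
# student model's video encoder, computed once ahead of query time.
video_index = {
    "video_a": [1.0, 0.0],
    "video_b": [0.0, 1.0],
    "video_c": [0.7, 0.7],
}

# Hypothetical embedding of the text to be queried.
query_vec = [0.9, 0.1]

print(retrieve(query_vec, video_index))
```

Encoding the videos ahead of time is one way to read "videos have been input in advance": only the text query needs to be encoded at search time, which fits the low computation overhead the abstract claims for the student model.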