Target positioning method and related equipment thereof

Bibliographic details
Main authors: WANG JIN, HUANG BO, CHI ZIQIU, LI YIFEI
Format: Patent
Language: Chinese; English
Description
Abstract: The invention discloses a target positioning method and related equipment. The method comprises: obtaining a to-be-processed image and first text prompt information, where the first text prompt information describes a target to be retrieved from the to-be-processed image; and, based on the to-be-processed image and the first text prompt information, locating a bounding box of the target in the to-be-processed image with a preset multi-modal large model to obtain a positioning result. The preset multi-modal large model is obtained by training a basic multi-modal large model on a preset bounding box positioning data set. Each piece of data in the preset bounding box positioning data set comprises a first data set and the bounding box labels corresponding to that first data set; the bounding box labels and the first data set are in an N-to-1 relationship, where N is greater than or equal to 0, and the first data set is a first data set; the first data set c...
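The N-to-1 relationship between bounding box labels and a first data set can be pictured with a small data structure. The following is only an illustrative sketch, not the patent's implementation: the names `PositioningRecord`, `first_data_set`, and `bbox_labels` are hypothetical, the box format (x_min, y_min, x_max, y_max) is an assumption, and the contents of the first data set are kept opaque because the abstract is cut off before describing them.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

# Assumption: a box is (x_min, y_min, x_max, y_max); the abstract gives no format.
BoundingBox = Tuple[float, float, float, float]


@dataclass
class PositioningRecord:
    """One piece of data in a bounding box positioning data set (hypothetical layout)."""

    # The abstract is truncated before listing what the first data set contains,
    # so it is stored here as an opaque value.
    first_data_set: Any
    # N bounding box labels for one first data set, with N >= 0.
    bbox_labels: List[BoundingBox] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject boxes whose corners are out of order.
        for x_min, y_min, x_max, y_max in self.bbox_labels:
            if x_min > x_max or y_min > y_max:
                raise ValueError("bounding box corners are out of order")

    @property
    def n(self) -> int:
        """Number of bounding box labels attached to this record."""
        return len(self.bbox_labels)


# A record whose target is absent (N = 0) and one with two labelled boxes (N = 2).
absent = PositioningRecord(first_data_set={"id": 0})
present = PositioningRecord(
    first_data_set={"id": 1},
    bbox_labels=[(10.0, 20.0, 50.0, 80.0), (60.0, 15.0, 90.0, 70.0)],
)
```

Allowing an empty label list is what lets N equal 0, which in this sketch stands for a record where the described target has no bounding box in the image.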