Localization of Objects Encoded in Image Data in Accordance with Natural Language Queries
Format: | Patent |
---|---|
Language: | English |
Abstract: | Generally, the disclosure is directed to generalized object localization, where the located object is in accordance with a natural language (NL) query. More specifically, the embodiments include a unified generalized visual localization architecture. The architecture achieves enhanced performance on the following three tasks: referring expression comprehension, object localization, and object detection. The embodiments employ machine-learned NL models and/or image models. The architecture is enabled to understand and answer natural-language localization questions about an image, to output multiple boxes, to provide no output if the object is not present (e.g., a null result), and to solve general detection tasks. |
---|