Description and recognition of complex spatial configurations of object pairs with Force Banner 2D features


Bibliographic details
Published in: Pattern Recognition, 2022-03, Vol. 123, p. 108410, Article 108410
Main authors: Deléarde, Robin; Kurtz, Camille; Wendling, Laurent
Format: Article
Language: English
Description
Summary:
• Stronger description of spatial configurations to enhance image content understanding.
• 2D extension of the Force Histogram to model a large panel of forces between object pairs.
• Translation of relative position descriptors into spatial relations in natural language.
• Integration of relative position descriptors into recent classification systems (e.g. 2D CNNs).

A major challenge in scene understanding is the handling of spatial relations between objects or object parts. Several descriptors dedicated to this task already exist, such as the force histogram, a typical example of a relative position descriptor. By computing the interaction between two objects for a given force in all directions, it gives a good overview of the configuration, and it has useful properties that can make it invariant to the 2D viewpoint. Considering that using complementary forces (negative for repulsion, positive for attraction) should improve the description of complex spatial configurations, we propose to extend the force histogram to a panel of forces so as to make it a more complete descriptor. This gives a 2D descriptor that we call the "(discrete) Force Banner", which can be used as input to a classical Convolutional Neural Network (CNN), benefiting from the strong performance of such networks, and reduced to more compact spatial features for use in another system. As an illustration of its ability to describe spatial configurations, we used it to solve a classification problem aiming to discriminate simple spatial relations with variable configuration complexities. Experimental results obtained on datasets of synthetic and natural images with various shapes highlight the value of this approach, in particular for complex spatial configurations.
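The idea of stacking force histograms for several force types into one 2D descriptor can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a naive pixel-pair approximation, and the function names `force_histogram` and `force_banner`, the kernel d^(-r), the binning scheme, and the chosen values of r are all assumptions made for illustration. Points are given as (x, y) pairs with the usual mathematical orientation of angles.

```python
import math

def force_histogram(points_a, points_b, r, n_dirs=8):
    """Naive pixel-pair approximation of a force histogram (illustrative only).

    Each pair (a, b) contributes a force d**(-r) to the bin of the
    direction pointing from a to b. The kernel d**(-r) is an assumption.
    """
    hist = [0.0] * n_dirs
    bin_width = 2 * math.pi / n_dirs
    for xa, ya in points_a:
        for xb, yb in points_b:
            dx, dy = xb - xa, yb - ya
            d = math.hypot(dx, dy)
            if d == 0:
                continue  # coincident points are skipped in this sketch
            theta = math.atan2(dy, dx) % (2 * math.pi)
            k = int(round(theta / bin_width)) % n_dirs  # nearest direction bin
            hist[k] += d ** (-r)
    return hist

def force_banner(points_a, points_b, r_values, n_dirs=8):
    """Stack one histogram per force exponent r into a 2D list.

    Rows index the force type r, columns index the direction.
    """
    return [force_histogram(points_a, points_b, r, n_dirs) for r in r_values]

# Hypothetical toy objects: B lies two units to the right of A.
object_a = [(0, 0)]
object_b = [(2, 0)]
banner = force_banner(object_a, object_b, r_values=[-1, 0, 2])
```

In this sketch each row is a direction histogram for one exponent r, so the result is a small 2D array of the kind that could, under these assumptions, be passed to a 2D CNN as an image-like input.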
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2021.108410