A Deep Learning Framework for Grocery Product Detection and Recognition
Published in: | Food analytical methods 2022-12, Vol.15 (12), p.3498-3522 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Object detection and recognition are among the most important and challenging problems in computer vision. Remarkable advances in deep learning have significantly accelerated progress in object detection and recognition in recent years. Text detection and recognition is also a critical computer vision task and has attracted growing attention from researchers because of its wide range of applications. This work focuses on detecting and recognizing multiple retail products, both on and off the shelves of grocery stores, by identifying their label text. We propose a new framework composed of three modules: (a) retail product detection, (b) product-text detection, and (c) product-text recognition. In the first module, on-the-shelf and off-the-shelf retail products are detected using the YOLOv5 object detection algorithm. In the second module, we improve the performance of a state-of-the-art text detection algorithm by replacing its backbone network with ResNet50 + FPN and by introducing a new post-processing technique, Width Height based Bounding Box Reconstruction, to mitigate inaccurate text detection. In the final module, we use a state-of-the-art text recognition model to recognize the text on each retail product. The YOLOv5 algorithm accurately detects both on-the-shelf and off-the-shelf grocery products in video frames and static images. The experimental results show that the proposed post-processing approach improves the performance of existing methods on both regular and irregular text. The robust text detection and recognition methods enable the proposed framework to recognize on-the-shelf retail products by extracting product information such as product name, brand name, price, and expiration date. The recognized text around a retail product can then be used as an identifier to distinguish that product. |
---|---|
ISSN: | 1936-9751 1936-976X |
DOI: | 10.1007/s12161-022-02384-2 |
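
The three-module flow described in the abstract (product detection, then text detection inside each product region, then text recognition on each text region) can be illustrated with a minimal sketch. This is not the authors' implementation: `crop`, `ProductResult`, and `recognize_products` are hypothetical names, the three stages are passed in as plain callables rather than real YOLOv5 or text models, and images are represented as nested lists so the example runs without any dependencies. The Width Height based Bounding Box Reconstruction step is not reproduced here because the record gives no details of it.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumed conventions for this sketch only:
# a box is (x_min, y_min, x_max, y_max) in pixels, with the max edges exclusive,
# and an image is a row-major grid of pixel values.
Box = Tuple[int, int, int, int]
Image = Sequence[Sequence[int]]


def crop(image: Image, box: Box) -> List[List[int]]:
    """Return the sub-image covered by `box`."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


@dataclass
class ProductResult:
    product_box: Box   # location of the product in the full image
    texts: List[str]   # strings recognized on that product


def recognize_products(
    image: Image,
    detect_products: Callable[[Image], List[Box]],  # stage (a), e.g. an object detector
    detect_text: Callable[[Image], List[Box]],      # stage (b), boxes relative to the product crop
    recognize_text: Callable[[Image], str],         # stage (c), one string per text crop
) -> List[ProductResult]:
    """Chain the three stages: products, then text regions, then text strings."""
    results = []
    for product_box in detect_products(image):
        product_crop = crop(image, product_box)
        texts = [
            recognize_text(crop(product_crop, text_box))
            for text_box in detect_text(product_crop)
        ]
        results.append(ProductResult(product_box, texts))
    return results
```

Keeping the stages as injected callables means any detector or recognizer can be swapped in without touching the chaining logic, which is the only part this sketch is meant to show.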