FSIC: Frequency-separated image compression for small object detection

Bibliographic details
Published in: Digital signal processing 2025-01, Vol. 156, p. 104822, Article 104822
Main authors: Dai, Chengjie; Song, Tiantian; Chen, Qiang; Gong, Hanshen; Yang, Bowei; Song, Guanghua
Format: Article
Language: English
Description
Abstract: Existing image compression methods are designed for the human visual system. They achieve good compression quality for the low-frequency components of an image, which are the ones that matter to human vision. Object detection models, however, rely on both high- and low-frequency components, so detection metrics decline on images compressed by current methods. The effect is most pronounced for small object detection, where the lack of high-frequency signals makes it difficult to distinguish targets from the background. In this paper, we propose a frequency-separated image compression model, named FSIC. During training, the compression of low-frequency components uses only an MSE loss, while the compression of high-frequency components additionally incorporates a detection loss. We validate FSIC's image compression capability for the small object detection task on the VisDrone and DOTA datasets. Results show that under extremely high compression rates, FSIC performs better than current compression methods. Furthermore, FSIC has the fastest encoding speed among current learning-based compression models.
ISSN: 1051-2004
DOI: 10.1016/j.dsp.2024.104822
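
Note: the training objective described in the abstract (MSE only for the low-frequency branch, MSE plus a detection loss for the high-frequency branch) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the box-blur frequency split, the function names `split_frequencies` and `fsic_style_loss`, the weight `lam`, and the detection loss being passed in as a precomputed number.

```python
import numpy as np

def split_frequencies(img, k=5):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    Assumption: a k x k box blur stands in for the low-pass filter;
    the abstract does not say how FSIC separates frequencies.
    """
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    low = np.zeros_like(img, dtype=np.float64)
    # Sum the k*k shifted copies of the image, then average them.
    for dy in range(k):
        for dx in range(k):
            low += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    low /= k * k
    high = img - low  # residual detail, so low + high == img
    return low, high

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def fsic_style_loss(orig, recon_low, recon_high, detection_loss, lam=1.0):
    """Combine the per-branch losses as the abstract describes them.

    detection_loss is a hypothetical scalar produced by some detector
    run on the reconstruction; lam is an assumed weighting factor.
    """
    low, high = split_frequencies(orig)
    loss_low = mse(low, recon_low)                             # MSE only
    loss_high = mse(high, recon_high) + lam * detection_loss   # MSE + detection
    return loss_low + loss_high
```

In this sketch a perfect reconstruction of both branches leaves only the weighted detection term, which shows that the detection signal affects the high-frequency branch alone.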