Region- and Pixel-Level Multi-Focus Image Fusion through Convolutional Neural Networks


Bibliographic Details
Published in: Mobile Networks and Applications, 2021-02, Vol. 26 (1), p. 40-56
Authors: Zhao, Wenyi; Yang, Huihua; Wang, Jie; Pan, Xipeng; Cao, Zhiwei
Format: Article
Language: English
Online access: Full text
Description
Abstract: Capturing all-in-focus images of 3D scenes is typically a challenging task due to depth-of-field limitations, and various multi-focus image fusion methods have been employed to generate all-in-focus images. However, existing methods have difficulty achieving real-time operation and superior fusion performance simultaneously. In this paper, we propose a region- and pixel-based method that recognizes focus and defocus regions or pixels from the neighborhood information in the source images. The proposed method obtains satisfactory fusion results with improved real-time performance. First, a convolutional neural network (CNN)-based classifier quickly generates a coarse region-based trimap containing focus, defocus, and boundary regions. Then, precise fine-tuning is applied at the boundary regions to handle the boundary pixels that existing methods find difficult to discriminate. Based on a public database, a high-quality dataset is constructed that provides abundant, precise pixel-level labels, so that the proposed method can accurately classify regions and pixels without artifacts. Furthermore, an image interpolation method called NEAREST_Gaussian is proposed to improve recognition ability at the boundary. Experimental results show that the proposed method outperforms other state-of-the-art methods in both visual perception and objective metrics. Additionally, the proposed method achieves an 80% improvement in efficiency over conventional CNN-based methods.
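The abstract's two-stage pipeline (coarse trimap, then per-pixel fine-tuning at the boundary band) can be sketched in plain NumPy. This is a minimal illustration, not the paper's implementation: the function names `build_trimap` and `fuse` are ours, the boundary band is grown by simple dilation, and a local gradient-magnitude sharpness test stands in for the paper's CNN-based boundary classifier.

```python
import numpy as np

def build_trimap(focus_map, band=1):
    """Coarse trimap from a binary focus map: 1.0 = focused (take source A),
    0.0 = defocused (take source B), 0.5 = uncertain boundary band.
    `band` controls how far the uncertain band extends (our assumption,
    not the paper's exact rule)."""
    trimap = focus_map.astype(float)
    diff_x = focus_map[:, 1:] != focus_map[:, :-1]
    diff_y = focus_map[1:, :] != focus_map[:-1, :]
    boundary = np.zeros(focus_map.shape, dtype=bool)
    boundary[:, 1:] |= diff_x   # pixels on either side of a vertical edge
    boundary[:, :-1] |= diff_x
    boundary[1:, :] |= diff_y   # pixels on either side of a horizontal edge
    boundary[:-1, :] |= diff_y
    for _ in range(band - 1):   # widen the band by repeated 4-neighbor dilation
        grown = boundary.copy()
        grown[:, 1:] |= boundary[:, :-1]
        grown[:, :-1] |= boundary[:, 1:]
        grown[1:, :] |= boundary[:-1, :]
        grown[:-1, :] |= boundary[1:, :]
        boundary = grown
    trimap[boundary] = 0.5
    return trimap

def fuse(img_a, img_b, trimap):
    """Region-level fusion away from boundaries; at boundary pixels (0.5),
    choose per pixel by local sharpness -- a simple stand-in for the
    paper's CNN-based fine-tuning step."""
    def sharpness(img):
        gy, gx = np.gradient(img.astype(float))
        return np.abs(gx) + np.abs(gy)
    out = np.where(trimap == 1.0, img_a, img_b).astype(float)
    b = trimap == 0.5
    out[b] = np.where(sharpness(img_a)[b] >= sharpness(img_b)[b],
                      img_a[b], img_b[b])
    return out
```

The point of the trimap is efficiency: only the (usually thin) 0.5 band needs the expensive per-pixel decision, while the interior regions are copied wholesale, which is where the real-time gain comes from.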
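The abstract names a NEAREST_Gaussian interpolation method but gives no details. One plausible reading of the name is nearest-neighbor upsampling followed by Gaussian smoothing, which keeps hard region labels while softening transitions at the boundary; the composition below is entirely our assumption, borrowing only the name from the paper.

```python
import numpy as np

def nearest_gaussian_upsample(img, scale=2, sigma=1.0):
    """Hypothetical 'NEAREST_Gaussian' interpolation: nearest-neighbor
    upsampling (preserves hard labels) followed by separable Gaussian
    smoothing (softens the boundary). Not the paper's definition."""
    up = np.repeat(np.repeat(img, scale, axis=0),
                   scale, axis=1).astype(float)
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    # convolve rows, then columns, with edge padding so output size is kept
    pad = np.pad(up, ((0, 0), (radius, radius)), mode="edge")
    up = np.stack([np.convolve(row, kernel, mode="valid") for row in pad])
    pad = np.pad(up, ((radius, radius), (0, 0)), mode="edge")
    up = np.stack([np.convolve(col, kernel, mode="valid") for col in pad.T]).T
    return up
```

Because the kernel is normalized and the padding repeats edge values, every output pixel is a convex combination of input pixels, so a binary decision map upsamples to a smooth map still bounded by 0 and 1.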
ISSN: 1383-469X, 1572-8153
DOI: 10.1007/s11036-020-01719-9