PL-GNet: Pixel Level Global Network for detection and localization of image forgeries

Bibliographic details
Published in:	Signal processing. Image communication 2023-11, Vol.119, p.117029, Article 117029
Main authors:	Shi, Zenan; Shen, Xuanjing; Chen, Haipeng; Lyu, Yingda
Format:	Article
Language:	English
Subjects:	Atrous convolution; Decoding net; Encoding net; Global network; Image forgery detection and localization

Description
Abstract:	Unlike most Image Forgery Detection and Localization (IFDL) methods, which classify tampered regions patch by patch, this paper leverages features from the whole image in both the spatial and frequency domains to classify each pixel. It proposes a high-confidence pixel-level global network, PL-GNet, to combat real-life image forgery, which commonly involves several manipulation types and looks visually realistic. The end-to-end PL-GNet framework has three building blocks: (1) An Encoding net extracts global features and generates high-quality feature maps that indicate possible tampered regions. The newly designed first layer and backbone architecture of the Encoding net, based on atrous convolution, capture changes in pixel relationships and extract rich multi-scale spatial information. (2) A Long Short-Term Memory (LSTM) network based on the co-occurrence matrix captures tampering traces and the discriminative features between manipulated and non-manipulated regions. (3) A Decoding net combines the outputs of the Encoding net and the LSTM network and learns the mapping from low-resolution feature maps to pixel-wise prediction masks. A series of ablation experiments is used to systematically optimize the design of the Encoding net. Extensive experiments on six challenging datasets demonstrate that PL-GNet outperforms each of its subnetworks and consistently achieves state-of-the-art performance against alternative methods on three evaluation metrics.
•A pixel-level global network that considers different kinds of inputs is proposed to localize manipulated regions in the image.
•A new first layer and the backbone architecture in Encoding net are designed to extract efficient and discriminative features.
•In addition, the sub-network LSTM Network, which introduces co-occurrence based features as input, is designed in the frequency domain.
•Our PL-GNet is not limited to one specific manipulation type and shows superior performance on six challenging datasets.
ISSN:	0923-5965 1879-2677
DOI:	10.1016/j.image.2023.117029
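
The abstract names two generic building blocks, atrous convolution and the co-occurrence matrix, without giving their definitions. The sketch below illustrates both in plain Python under stated assumptions: it is not the authors' implementation, the function names `atrous_conv1d` and `cooccurrence_matrix` are hypothetical, the convolution is one-dimensional for brevity, and the co-occurrence matrix is the textbook spatial grey-level form rather than whatever variant the paper feeds to its LSTM network.

```python
from collections import Counter


def atrous_conv1d(signal, kernel, rate):
    """Valid-mode 1-D atrous (dilated) convolution, cross-correlation form.

    Kernel taps are spaced `rate` samples apart, so a kernel with k taps
    spans (k - 1) * rate + 1 input samples without adding any weights.
    Hypothetical helper for illustration only.
    """
    span = (len(kernel) - 1) * rate + 1
    return [
        sum(w * signal[i + j * rate] for j, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


def cooccurrence_matrix(image, levels, offset=(0, 1)):
    """Count how often grey level a is paired with grey level b.

    `image` is a list of rows of integers in [0, levels). `offset` is the
    (row, column) displacement from the first pixel of a pair to the second.
    Entry [a][b] of the result is the number of such pairs.
    Hypothetical helper for illustration only.
    """
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    return [[counts[(a, b)] for b in range(levels)] for a in range(levels)]


if __name__ == "__main__":
    # Same three weights, wider span: rate 2 covers five samples instead of three.
    print(atrous_conv1d([1, 2, 3, 4, 5], [1, 0, -1], rate=1))
    print(atrous_conv1d([1, 2, 3, 4, 5], [1, 0, -1], rate=2))
    # Horizontal neighbour pairs in a tiny two-level image.
    print(cooccurrence_matrix([[0, 0, 0], [1, 1, 0]], levels=2))
```

Raising `rate` widens the receptive field at no extra parameter cost, which is the usual reason atrous convolution is used for multi-scale feature extraction. The co-occurrence counts summarize pixel-pair statistics, which splicing or retouching can disturb.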