Semantic-agnostic progressive subtractive network for image manipulation detection and localization
Published in: Neurocomputing (Amsterdam), 2023-07, Vol. 543, p. 126263, Article 126263
Main authors: , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: In this paper, we propose a new detection and localization framework, the Semantic-Agnostic Progressive Subtractive Network (SAPS-Net), capable of detecting suspicious forgeries. Our approach is based on the key observation that fluctuations in image content severely interfere with the capture of general manipulations by existing convolutional architectures. In contrast to the aggregation-based attention employed by traditional methods, we design Semantic-Agnostic Manipulation Attention (SAMA), built on a subtractive operation, to mitigate the effect of rich image semantics on manipulation extraction. First, the Multi-Scale feature Iterative Fusion Block (MSIFB) and the Multi-Kernel feature Fusion Residual Block (MKFRB) are designed to iteratively mine potential semantic associations across hierarchical feature maps. We then devise a subtractive operation that removes these semantic associations as distractors, encouraging the network to adaptively learn general forgery traces. Notably, these content-based semantic associations may be fundamentally different from the manipulation traces that alter the internal patterns of images. By progressively applying SAMAs, the network remains robust to manipulations of image content with rich semantics. Extensive experiments on six challenging datasets show that our approach achieves more than 3.03% pixel-level AUC gains and 3.70% image-level AUC gains in cross-dataset scenarios compared with state-of-the-art methods, especially on the realistic IMD20 (pixel-level AUC: 0.859) and Wild (pixel-level AUC: 0.821) datasets.
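The central idea in the abstract, attention that subtracts an estimated semantic component so manipulation traces stand out, can be illustrated with a minimal PyTorch sketch. The record does not specify the actual SAMA, MSIFB, or MKFRB designs, so everything below (the SubtractiveAttention class, its convolutional semantic branch, and the channel sizes) is a hypothetical illustration of the subtractive principle, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SubtractiveAttention(nn.Module):
    """Toy subtraction-based attention block (not the published SAMA)."""

    def __init__(self, channels: int):
        super().__init__()
        # Hypothetical branch that estimates the semantic (content) response.
        self.semantic = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
        )
        # Light refinement of the semantic-suppressed residual.
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Subtract the estimated semantic component so the residual is less
        # dominated by image content, letting manipulation cues stand out.
        residual = x - self.semantic(x)
        return self.refine(residual)


if __name__ == "__main__":
    feats = torch.randn(2, 64, 56, 56)  # dummy backbone feature maps
    out = SubtractiveAttention(64)(feats)
    print(out.shape)  # torch.Size([2, 64, 56, 56])
```

Per the abstract, several such blocks would be applied progressively across feature hierarchies; the sketch covers only a single stage under the stated assumptions.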
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2023.126263