Image Retrieval using Convolutional Autoencoder, InfoGAN, and Vision Transformer Unsupervised Models

Bibliographic details
Published in:	IEEE Access, 2023-01, Vol. 11, p. 1-1
Main authors:	Sabry, Eman S.; Elagooz, Salah; Abd El-Samie, Fathi E.; El-Shafai, Walid; El-Bahnasawy, Nirmeen A.; Banby, Ghada El; Algarni, Abeer D.; Soliman, Naglaa F.; Ramadan, Rabie A.
Format:	Article
Language:	English
Subjects:	Algorithms; Convolutional neural networks; Datasets; Face recognition; Feature extraction; Image retrieval; InfoGAN; Matching; Object matching; Performance evaluation; Queries; Retrieval performance measures; Similarity; Sketched-real image retrieval; Sketches; Spatial distance measurement; System effectiveness; Touch screens; Transformers; Vision transformer; Visualization
Online access:	Full text

Description
Abstract:	Query by Image Content (QBIC), later known as Content-Based Image Retrieval (CBIR), can offer an advantageous solution in a variety of applications, including medical, meteorological, and search-by-image applications. Such systems primarily use similarity matching algorithms to compare image content and retrieve relevant images from databases. In essence, they measure the spatial distance between visual features extracted from a query image and the corresponding features of images in the dataset. One of the most challenging query retrieval problems is Facial Sketched-Real Image Retrieval (FSRIR), which is based on content similarity matching. Such facial retrieval systems are employed in a variety of contexts, including criminal justice. The difficulty of this kind of retrieval comes from the composition of the human face and its distinctive parts. In addition, the compared images come from two different domains. Moreover, to our knowledge, few large-scale facial datasets exist that can be used to improve the performance of retrieval systems. The success of the retrieval process is governed by the method used to calculate similarity and by how efficiently the compared images are represented. Representing visual features effectively may therefore resolve the main challenge of such approaches. Hence, this paper makes several contributions that fill the research gap in content-based similarity matching and retrieval: 1) The first contribution extends the Chinese University Face Sketch (CUFS) dataset with augmented images, introducing to the community a novel dataset named Extended Sketched-Real Image Retrieval (ESRIR). The CUFS dataset has been extended from 100 images to 53,000 facial sketches and 53,000 real facial images.
2) The second contribution proposes three new algorithms for sketched-real image retrieval on large datasets, based on convolutional autoencoder, InfoGAN, and Vision Transformer unsupervised models. 3) To meet the subjective demands of users arising from the prevalence of multiple query formats, the third contribution trains and assesses the proposed algorithms on two additional facial datasets of various image types. 4) Recently, the majority of people have preferred searching for brand logo images, but it may be tricky to separate certain brand logo features from their alternatives and even from other f
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2023.3241858
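
The abstract describes retrieval as measuring the spatial distance between features extracted from a query image and those of the images in the dataset. The following is only a minimal sketch of that ranking step, not the paper's method: the feature extractor (convolutional autoencoder, InfoGAN, or Vision Transformer) is not shown, the feature vectors are made-up toy values, the function name `retrieve` and the image identifiers are hypothetical, and Euclidean distance is an assumption, since the abstract does not say which distance measure is used.

```python
import math
from typing import Dict, List, Sequence, Tuple


def retrieve(
    query: Sequence[float],
    gallery: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank gallery images by Euclidean distance to the query features.

    Hypothetical helper: `gallery` maps an image id to a feature vector
    that some encoder (not shown here) is assumed to have produced.
    """
    scored = [
        (image_id, math.dist(query, features))  # math.dist: Python 3.8+
        for image_id, features in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:top_k]


# Toy feature vectors (assumed values, for illustration only).
gallery = {
    "photo_a": [0.90, 0.10, 0.30],
    "photo_b": [0.20, 0.80, 0.50],
    "photo_c": [0.85, 0.15, 0.35],
}
sketch_features = [0.88, 0.12, 0.32]  # features of a query sketch

ranked = retrieve(sketch_features, gallery, top_k=2)
for image_id, distance in ranked:
    print(f"{image_id}: {distance:.4f}")
```

With these toy values, `photo_a` ranks first and `photo_c` second, because their vectors lie closest to the query. In a sketch-to-photo setting, this step only works if both domains are mapped into a comparable feature space, which is the representation problem the abstract highlights.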