UniVAD: A Training-free Unified Model for Few-shot Visual Anomaly Detection

Visual Anomaly Detection (VAD) aims to identify abnormal samples in images that deviate from normal patterns, spanning multiple domains, including industrial, logical, and medical fields. Due to the domain gaps between these fields, existing VAD methods are typically tailored to a single domain, with specialized detection techniques and model architectures that are difficult to generalize across domains. Moreover, even within the same domain, current VAD approaches often follow a "one-category-one-model" paradigm, requiring large numbers of normal samples to train class-specific models, which results in poor generalizability and hinders unified evaluation across domains. To address this issue, we propose UniVAD, a generalized few-shot VAD method capable of detecting anomalies across various domains, such as industrial, logical, and medical anomalies, with a training-free unified model. UniVAD needs only a few normal samples as references during testing to detect anomalies in previously unseen objects, without training on the specific domain. Specifically, UniVAD employs a Contextual Component Clustering (\(C^3\)) module, based on clustering and vision foundation models, to accurately segment components within the image, and leverages Component-Aware Patch Matching (CAPM) and Graph-Enhanced Component Modeling (GECM) modules to detect anomalies at different semantic levels, whose outputs are aggregated into the final detection result. Experiments on nine datasets spanning industrial, logical, and medical fields demonstrate that UniVAD achieves state-of-the-art few-shot anomaly detection performance across multiple domains, outperforming domain-specific anomaly detection models. The code will be made publicly available.
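
To give a concrete feel for the training-free, few-shot scoring idea the abstract describes, the sketch below builds a memory bank of patch features from a few normal reference images and scores each test patch by its nearest-neighbor distance to that bank, roughly in the spirit of the Component-Aware Patch Matching step. This is a minimal illustration under stated assumptions, not the authors' released code: the feature extractor is a stand-in (UniVAD uses frozen vision foundation models), and every function name here is hypothetical.

    # Minimal sketch: few-shot, training-free anomaly scoring via
    # nearest-neighbor patch matching against a normal memory bank.
    import numpy as np

    def extract_patch_features(image, patch=8):
        # Stand-in extractor: simple per-patch statistics. A real pipeline
        # would use a frozen backbone (e.g., a ViT); either way the output
        # is an (n_patches, feat_dim) array.
        h, w = image.shape
        feats = []
        for y in range(0, h - patch + 1, patch):
            for x in range(0, w - patch + 1, patch):
                block = image[y:y + patch, x:x + patch]
                feats.append([block.mean(), block.std(), block.max(), block.min()])
        return np.asarray(feats)

    def anomaly_scores(test_feats, memory_bank):
        # Distance of each test patch to its nearest neighbor among all
        # patches from the normal reference images.
        d = np.linalg.norm(test_feats[:, None, :] - memory_bank[None, :, :], axis=-1)
        return d.min(axis=1)

    rng = np.random.default_rng(0)
    normal_refs = [rng.normal(0.5, 0.05, (64, 64)) for _ in range(4)]  # the "few shots"
    memory_bank = np.concatenate([extract_patch_features(im) for im in normal_refs])

    test = rng.normal(0.5, 0.05, (64, 64))
    test[24:40, 24:40] += 0.6  # inject a synthetic defect
    scores = anomaly_scores(extract_patch_features(test), memory_bank)
    print("image-level anomaly score:", scores.max())  # max over patch scores

Per the abstract, UniVAD additionally segments the image into components (the \(C^3\) module) and models component-level structure with a graph (GECM) before aggregating the maps; the patch-level score above covers only the lowest semantic level.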

Bibliographic Details

Published in: arXiv.org, 2024-12
Main Authors: Gu, Zhaopeng; Zhu, Bingke; Zhu, Guibo; Chen, Yingying; Tang, Ming; Wang, Jinqiao
Format: Article
Language: English
Subjects: Anomalies; Clustering; Image enhancement; Modules
Online Access: Full text
EISSN: 2331-8422
Date: 2024-12-05
Publisher: Ithaca: Cornell University Library, arXiv.org
Rights: 2024. This work is published under http://creativecommons.org/licenses/by-nc-sa/4.0/ (the "License").
Source: Freely Accessible Journals
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T10%3A49%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=UniVAD:%20A%20Training-free%20Unified%20Model%20for%20Few-shot%20Visual%20Anomaly%20Detection&rft.jtitle=arXiv.org&rft.au=Gu,%20Zhaopeng&rft.date=2024-12-05&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3141682623%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3141682623&rft_id=info:pmid/&rfr_iscdi=true