MaskSearch: Querying Image Masks at Scale

Machine learning tasks over image databases often generate masks that annotate image content (e.g., saliency maps, segmentation maps, depth maps) and enable a variety of applications (e.g., determine if a model is learning spurious correlations or if an image was maliciously modified to mislead a mo...

Bibliographic Details
Main authors: He, Dong; Zhang, Jieyu; Daum, Maureen; Ratner, Alexander; Balazinska, Magdalena
Format: Article
Language: eng
Subjects:
Online access: Order full text
creator He, Dong
Zhang, Jieyu
Daum, Maureen
Ratner, Alexander
Balazinska, Magdalena
description Machine learning tasks over image databases often generate masks that annotate image content (e.g., saliency maps, segmentation maps, depth maps) and enable a variety of applications (e.g., determine if a model is learning spurious correlations or if an image was maliciously modified to mislead a model). While queries that retrieve examples based on mask properties are valuable to practitioners, existing systems do not support them efficiently. In this paper, we formalize the problem and propose MaskSearch, a system that focuses on accelerating queries over databases of image masks while guaranteeing the correctness of query results. MaskSearch leverages a novel indexing technique and an efficient filter-verification query execution framework. Experiments with our prototype show that MaskSearch, using indexes approximately 5% of the compressed data size, accelerates individual queries by up to two orders of magnitude and consistently outperforms existing methods on various multi-query workloads that simulate dataset exploration and analysis processes.
doi_str_mv 10.48550/arxiv.2305.02375
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2305_02375</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2305_02375</sourcerecordid><originalsourceid>FETCH-LOGICAL-a675-ba87fd120ff7dcb70d0173999dc40d11dddb7658049a9ac8d36cb33be68bbc053</originalsourceid><addsrcrecordid>eNotzj1vwjAUhWEvHarAD-iEV4aE6zj-6lahtiCBqirs0bWvA1GTqnIKKv--4mM6wysdPYw9CSgqqxQsMP11p6KUoAoopVGPbL7F8auOmMLhmX8eYzp333u-HnAf-SWNHH95HbCPE_bQYj_G6X0ztnt73S1X-ebjfb182eSojco9WtOSKKFtDQVvgEAY6ZyjUAEJQUTeaGWhcugwWJI6eCl91Nb7AEpmbHa7vVqbn9QNmM7NxdxczfIf1p07FQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>MaskSearch: Querying Image Masks at Scale</title><source>arXiv.org</source><creator>He, Dong ; Zhang, Jieyu ; Daum, Maureen ; Ratner, Alexander ; Balazinska, Magdalena</creator><creatorcontrib>He, Dong ; Zhang, Jieyu ; Daum, Maureen ; Ratner, Alexander ; Balazinska, Magdalena</creatorcontrib><description>Machine learning tasks over image databases often generate masks that annotate image content (e.g., saliency maps, segmentation maps, depth maps) and enable a variety of applications (e.g., determine if a model is learning spurious correlations or if an image was maliciously modified to mislead a model). While queries that retrieve examples based on mask properties are valuable to practitioners, existing systems do not support them efficiently. In this paper, we formalize the problem and propose MaskSearch, a system that focuses on accelerating queries over databases of image masks while guaranteeing the correctness of query results. MaskSearch leverages a novel indexing technique and an efficient filter-verification query execution framework. 
Experiments with our prototype show that MaskSearch, using indexes approximately 5% of the compressed data size, accelerates individual queries by up to two orders of magnitude and consistently outperforms existing methods on various multi-query workloads that simulate dataset exploration and analysis processes.</description><identifier>DOI: 10.48550/arxiv.2305.02375</identifier><language>eng</language><subject>Computer Science - Databases ; Computer Science - Learning ; Computer Science - Multimedia</subject><creationdate>2023-05</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2305.02375$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2305.02375$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>He, Dong</creatorcontrib><creatorcontrib>Zhang, Jieyu</creatorcontrib><creatorcontrib>Daum, Maureen</creatorcontrib><creatorcontrib>Ratner, Alexander</creatorcontrib><creatorcontrib>Balazinska, Magdalena</creatorcontrib><title>MaskSearch: Querying Image Masks at Scale</title><description>Machine learning tasks over image databases often generate masks that annotate image content (e.g., saliency maps, segmentation maps, depth maps) and enable a variety of applications (e.g., determine if a model is learning spurious correlations or if an image was maliciously modified to mislead a model). While queries that retrieve examples based on mask properties are valuable to practitioners, existing systems do not support them efficiently. 
In this paper, we formalize the problem and propose MaskSearch, a system that focuses on accelerating queries over databases of image masks while guaranteeing the correctness of query results. MaskSearch leverages a novel indexing technique and an efficient filter-verification query execution framework. Experiments with our prototype show that MaskSearch, using indexes approximately 5% of the compressed data size, accelerates individual queries by up to two orders of magnitude and consistently outperforms existing methods on various multi-query workloads that simulate dataset exploration and analysis processes.</description><subject>Computer Science - Databases</subject><subject>Computer Science - Learning</subject><subject>Computer Science - Multimedia</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotzj1vwjAUhWEvHarAD-iEV4aE6zj-6lahtiCBqirs0bWvA1GTqnIKKv--4mM6wysdPYw9CSgqqxQsMP11p6KUoAoopVGPbL7F8auOmMLhmX8eYzp333u-HnAf-SWNHH95HbCPE_bQYj_G6X0ztnt73S1X-ebjfb182eSojco9WtOSKKFtDQVvgEAY6ZyjUAEJQUTeaGWhcugwWJI6eCl91Nb7AEpmbHa7vVqbn9QNmM7NxdxczfIf1p07FQ</recordid><startdate>20230503</startdate><enddate>20230503</enddate><creator>He, Dong</creator><creator>Zhang, Jieyu</creator><creator>Daum, Maureen</creator><creator>Ratner, Alexander</creator><creator>Balazinska, Magdalena</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230503</creationdate><title>MaskSearch: Querying Image Masks at Scale</title><author>He, Dong ; Zhang, Jieyu ; Daum, Maureen ; Ratner, Alexander ; Balazinska, Magdalena</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a675-ba87fd120ff7dcb70d0173999dc40d11dddb7658049a9ac8d36cb33be68bbc053</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Databases</topic><topic>Computer Science - 
Learning</topic><topic>Computer Science - Multimedia</topic><toplevel>online_resources</toplevel><creatorcontrib>He, Dong</creatorcontrib><creatorcontrib>Zhang, Jieyu</creatorcontrib><creatorcontrib>Daum, Maureen</creatorcontrib><creatorcontrib>Ratner, Alexander</creatorcontrib><creatorcontrib>Balazinska, Magdalena</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>He, Dong</au><au>Zhang, Jieyu</au><au>Daum, Maureen</au><au>Ratner, Alexander</au><au>Balazinska, Magdalena</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MaskSearch: Querying Image Masks at Scale</atitle><date>2023-05-03</date><risdate>2023</risdate><abstract>Machine learning tasks over image databases often generate masks that annotate image content (e.g., saliency maps, segmentation maps, depth maps) and enable a variety of applications (e.g., determine if a model is learning spurious correlations or if an image was maliciously modified to mislead a model). While queries that retrieve examples based on mask properties are valuable to practitioners, existing systems do not support them efficiently. In this paper, we formalize the problem and propose MaskSearch, a system that focuses on accelerating queries over databases of image masks while guaranteeing the correctness of query results. MaskSearch leverages a novel indexing technique and an efficient filter-verification query execution framework. Experiments with our prototype show that MaskSearch, using indexes approximately 5% of the compressed data size, accelerates individual queries by up to two orders of magnitude and consistently outperforms existing methods on various multi-query workloads that simulate dataset exploration and analysis processes.</abstract><doi>10.48550/arxiv.2305.02375</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2305.02375
language eng
recordid cdi_arxiv_primary_2305_02375
source arXiv.org
subjects Computer Science - Databases
Computer Science - Learning
Computer Science - Multimedia
title MaskSearch: Querying Image Masks at Scale
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T11%3A17%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MaskSearch:%20Querying%20Image%20Masks%20at%20Scale&rft.au=He,%20Dong&rft.date=2023-05-03&rft_id=info:doi/10.48550/arxiv.2305.02375&rft_dat=%3Carxiv_GOX%3E2305_02375%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true