MaskSearch: Querying Image Masks at Scale
Machine learning tasks over image databases often generate masks that annotate image content (e.g., saliency maps, segmentation maps, depth maps) and enable a variety of applications (e.g., determine if a model is learning spurious correlations or if an image was maliciously modified to mislead a mo...
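Main authors: | He, Dong; Zhang, Jieyu; Daum, Maureen; Ratner, Alexander; Balazinska, Magdalena |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Databases; Computer Science - Learning; Computer Science - Multimedia |
Online access: | Order full text |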
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | He, Dong Zhang, Jieyu Daum, Maureen Ratner, Alexander Balazinska, Magdalena |
description | Machine learning tasks over image databases often generate masks that
annotate image content (e.g., saliency maps, segmentation maps, depth maps) and
enable a variety of applications (e.g., determine if a model is learning
spurious correlations or if an image was maliciously modified to mislead a
model). While queries that retrieve examples based on mask properties are
valuable to practitioners, existing systems do not support them efficiently. In
this paper, we formalize the problem and propose MaskSearch, a system that
focuses on accelerating queries over databases of image masks while
guaranteeing the correctness of query results. MaskSearch leverages a novel
indexing technique and an efficient filter-verification query execution
framework. Experiments with our prototype show that MaskSearch, using indexes
approximately 5% of the compressed data size, accelerates individual queries by
up to two orders of magnitude and consistently outperforms existing methods on
various multi-query workloads that simulate dataset exploration and analysis
processes. |
doi_str_mv | 10.48550/arxiv.2305.02375 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2305_02375</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2305_02375</sourcerecordid><originalsourceid>FETCH-LOGICAL-a675-ba87fd120ff7dcb70d0173999dc40d11dddb7658049a9ac8d36cb33be68bbc053</originalsourceid><addsrcrecordid>eNotzj1vwjAUhWEvHarAD-iEV4aE6zj-6lahtiCBqirs0bWvA1GTqnIKKv--4mM6wysdPYw9CSgqqxQsMP11p6KUoAoopVGPbL7F8auOmMLhmX8eYzp333u-HnAf-SWNHH95HbCPE_bQYj_G6X0ztnt73S1X-ebjfb182eSojco9WtOSKKFtDQVvgEAY6ZyjUAEJQUTeaGWhcugwWJI6eCl91Nb7AEpmbHa7vVqbn9QNmM7NxdxczfIf1p07FQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>MaskSearch: Querying Image Masks at Scale</title><source>arXiv.org</source><creator>He, Dong ; Zhang, Jieyu ; Daum, Maureen ; Ratner, Alexander ; Balazinska, Magdalena</creator><creatorcontrib>He, Dong ; Zhang, Jieyu ; Daum, Maureen ; Ratner, Alexander ; Balazinska, Magdalena</creatorcontrib><description>Machine learning tasks over image databases often generate masks that
annotate image content (e.g., saliency maps, segmentation maps, depth maps) and
enable a variety of applications (e.g., determine if a model is learning
spurious correlations or if an image was maliciously modified to mislead a
model). While queries that retrieve examples based on mask properties are
valuable to practitioners, existing systems do not support them efficiently. In
this paper, we formalize the problem and propose MaskSearch, a system that
focuses on accelerating queries over databases of image masks while
guaranteeing the correctness of query results. MaskSearch leverages a novel
indexing technique and an efficient filter-verification query execution
framework. Experiments with our prototype show that MaskSearch, using indexes
approximately 5% of the compressed data size, accelerates individual queries by
up to two orders of magnitude and consistently outperforms existing methods on
various multi-query workloads that simulate dataset exploration and analysis
processes.</description><identifier>DOI: 10.48550/arxiv.2305.02375</identifier><language>eng</language><subject>Computer Science - Databases ; Computer Science - Learning ; Computer Science - Multimedia</subject><creationdate>2023-05</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2305.02375$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2305.02375$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>He, Dong</creatorcontrib><creatorcontrib>Zhang, Jieyu</creatorcontrib><creatorcontrib>Daum, Maureen</creatorcontrib><creatorcontrib>Ratner, Alexander</creatorcontrib><creatorcontrib>Balazinska, Magdalena</creatorcontrib><title>MaskSearch: Querying Image Masks at Scale</title><description>Machine learning tasks over image databases often generate masks that
annotate image content (e.g., saliency maps, segmentation maps, depth maps) and
enable a variety of applications (e.g., determine if a model is learning
spurious correlations or if an image was maliciously modified to mislead a
model). While queries that retrieve examples based on mask properties are
valuable to practitioners, existing systems do not support them efficiently. In
this paper, we formalize the problem and propose MaskSearch, a system that
focuses on accelerating queries over databases of image masks while
guaranteeing the correctness of query results. MaskSearch leverages a novel
indexing technique and an efficient filter-verification query execution
framework. Experiments with our prototype show that MaskSearch, using indexes
approximately 5% of the compressed data size, accelerates individual queries by
up to two orders of magnitude and consistently outperforms existing methods on
various multi-query workloads that simulate dataset exploration and analysis
processes.</description><subject>Computer Science - Databases</subject><subject>Computer Science - Learning</subject><subject>Computer Science - Multimedia</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotzj1vwjAUhWEvHarAD-iEV4aE6zj-6lahtiCBqirs0bWvA1GTqnIKKv--4mM6wysdPYw9CSgqqxQsMP11p6KUoAoopVGPbL7F8auOmMLhmX8eYzp333u-HnAf-SWNHH95HbCPE_bQYj_G6X0ztnt73S1X-ebjfb182eSojco9WtOSKKFtDQVvgEAY6ZyjUAEJQUTeaGWhcugwWJI6eCl91Nb7AEpmbHa7vVqbn9QNmM7NxdxczfIf1p07FQ</recordid><startdate>20230503</startdate><enddate>20230503</enddate><creator>He, Dong</creator><creator>Zhang, Jieyu</creator><creator>Daum, Maureen</creator><creator>Ratner, Alexander</creator><creator>Balazinska, Magdalena</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230503</creationdate><title>MaskSearch: Querying Image Masks at Scale</title><author>He, Dong ; Zhang, Jieyu ; Daum, Maureen ; Ratner, Alexander ; Balazinska, Magdalena</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a675-ba87fd120ff7dcb70d0173999dc40d11dddb7658049a9ac8d36cb33be68bbc053</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Databases</topic><topic>Computer Science - Learning</topic><topic>Computer Science - Multimedia</topic><toplevel>online_resources</toplevel><creatorcontrib>He, Dong</creatorcontrib><creatorcontrib>Zhang, Jieyu</creatorcontrib><creatorcontrib>Daum, Maureen</creatorcontrib><creatorcontrib>Ratner, Alexander</creatorcontrib><creatorcontrib>Balazinska, Magdalena</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>He, Dong</au><au>Zhang, Jieyu</au><au>Daum, 
Maureen</au><au>Ratner, Alexander</au><au>Balazinska, Magdalena</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MaskSearch: Querying Image Masks at Scale</atitle><date>2023-05-03</date><risdate>2023</risdate><abstract>Machine learning tasks over image databases often generate masks that
annotate image content (e.g., saliency maps, segmentation maps, depth maps) and
enable a variety of applications (e.g., determine if a model is learning
spurious correlations or if an image was maliciously modified to mislead a
model). While queries that retrieve examples based on mask properties are
valuable to practitioners, existing systems do not support them efficiently. In
this paper, we formalize the problem and propose MaskSearch, a system that
focuses on accelerating queries over databases of image masks while
guaranteeing the correctness of query results. MaskSearch leverages a novel
indexing technique and an efficient filter-verification query execution
framework. Experiments with our prototype show that MaskSearch, using indexes
approximately 5% of the compressed data size, accelerates individual queries by
up to two orders of magnitude and consistently outperforms existing methods on
various multi-query workloads that simulate dataset exploration and analysis
processes.</abstract><doi>10.48550/arxiv.2305.02375</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2305.02375 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2305_02375 |
source | arXiv.org |
subjects | Computer Science - Databases Computer Science - Learning Computer Science - Multimedia |
title | MaskSearch: Querying Image Masks at Scale |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T11%3A17%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MaskSearch:%20Querying%20Image%20Masks%20at%20Scale&rft.au=He,%20Dong&rft.date=2023-05-03&rft_id=info:doi/10.48550/arxiv.2305.02375&rft_dat=%3Carxiv_GOX%3E2305_02375%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |