Efficient deduplication of randomized file paths

Disclosed are techniques for deduplicating files to be ingested by a database. A bloom filter may be built for each of a first set of files to be ingested into a data exchange to generate a set of bloom filters, wherein the data exchange includes a metadata storage where metadata including a list of...

Detailed description

Bibliographic details
Main authors: Muralidhar, Subramanian; Ramachandran, Raghav; Iyer, Ganeshan Ramachandran
Format: Patent
Language: eng
creator Muralidhar, Subramanian
Ramachandran, Raghav
Iyer, Ganeshan Ramachandran
description Disclosed are techniques for deduplicating files to be ingested by a database. A bloom filter may be built for each of a first set of files to be ingested into a data exchange to generate a set of bloom filters, wherein the data exchange includes a metadata storage where metadata including a list of files ingested is stored. The set of bloom filters may be stored in the metadata storage of the data exchange. In response to receiving a set of candidate files to be ingested into the data exchange, the set of bloom filters may be used to identify from within the set of candidate files, each candidate file that is duplicative of a file in the first set of files.
format Patent
date 2023-12-26
oa free_for_read
sourcetype Open Access Repository
linktohtml https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20231226&DB=EPODOC&CC=US&NR=11853274B2
fulltext fulltext_linktorsrc
language eng
recordid cdi_epo_espacenet_US11853274B2
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
PHYSICS
title Efficient deduplication of randomized file paths
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T02%3A56%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Muralidhar,%20Subramanian&rft.date=2023-12-26&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS11853274B2%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
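
The flow in the description (build one Bloom filter per ingested file, keep the filters in metadata storage, then screen incoming candidates against them) can be illustrated with a minimal Python sketch. This is not the patented implementation. The record does not say what is inserted into each filter, so the sketch assumes the file path is the key. `BloomFilter`, `MetadataStore`, the filter size, the hash scheme, and the example paths are all hypothetical. Because a Bloom filter can report false positives, the sketch also assumes that each filter hit is confirmed against the stored list of ingested files.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; bit positions come from double hashing a SHA-256 digest."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit array

    def _positions(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class MetadataStore:
    """Hypothetical stand-in for the data exchange's metadata storage."""

    def __init__(self):
        self.ingested_files = set()  # list of files already ingested
        self.filters = []            # one Bloom filter per ingested file

    def record_ingested(self, paths):
        for path in paths:
            bloom = BloomFilter()
            bloom.add(path)  # assumption: the file path is the filter key
            self.filters.append(bloom)
            self.ingested_files.add(path)

    def find_duplicates(self, candidates):
        duplicates = []
        for path in candidates:
            # Cheap screen: skip candidates that no filter could contain.
            if any(bloom.might_contain(path) for bloom in self.filters):
                # Exact check rules out Bloom-filter false positives.
                if path in self.ingested_files:
                    duplicates.append(path)
        return duplicates


store = MetadataStore()
store.record_ingested(["stage/ab12/part-0.csv", "stage/9f3e/part-1.csv"])
print(store.find_duplicates(["stage/ab12/part-0.csv", "stage/77aa/part-2.csv"]))
# prints ['stage/ab12/part-0.csv']
```

The exact lookup after the filter hit is a design choice of this sketch. A Bloom filter never yields false negatives, so any candidate it rejects can be skipped without consulting the full file list.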