THEME DETECTION WITHIN A CORPUS OF INFORMATION

Systems and methods are used to detect underlying themes from a collection of documents at an aggregated level. A representative set of documents may be selected from a cluster of documents, with the representative set of documents corresponding to a general theme of the cluster. Candidate theme phr...

Detailed description

Bibliographic details
Main authors: Shapira, Sharon; Chidambaram, Senthil C; Kapoor, Ankit; Nadig, Deepak Seetharam; Gangadharaiah, Rashmi; Bhattacharjee, Kasturi; Ng, Tony Chun Tung
Format: Patent
Language: eng
Subjects:
Online access: Order full text
creator Shapira, Sharon
Chidambaram, Senthil C
Kapoor, Ankit
Nadig, Deepak Seetharam
Gangadharaiah, Rashmi
Bhattacharjee, Kasturi
Ng, Tony Chun Tung
description Systems and methods are used to detect underlying themes from a collection of documents at an aggregated level. A representative set of documents may be selected from a cluster of documents, with the representative set of documents corresponding to a general theme of the cluster. Candidate theme phrases may then be extracted from the documents and used to generate document embeddings and candidate phrase embeddings, which may be ranked, such as with a diversity-based ranking approach. Certain candidates may be selected from the ranking. Each of the documents forming the representative set may then be concatenated and a query embedding may be generated and ranked against the candidate phrases. In this manner, a collection of phrases associated with both the general underlying theme of the cluster, along with granular topics associated with that theme, may be identified.
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US2024160651A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US2024160651A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US2024160651A13</originalsourceid><addsrcrecordid>eNrjZNAL8XD1dVVwcQ1xdQ7x9PdTCPcM8fD0U3BUcPYPCggNVvB3U_D0c_MP8nUESfMwsKYl5hSn8kJpbgZlN9cQZw_d1IL8-NTigsTk1LzUkvjQYCMDIxNDMwMzU0NHQ2PiVAEAdAcmnA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>THEME DETECTION WITHIN A CORPUS OF INFORMATION</title><source>esp@cenet</source><creator>Shapira, Sharon ; Chidambaram, Senthil C ; Kapoor, Ankit ; Nadig, Deepak Seetharam ; Gangadharaiah, Rashmi ; Bhattacharjee, Kasturi ; Ng, Tony Chun Tung</creator><creatorcontrib>Shapira, Sharon ; Chidambaram, Senthil C ; Kapoor, Ankit ; Nadig, Deepak Seetharam ; Gangadharaiah, Rashmi ; Bhattacharjee, Kasturi ; Ng, Tony Chun Tung</creatorcontrib><description>Systems and methods are used to detect underlying themes from a collection of documents at an aggregated level. A representative set of documents may be selected from a cluster of documents, with the representative set of documents corresponding to a general theme of the cluster. Candidate theme phrases may then be extracted from the documents and used to generate document embeddings and candidate phrase embeddings, which may be ranked, such as with a diversity-based ranking approach. Certain candidates may be selected from the ranking. Each of the documents forming the representative set may then be concatenated and a query embedding may be generated and ranked against the candidate phrases. 
In this manner, a collection of phrases associated with both the general underlying theme of the cluster, along with granular topics associated with that theme, may be identified.</description><language>eng</language><subject>CALCULATING ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2024</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20240516&amp;DB=EPODOC&amp;CC=US&amp;NR=2024160651A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,776,881,25543,76293</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20240516&amp;DB=EPODOC&amp;CC=US&amp;NR=2024160651A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Shapira, Sharon</creatorcontrib><creatorcontrib>Chidambaram, Senthil C</creatorcontrib><creatorcontrib>Kapoor, Ankit</creatorcontrib><creatorcontrib>Nadig, Deepak Seetharam</creatorcontrib><creatorcontrib>Gangadharaiah, Rashmi</creatorcontrib><creatorcontrib>Bhattacharjee, Kasturi</creatorcontrib><creatorcontrib>Ng, Tony Chun Tung</creatorcontrib><title>THEME DETECTION WITHIN A CORPUS OF INFORMATION</title><description>Systems and methods are used to detect underlying themes from a collection of documents at an aggregated level. A representative set of documents may be selected from a cluster of documents, with the representative set of documents corresponding to a general theme of the cluster. 
Candidate theme phrases may then be extracted from the documents and used to generate document embeddings and candidate phrase embeddings, which may be ranked, such as with a diversity-based ranking approach. Certain candidates may be selected from the ranking. Each of the documents forming the representative set may then be concatenated and a query embedding may be generated and ranked against the candidate phrases. In this manner, a collection of phrases associated with both the general underlying theme of the cluster, along with granular topics associated with that theme, may be identified.</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2024</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZNAL8XD1dVVwcQ1xdQ7x9PdTCPcM8fD0U3BUcPYPCggNVvB3U_D0c_MP8nUESfMwsKYl5hSn8kJpbgZlN9cQZw_d1IL8-NTigsTk1LzUkvjQYCMDIxNDMwMzU0NHQ2PiVAEAdAcmnA</recordid><startdate>20240516</startdate><enddate>20240516</enddate><creator>Shapira, Sharon</creator><creator>Chidambaram, Senthil C</creator><creator>Kapoor, Ankit</creator><creator>Nadig, Deepak Seetharam</creator><creator>Gangadharaiah, Rashmi</creator><creator>Bhattacharjee, Kasturi</creator><creator>Ng, Tony Chun Tung</creator><scope>EVB</scope></search><sort><creationdate>20240516</creationdate><title>THEME DETECTION WITHIN A CORPUS OF INFORMATION</title><author>Shapira, Sharon ; Chidambaram, Senthil C ; Kapoor, Ankit ; Nadig, Deepak Seetharam ; Gangadharaiah, Rashmi ; Bhattacharjee, Kasturi ; Ng, Tony Chun 
Tung</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US2024160651A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2024</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>Shapira, Sharon</creatorcontrib><creatorcontrib>Chidambaram, Senthil C</creatorcontrib><creatorcontrib>Kapoor, Ankit</creatorcontrib><creatorcontrib>Nadig, Deepak Seetharam</creatorcontrib><creatorcontrib>Gangadharaiah, Rashmi</creatorcontrib><creatorcontrib>Bhattacharjee, Kasturi</creatorcontrib><creatorcontrib>Ng, Tony Chun Tung</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Shapira, Sharon</au><au>Chidambaram, Senthil C</au><au>Kapoor, Ankit</au><au>Nadig, Deepak Seetharam</au><au>Gangadharaiah, Rashmi</au><au>Bhattacharjee, Kasturi</au><au>Ng, Tony Chun Tung</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>THEME DETECTION WITHIN A CORPUS OF INFORMATION</title><date>2024-05-16</date><risdate>2024</risdate><abstract>Systems and methods are used to detect underlying themes from a collection of documents at an aggregated level. A representative set of documents may be selected from a cluster of documents, with the representative set of documents corresponding to a general theme of the cluster. Candidate theme phrases may then be extracted from the documents and used to generate document embeddings and candidate phrase embeddings, which may be ranked, such as with a diversity-based ranking approach. Certain candidates may be selected from the ranking. 
Each of the documents forming the representative set may then be concatenated and a query embedding may be generated and ranked against the candidate phrases. In this manner, a collection of phrases associated with both the general underlying theme of the cluster, along with granular topics associated with that theme, may be identified.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
language eng
recordid cdi_epo_espacenet_US2024160651A1
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
ELECTRIC DIGITAL DATA PROCESSING
PHYSICS
title THEME DETECTION WITHIN A CORPUS OF INFORMATION
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T16%3A54%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=Shapira,%20Sharon&rft.date=2024-05-16&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2024160651A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
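
The pipeline summarized in the description field can be illustrated with a small sketch. This is one possible reading of the abstract, not the patented implementation: the bag-of-words `embed` function stands in for a learned embedding model, Maximal Marginal Relevance (MMR) is swapped in as one example of a "diversity-based ranking approach", and every function name and parameter below (`select_representatives`, `candidate_phrases`, `mmr`, `detect_themes`, `lam`, `per_doc_k`, `final_k`) is hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a tiny stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on", "after", "for", "with"}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def embed(text):
    """Toy embedding: sparse word counts (stand-in for a learned model)."""
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)


def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def select_representatives(cluster, m):
    """Pick the m documents closest to the cluster centroid."""
    centroid = Counter()
    for doc in cluster:
        centroid.update(embed(doc))
    ranked = sorted(cluster, key=lambda d: cosine(embed(d), centroid), reverse=True)
    return ranked[:m]


def candidate_phrases(doc, max_n=2):
    """Extract 1..max_n word n-grams that neither start nor end with a stopword."""
    tokens = tokenize(doc)
    phrases = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            phrases.add(" ".join(gram))
    return sorted(phrases)


def mmr(query_vec, phrases, k, lam=0.6):
    """Maximal Marginal Relevance: trade relevance off against redundancy."""
    vecs = {p: embed(p) for p in phrases}
    selected, remaining = [], list(phrases)
    while remaining and len(selected) < k:
        def score(p):
            relevance = cosine(vecs[p], query_vec)
            redundancy = max((cosine(vecs[p], vecs[s]) for s in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def detect_themes(representative_docs, per_doc_k=3, final_k=3):
    # Stage 1: per document, shortlist diverse candidate phrases.
    shortlisted = []
    for doc in representative_docs:
        for phrase in mmr(embed(doc), candidate_phrases(doc), per_doc_k):
            if phrase not in shortlisted:
                shortlisted.append(phrase)
    # Stage 2: concatenate the representative set into one query and
    # rank the shortlisted phrases against the query embedding.
    query_vec = embed(" ".join(representative_docs))
    ranked = sorted(shortlisted, key=lambda p: cosine(embed(p), query_vec), reverse=True)
    return ranked[:final_k]
```

A caller would pass a cluster of short texts to `select_representatives` and feed the result to `detect_themes`. In a realistic setting the toy `embed` would be replaced by a sentence-embedding model, and `lam` would be tuned to balance relevance against diversity.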