A Noisy-Channel Model for Document Compression
We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hie...
Main authors: | Daume, III, Hal ; Marcu, Daniel |
---|---|
Format: | Report |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Daume, III, Hal ; Marcu, Daniel |
description | We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hierarchical model of text production in order to drop non-important syntactic and discourse constituents so as to generate coherent, grammatical document compressions of arbitrary length. The system outperforms both a baseline and a sentence-based compression system that operates by simplifying sequentially all sentences in a text. Our results support the claim that discourse knowledge plays an important role in document summarization. |
format | Report |
fullrecord | <record><control><sourceid>dtic_1RU</sourceid><recordid>TN_cdi_dtic_stinet_ADA459360</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>ADA459360</sourcerecordid><originalsourceid>FETCH-dtic_stinet_ADA4593603</originalsourceid><addsrcrecordid>eNrjZNBzVPDLzyyu1HXOSMzLS81R8M1PAZJp-UUKLvnJpbmpeSUKzvm5BUWpxcWZ-Xk8DKxpiTnFqbxQmptBxs01xNlDN6UkMzm-uCQzL7Uk3tHF0cTU0tjMwJiANAANAycp</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>report</recordtype></control><display><type>report</type><title>A Noisy-Channel Model for Document Compression</title><source>DTIC Technical Reports</source><creator>Daume, III, Hal ; Marcu, Daniel</creator><creatorcontrib>Daume, III, Hal ; Marcu, Daniel ; UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY INFORMATION SCIENCES INST</creatorcontrib><description>We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hierarchical model of text production in order to drop non-important syntactic and discourse constituents so as to generate coherent, grammatical document compressions of arbitrary length. The system outperforms both a baseline and a sentence-based compression system that operates by simplifying sequentially all sentences in a text. 
Our results support the claim that discourse knowledge plays an important role in document summarization.</description><language>eng</language><subject>COMPRESSION ; DOCUMENTS ; GRAMMARS ; HIERARCHIES ; Linguistics ; MATHEMATICAL MODELS ; SIMPLIFICATION ; STATISTICAL ANALYSIS ; SYNTAX ; WORDS(LANGUAGE)</subject><creationdate>2002</creationdate><rights>Approved for public release; distribution is unlimited.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,777,882,27548,27549</link.rule.ids><linktorsrc>$$Uhttps://apps.dtic.mil/sti/citations/ADA459360$$EView_record_in_DTIC$$FView_record_in_$$GDTIC$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Daume, III, Hal</creatorcontrib><creatorcontrib>Marcu, Daniel</creatorcontrib><creatorcontrib>UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY INFORMATION SCIENCES INST</creatorcontrib><title>A Noisy-Channel Model for Document Compression</title><description>We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hierarchical model of text production in order to drop non-important syntactic and discourse constituents so as to generate coherent, grammatical document compressions of arbitrary length. The system outperforms both a baseline and a sentence-based compression system that operates by simplifying sequentially all sentences in a text. 
Our results support the claim that discourse knowledge plays an important role in document summarization.</description><subject>COMPRESSION</subject><subject>DOCUMENTS</subject><subject>GRAMMARS</subject><subject>HIERARCHIES</subject><subject>Linguistics</subject><subject>MATHEMATICAL MODELS</subject><subject>SIMPLIFICATION</subject><subject>STATISTICAL ANALYSIS</subject><subject>SYNTAX</subject><subject>WORDS(LANGUAGE)</subject><fulltext>true</fulltext><rsrctype>report</rsrctype><creationdate>2002</creationdate><recordtype>report</recordtype><sourceid>1RU</sourceid><recordid>eNrjZNBzVPDLzyyu1HXOSMzLS81R8M1PAZJp-UUKLvnJpbmpeSUKzvm5BUWpxcWZ-Xk8DKxpiTnFqbxQmptBxs01xNlDN6UkMzm-uCQzL7Uk3tHF0cTU0tjMwJiANAANAycp</recordid><startdate>200201</startdate><enddate>200201</enddate><creator>Daume, III, Hal</creator><creator>Marcu, Daniel</creator><scope>1RU</scope><scope>BHM</scope></search><sort><creationdate>200201</creationdate><title>A Noisy-Channel Model for Document Compression</title><author>Daume, III, Hal ; Marcu, Daniel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-dtic_stinet_ADA4593603</frbrgroupid><rsrctype>reports</rsrctype><prefilter>reports</prefilter><language>eng</language><creationdate>2002</creationdate><topic>COMPRESSION</topic><topic>DOCUMENTS</topic><topic>GRAMMARS</topic><topic>HIERARCHIES</topic><topic>Linguistics</topic><topic>MATHEMATICAL MODELS</topic><topic>SIMPLIFICATION</topic><topic>STATISTICAL ANALYSIS</topic><topic>SYNTAX</topic><topic>WORDS(LANGUAGE)</topic><toplevel>online_resources</toplevel><creatorcontrib>Daume, III, Hal</creatorcontrib><creatorcontrib>Marcu, Daniel</creatorcontrib><creatorcontrib>UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY INFORMATION SCIENCES INST</creatorcontrib><collection>DTIC Technical Reports</collection><collection>DTIC STINET</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Daume, III, 
Hal</au><au>Marcu, Daniel</au><aucorp>UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY INFORMATION SCIENCES INST</aucorp><format>book</format><genre>unknown</genre><ristype>RPRT</ristype><btitle>A Noisy-Channel Model for Document Compression</btitle><date>2002-01</date><risdate>2002</risdate><abstract>We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hierarchical model of text production in order to drop non-important syntactic and discourse constituents so as to generate coherent, grammatical document compressions of arbitrary length. The system outperforms both a baseline and a sentence-based compression system that operates by simplifying sequentially all sentences in a text. Our results support the claim that discourse knowledge plays an important role in document summarization.</abstract><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | |
ispartof | |
issn | |
language | eng |
recordid | cdi_dtic_stinet_ADA459360 |
source | DTIC Technical Reports |
subjects | COMPRESSION ; DOCUMENTS ; GRAMMARS ; HIERARCHIES ; Linguistics ; MATHEMATICAL MODELS ; SIMPLIFICATION ; STATISTICAL ANALYSIS ; SYNTAX ; WORDS(LANGUAGE) |
title | A Noisy-Channel Model for Document Compression |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A11%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-dtic_1RU&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.btitle=A%20Noisy-Channel%20Model%20for%20Document%20Compression&rft.au=Daume,%20III,%20Hal&rft.aucorp=UNIVERSITY%20OF%20SOUTHERN%20CALIFORNIA%20MARINA%20DEL%20REY%20INFORMATION%20SCIENCES%20INST&rft.date=2002-01&rft_id=info:doi/&rft_dat=%3Cdtic_1RU%3EADA459360%3C/dtic_1RU%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |