Towards Story Understanding and Search - Web Mining Methods and Tools for Exploration, Search and Discovery (Web mining methoden en applicaties voor exploratie, zoeken en ontdekking voor het verstaan van en zoeken in verhaallijnen)

Bibliographic Details
Main Author: Subasic, Ilija
Format: Dissertation
Language: Dutch
Description
Summary: Over the past decade, the Internet has become one of the leading sources of news content, and for many people the news services available online have become the main medium for staying informed about the world. Such services support Internet users in interacting with stories. In this thesis, we regard a story as a set of time-stamped documents describing correlated subjects, such as persons, event descriptions, and topics. Our particular interest is the time dimension of stories, and in particular story tracking: following a story over time. The goal of the different research areas interested in story tracking is to identify and highlight developments, that is, novel and relevant information in a story. In this work we restrict ourselves to news collections and investigate the effectiveness and usability of temporal text mining (TTM) story tracking methods.
Across the thesis we investigate four areas related to stories: (a) stories and search engines, (b) story tracking methods and tools, (c) story tracking evaluation frameworks, and (d) stories and sources. We formalize these four thematic areas into more concrete research questions: (Q1) How are search engines affected by story developments? (Q2) Does the semi-automatic story tracking approach we developed enable user comprehension and navigation of stories? (Q3) Can the graph-based patterns extracted by our algorithm be used for story tracking? (Q4) How can different bursty text patterns be used for discovering the origins of changes in document sets? (Q5) How do users interact with interfaces for story tracking? (Q6) How can differences in a story across different sources be measured?
We start by exploring how search engine users change their behaviour when new developments emerge in a story. For this we investigate a one-year query log from a leading commercial search engine and describe the changes in user behaviour correlated with the emergence of new developments. We then explore story tracking methods and tools as a means of accommodating these changes in user behaviour. We propose a new graph-based story tracking method and build a tool to support it. Additionally, we investigate the effectiveness of story tracking methods and define a new framework for automatic and user-oriented evaluation. Although many TTM methods have been developed, a common evaluation procedure is lacking. We propose an evaluation framework