An Evolutionary-Based Method for Reconstructing Conversation Threads in Email Corpora


Bibliographic Details
Main authors: Dehghani, M., Asadpour, M., Shakery, A.
Format: Conference proceeding
Language: English
Description
Summary: Email is a type of Web data that is produced in enormous quantities. Detecting the conversation threads contained in an email corpus benefits various applications, including discussion search, expert finding, and email clustering and classification. A conversation thread in an email corpus can be defined as a cluster of emails on the same topic exchanged among the same group of people by reply or forwarding. Under this definition, a parent-child relation can be defined between emails, so email conversation threads have a tree structure. This paper presents a new approach based on genetic programming for reconstructing conversation threads in email data. The approach treats finding email conversation threads as an optimization problem and uses genetic programming to search intelligently in the space of possible solutions. Unlike several previous studies of this problem, this work concentrates on detecting the accurate structure of conversation threads with high recall. The paper provides a comprehensive evaluation on the BC3 data set. Preliminary results suggest that the method provides acceptable precision and higher recall than existing methods.
DOI:10.1109/ASONAM.2012.195
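
The summary describes threads as trees built from parent-child relations between emails and reports precision and recall. The following is a minimal sketch of how such a tree could be represented and scored, not the authors' method: the parent-map representation, the function names, the message ids, and the choice to score parent-child edges are all assumptions made for illustration, and no genetic programming is implemented here.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical representation (an assumption, not taken from the paper):
# each email id maps to the id of its parent, or None for a thread root.
ParentMap = Dict[str, Optional[str]]


def edges(parents: ParentMap) -> Set[Tuple[str, str]]:
    """Return the set of (parent, child) edges in a thread forest."""
    return {(p, c) for c, p in parents.items() if p is not None}


def edge_precision_recall(predicted: ParentMap, gold: ParentMap) -> Tuple[float, float]:
    """Score predicted parent-child edges against reference edges."""
    pred, true = edges(predicted), edges(gold)
    correct = len(pred & true)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(true) if true else 0.0
    return precision, recall


# Illustrative data: m3 actually replies to m1, but is predicted to reply to m2.
gold = {"m1": None, "m2": "m1", "m3": "m1", "m4": "m3"}
predicted = {"m1": None, "m2": "m1", "m3": "m2", "m4": "m3"}

precision, recall = edge_precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In an optimization setting like the one the summary outlines, a score of this kind could only be used for evaluation against labeled threads; the fitness function the search itself uses is not specified in the record.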