Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | In recent years, the increasing use of Artificial Intelligence based text
generation tools has posed new challenges in document provenance,
authentication, and authorship detection. However, advancements in stylometry
have provided opportunities for automatic authorship and author change
detection in multi-authored documents using style analysis techniques. Style
analysis can serve as a primary step toward document provenance and
authentication through authorship detection. This paper investigates three key
tasks of style analysis: (i) classification of single and multi-authored
documents, (ii) single change detection, which involves identifying the point
where the author switches, and (iii) multiple author-switching detection in
multi-authored documents. We formulate all three tasks as classification
problems and propose a merit-based fusion framework that integrates several
state-of-the-art natural language processing (NLP) algorithms and weight
optimization techniques. We also explore the impact of special characters,
which are typically removed during pre-processing in NLP applications, on the
performance of the proposed methods for these tasks by conducting extensive
experiments on both cleaned and raw datasets. Experimental results demonstrate
significant improvements over existing solutions for all three tasks on a
benchmark dataset. |
---|---|
DOI: | 10.48550/arxiv.2401.06752 |
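
The summary describes a merit-based fusion of several classifiers but does not spell out the weighting scheme. The following is only a minimal sketch of one plausible reading, a weighted late fusion of class probabilities, and not the paper's method. It assumes that each model's "merit" is its validation score and that weights are obtained by simple normalization, which stands in for the weight optimization techniques the summary mentions. The model names `model_a` and `model_b`, the scores, the probabilities, and the label convention (0 = single-authored, 1 = multi-authored) are all hypothetical.

```python
from typing import Dict, List, Sequence


def merit_weights(val_scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-model validation scores into fusion weights that sum to 1.

    Assumption: merit is a single validation score per model; plain
    normalization is used here in place of any learned weight optimization.
    """
    total = sum(val_scores.values())
    if total <= 0:
        raise ValueError("at least one model needs a positive validation score")
    return {name: score / total for name, score in val_scores.items()}


def fuse(
    probs: Dict[str, Sequence[Sequence[float]]],
    weights: Dict[str, float],
) -> List[List[float]]:
    """Weighted average of the class-probability rows produced by each model."""
    first = next(iter(probs.values()))
    n_samples, n_classes = len(first), len(first[0])
    fused = [[0.0] * n_classes for _ in range(n_samples)]
    for name, rows in probs.items():
        w = weights[name]
        for i, row in enumerate(rows):
            for c, p in enumerate(row):
                fused[i][c] += w * p
    return fused


def predict(fused: Sequence[Sequence[float]]) -> List[int]:
    """Pick the class with the highest fused probability for each document."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Hypothetical validation scores and per-document class probabilities.
val_scores = {"model_a": 0.8, "model_b": 0.6}
probs = {
    "model_a": [[0.9, 0.1], [0.4, 0.6]],
    "model_b": [[0.3, 0.7], [0.2, 0.8]],
}

weights = merit_weights(val_scores)
labels = predict(fuse(probs, weights))
```

In this toy input the two models disagree on the first document, and the model with the higher validation score carries the fused decision. The same fusion step could in principle serve all three tasks, since the summary frames each of them as a classification problem.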