Authorship attribution based on a probabilistic topic model

Author(s)
Savoy, Jacques 
Institut d'informatique 
Publication date
2013
In
Information Processing and Management
Vol.
49
No.
1
From page
341
To page
354
Keywords
  • Authorship attribution
  • Text categorization
  • Machine learning
  • Lexical statistics

Abstract
This paper describes, evaluates and compares the use of Latent Dirichlet allocation (LDA) as an approach to authorship attribution. Based on this generative probabilistic topic model, we can model each document as a mixture of topic distributions, with each topic specifying a distribution over words. Based on author profiles (the aggregation of all texts written by the same writer), we suggest computing the distance to a disputed text to determine its possible writer. This distance is based on the difference between the two topic distributions. To evaluate different attribution schemes, we carried out an experiment based on 5408 newspaper articles (Glasgow Herald) written by 20 distinct authors. To complement this experiment, we used 4326 articles extracted from the Italian newspaper La Stampa and written by 20 journalists. This research demonstrates that the LDA-based classification scheme tends to outperform the Delta rule and the χ² distance, two classical approaches in authorship attribution based on a restricted number of terms. Compared to the Kullback–Leibler divergence, the LDA-based scheme can provide better effectiveness when considering a larger number of terms.
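The attribution step described in the abstract (compare the topic distribution of a disputed text with that of each author profile, then pick the closest profile) can be illustrated with a minimal sketch. The abstract does not say which distance is used, so the L1 distance here is an assumption, as are the function names `l1_distance` and `attribute`. The topic distributions are made-up numbers, not results from the paper, and the LDA fitting step that would produce them is left out.

```python
def l1_distance(p, q):
    """Sum of absolute differences between two topic distributions.

    Assumption: the abstract only says the distance is based on the
    difference between the two distributions; L1 is one possible choice.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    return sum(abs(a - b) for a, b in zip(p, q))


def attribute(disputed, profiles):
    """Return the author whose profile is closest to the disputed text."""
    return min(profiles, key=lambda author: l1_distance(disputed, profiles[author]))


# Hypothetical topic distributions over 3 topics (illustrative values only).
profiles = {
    "author_a": [0.70, 0.20, 0.10],
    "author_b": [0.10, 0.30, 0.60],
}
disputed = [0.60, 0.30, 0.10]

print(attribute(disputed, profiles))
```

In practice the distributions would come from an LDA model fitted on the corpus, with each author profile built from all texts by that author.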
Identifiers
https://libra.unine.ch/handle/123456789/9564
10.1016/j.ipm.2012.06.003
Publication type
journal article
File(s) to download
 main article: Savoy_Jacques-Authorship_attribution_based_on_a_probabilistic_topic_model-20130109.pdf (761.65 KB)