Vol. 35 No. 3 (2023)
Articles

Visualization of textual data: a complement to authorship attribution

Ludovic Lebart
Centre National de la Recherche Scientifique (CNRS), Paris

Published 2024-07-26

Keywords

  • Textual data visualization
  • Authorship attribution
  • Additive trees
  • Correspondence Analysis (CA)

How to Cite

Lebart, L. (2024). Visualization of textual data: a complement to authorship attribution. Italian Journal of Applied Statistics, 35(3), 359–370. https://doi.org/10.26398/IJAS.0035-016

Abstract

In textual data analysis, authorship attribution is a leading example of a statistical decision problem. Analyzing a large corpus of 50 French novels of the 20th century, we investigate the boundary between descriptive (unsupervised) methods and confirmatory (supervised) methods. We show that Additive Trees applied to the coordinates of a preliminary Correspondence Analysis (CA) can provide both a description and a decision. Our results aim to show the complementarity between exploratory techniques and artificial intelligence (AI) in this field.
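
As a rough illustration of the pipeline named in the abstract (CA coordinates first, then a tree built on distances between those coordinates), here is a minimal sketch. It rests on several assumptions: the abstract does not say which additive-tree algorithm is used, so neighbor joining is swapped in as one common way to fit an additive tree; the function names `ca_row_coordinates` and `neighbor_joining` are hypothetical; and the small count table is invented toy data, not the corpus of 50 novels.

```python
import numpy as np


def ca_row_coordinates(counts, n_axes=2):
    """Principal row coordinates of a Correspondence Analysis (via SVD)."""
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses (texts)
    c = P.sum(axis=0)                    # column masses (words)
    # Standardized residuals from the independence model r * c.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, _ = np.linalg.svd(S, full_matrices=False)
    return (U[:, :n_axes] * sv[:n_axes]) / np.sqrt(r)[:, None]


def neighbor_joining(dist, labels):
    """Neighbor joining (assumed stand-in for the additive-tree step).

    Returns the successive joins as (name_a, name_b, branch_a, branch_b).
    """
    n = len(labels)
    D = {i: {j: float(dist[i][j]) for j in range(n) if j != i}
         for i in range(n)}
    names = {i: labels[i] for i in range(n)}
    next_id = n
    joins = []
    while len(D) > 2:
        m = len(D)
        total = {i: sum(D[i].values()) for i in D}
        # Choose the pair minimizing the Q criterion.
        best = None
        for i in D:
            for j in D[i]:
                if i < j:
                    q = (m - 2) * D[i][j] - total[i] - total[j]
                    if best is None or q < best[0]:
                        best = (q, i, j)
        _, i, j = best
        dij = D[i][j]
        # Branch lengths from the new internal node to i and to j.
        li = 0.5 * dij + (total[i] - total[j]) / (2 * (m - 2))
        lj = dij - li
        joins.append((names[i], names[j], li, lj))
        new = next_id
        next_id += 1
        names[new] = "(" + names[i] + "," + names[j] + ")"
        # Distances from the new node to every remaining node.
        new_row = {k: 0.5 * (D[i][k] + D[j][k] - dij)
                   for k in D if k not in (i, j)}
        del D[i], D[j]
        for k in D:
            D[k].pop(i, None)
            D[k].pop(j, None)
            D[k][new] = new_row[k]
        D[new] = new_row
    return joins


# Toy table (invented): rows are texts, columns are word counts.
labels = ["A1", "A2", "B1", "B2"]
counts = [[30, 5, 20, 2],
          [28, 6, 22, 3],
          [4, 25, 3, 30],
          [5, 27, 2, 28]]

coords = ca_row_coordinates(counts, n_axes=2)
# Euclidean distances between texts in the retained CA plane.
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
joins = neighbor_joining(dist, labels)
for a, b, la, lb in joins:
    print(a, b, round(la, 3), round(lb, 3))
```

In this sketch the CA coordinates supply the description (a low-dimensional map of the texts), while the position of an unattributed text in the resulting tree could be read as a tentative attribution. How the decision is actually drawn from the tree is not specified in the abstract.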