Vol. 35 No. 3 (2023)
Articles

Multilingual textual data: an approach through multiple factor analysis

Belchin Kostov
Department of Statistics and Operational Research, Universitat Politècnica de Catalunya, C/ Jordi Girona 1-3, 08034 Barcelona
Ramón Alvarez-Esteban
Department of Economics and Statistics, Universidad de León, Campus de Vegazana s/n, 24071 León
Mónica Bécue-Bertaut
Department of Statistics and Operational Research, Universitat Politècnica de Catalunya, C/ Jordi Girona 1-3, 08034 Barcelona
François Husson
Institut Agro, Univ Rennes 1, CNRS, IRMAR, 35000, Rennes

Published 2024-07-26

Keywords

  • Correspondence analysis
  • Lexical tables
  • Textual and contextual data
  • Multiple factor analysis
  • Generalised aggregated lexical table

How to Cite

Kostov, B., Alvarez-Esteban, R., Bécue-Bertaut, M., & Husson, F. (2024). Multilingual textual data: an approach through multiple factor analysis. Italian Journal of Applied Statistics, 35(3), 339–357. https://doi.org/10.26398/IJAS.0035-015

Abstract

This paper focuses on the analysis of open-ended questions answered in different languages. Closed-ended questions, called contextual variables, are asked of all respondents in order to understand the relationships between open-ended and closed-ended responses across samples, as the latter are likely to influence word choice. We have developed "Multiple Factor Analysis on Generalised Aggregated Lexical Tables" (MFA-GALT) to jointly examine open-ended responses in different languages through the relationships between word choice and the variables that drive that choice. MFA-GALT investigates whether the variability between words is structured in the same way as the variability between variables, and vice versa, from one sample to another. An application to an international satisfaction survey shows that the method yields easy-to-interpret results.