Published 2024-07-26
Keywords
- Sparsity
- Generalized singular value decomposition
- Correspondence analysis
- LASSO
- Penalized matrix decomposition
Copyright (c) 2024 Hervé Abdi, Vincent Guillemot, Ruiping Liu, Ndeye Niang, Gilbert Saporta, Ju-Chi Yu
This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Correspondence Analysis (CA) is the method of choice for analyzing contingency matrices and is widely applied in text analysis, psychometrics, chemometrics, etc. However, CA becomes difficult to interpret when the number of rows or columns is large, a configuration routinely found in contemporary statistical practice. For principal component analysis (PCA), this interpretation problem has traditionally been handled with rotation and, more recently, with sparsification methods such as the LASSO. Curiously, despite the strong connections between CA and PCA, sparsifying correspondence analysis remains essentially unexplored. In this paper, we derive an extension of the Penalized Matrix Decomposition (a method based on the singular value decomposition) to sparsify CA. We present some theoretical results and properties of the resulting sparse correspondence analysis and illustrate this new method with an analysis of the causes of death in the United States in 2019.
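
For readers unfamiliar with the link between CA and the singular value decomposition, the following is a minimal sketch of ordinary (non-sparse) CA computed through the SVD of the standardized residuals of a contingency table. It is not the sparse method derived in the paper: the function name `correspondence_analysis` and the toy table are hypothetical and serve only as an illustration of the standard textbook formulation.

```python
import numpy as np

def correspondence_analysis(table):
    """Plain (non-sparse) CA of a contingency table via the SVD.

    Illustrative sketch only; the name and interface are assumptions.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()              # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)            # row masses
    c = P.sum(axis=0)            # column masses
    # Standardized residuals: D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates (factor scores) for rows and columns
    F = (U / np.sqrt(r)[:, None]) * sv
    G = (Vt.T / np.sqrt(c)[:, None]) * sv
    return F, G, sv

# Hypothetical 3 x 3 table of counts, for illustration only
toy = np.array([[20, 5, 2],
                [6, 15, 4],
                [3, 7, 18]])
F, G, sv = correspondence_analysis(toy)
inertia = float(np.sum(sv ** 2))  # total inertia = chi-square statistic / n
```

With many rows or columns, the factor scores in `F` and `G` are generally all nonzero, which is the interpretation difficulty that sparsification aims to reduce.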