Saarland University, Machine Learning Group, Fak. MI - Mathematik und Informatik, Campus E1 1, 66123 Saarbrücken, Germany     

Machine Learning Group
Department of Mathematics and Computer Science - Saarland University


by Anastasia Podosinnikova, Simon Setzer and Matthias Hein


Principal Component Analysis (PCA), a standard tool for feature selection and dimensionality reduction in data analysis, is highly sensitive to outliers: even a single outlier can change the principal components (PCs) drastically. This sensitivity motivates the development of robust PCA methods that recover the PCs of the uncontaminated data.
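The effect of a single outlier can be illustrated with a small Python sketch. It is not part of the TRPCA code: the data set, the function name first_pc_angle_deg and the outlier position are hypothetical choices made for this illustration. For 2-D data the direction of the first PC has a closed form in terms of the covariance entries, so only the standard library is needed.

```python
import math

def first_pc_angle_deg(points):
    """Angle in degrees between the first principal axis of 2-D points and the x-axis."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Closed-form direction of the leading eigenvector of a symmetric 2x2 matrix.
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))

# Hypothetical data: five points close to the x-axis ...
clean = [(-2.0, 0.1), (-1.0, -0.1), (0.0, 0.0), (1.0, 0.1), (2.0, -0.1)]
# ... and the same points plus one outlier far away along the y-axis.
contaminated = clean + [(0.0, 30.0)]

angle_clean = first_pc_angle_deg(clean)
angle_contaminated = first_pc_angle_deg(contaminated)
print(angle_clean, angle_contaminated)
```

On the clean points the first PC is nearly aligned with the x-axis. After the single outlier is added, it is nearly aligned with the y-axis.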

It is well known that finding the first k standard PCs can be equivalently formulated as finding the k-dimensional subspace of maximum variance in the data or the k-dimensional affine subspace with minimal reconstruction error. For robust PCA this equivalence no longer holds: formulations based on maximizing a robust estimator of the variance and formulations based on minimizing a robust estimator of the reconstruction error are not equivalent. While the former approach has been studied extensively in the literature, the latter has received little attention. In [1], we propose a new algorithm for robust PCA, called TRPCA, which finds a robust center and robust PCs of the data by minimizing a robust version of the reconstruction error over the Stiefel manifold, that is

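\[
\min_{m \in \mathbb{R}^d,\; U \in \mathbb{R}^{d \times k},\; U^\top U = I} \; \frac{1}{t} \sum_{i=1}^{t} r_{(i)}(m, U), \qquad r_i(m, U) = \left\| (I - U U^\top)(x_i - m) \right\|_2^2,
\]

where x_1, ..., x_n denote the data points in R^d, the reconstruction errors are sorted in non-decreasing order, i.e. $r_{(1)}(m, U) \le \dots \le r_{(n)}(m, U)$, and t is a lower bound on the number of uncontaminated data points, so that the n - t largest errors are trimmed; the default value of t is given in [1]. Among the advantages of our algorithm are its fast running time and the absence of parameters that are non-trivial to tune. Moreover, optimization over the Stiefel manifold allows TRPCA to avoid the deflation procedure, which often leads to significant errors.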
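As a rough illustration of the objective only (not of the TRPCA optimization itself, and not the authors' MATLAB implementation), the following Python sketch evaluates a trimmed reconstruction error for a fixed center and a fixed orthonormal basis. The function names, the toy data and the normalization by t are assumptions made for this example.

```python
def reconstruction_errors(points, center, basis):
    """Squared distance of each point to the affine subspace center + span(basis).

    Assumes `basis` is a list of orthonormal vectors (the columns of U).
    """
    errors = []
    for x in points:
        diff = [xi - mi for xi, mi in zip(x, center)]
        residual = diff[:]
        for u in basis:
            # Remove the component of the centered point along basis vector u.
            coef = sum(ui * di for ui, di in zip(u, diff))
            residual = [ri - coef * ui for ri, ui in zip(residual, u)]
        errors.append(sum(ri * ri for ri in residual))
    return errors

def trimmed_reconstruction_error(points, center, basis, t):
    """Average of the t smallest reconstruction errors (normalization assumed)."""
    errors = sorted(reconstruction_errors(points, center, basis))
    return sum(errors[:t]) / t

# Hypothetical data: five points on the x-axis and one outlier at (0, 10).
points = [(-2.0, 0.0), (-1.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 10.0)]
center = (0.0, 0.0)
basis = [(1.0, 0.0)]  # one-dimensional subspace along the x-axis

trimmed = trimmed_reconstruction_error(points, center, basis, t=5)
untrimmed = trimmed_reconstruction_error(points, center, basis, t=len(points))
print(trimmed, untrimmed)
```

With t = 5 the outlier's error is trimmed away and the x-axis fits the remaining points exactly. With t equal to the number of points the outlier's error is included, so the same subspace is penalized.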



The TRPCA algorithm for robust PCA was developed by Anastasia Podosinnikova, Simon Setzer and Matthias Hein, Department of Computer Science, Saarland University, Germany. The code for TRPCA is published as free software under the terms of the GNU GPL v3.0. Please include a reference to the paper [1] and retain the original documentation and copyright notice.

Download trpca.m (MATLAB code, version 1.0)


[1] A. Podosinnikova, S. Setzer and M. Hein
Robust PCA: Optimization of the Robust Reconstruction Error on the Stiefel Manifold
Accepted at GCPR 2014. PDF (Supplementary material: PDF)