Saarland University, Machine Learning Group, Fak. MI - Mathematik und Informatik, Campus E1 1, 66123 Saarbrücken, Germany

# INTRINSIC DIMENSIONALITY ESTIMATION

by Matthias Hein and Jean-Yves Audibert

On this website you can download the code and datasets for intrinsic dimensionality estimation as described in the paper listed under References. The code computes the new estimator together with two classical estimates: the correlation dimension and the Takens estimator.
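As a rough illustration of the two classical estimates, the following C++ sketch computes a two-scale correlation dimension and the Takens estimator from pairwise Euclidean distances. It is not the code from the zip file: the function names, the `Point` alias and the radii `r1`, `r2` and `r0` are assumptions made for this example, and the new estimator from the paper is not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type for this sketch: one coordinate vector per sample.
using Point = std::vector<double>;

// Euclidean distance between two points of equal dimension.
double euclidean_distance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        const double diff = a[k] - b[k];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Correlation integral C(r): fraction of point pairs closer than r.
double correlation_integral(const std::vector<Point>& x, double r) {
    std::size_t close = 0, pairs = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t j = i + 1; j < x.size(); ++j) {
            ++pairs;
            if (euclidean_distance(x[i], x[j]) < r) ++close;
        }
    }
    return pairs > 0 ? static_cast<double>(close) / pairs : 0.0;
}

// Two-scale correlation dimension: slope of log C(r) against log r
// between the radii r1 < r2 (both chosen by the caller).
double correlation_dimension(const std::vector<Point>& x, double r1, double r2) {
    assert(r1 > 0.0 && r2 > r1);
    const double c1 = correlation_integral(x, r1);
    const double c2 = correlation_integral(x, r2);
    assert(c1 > 0.0);  // at least one pair must fall below the smaller radius
    return std::log(c2 / c1) / std::log(r2 / r1);
}

// Takens estimator: -1 / mean(log(d_ij / r0)) over all pairs with 0 < d_ij < r0.
double takens_estimator(const std::vector<Point>& x, double r0) {
    assert(r0 > 0.0);
    double log_sum = 0.0;
    std::size_t n = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t j = i + 1; j < x.size(); ++j) {
            const double d = euclidean_distance(x[i], x[j]);
            if (d > 0.0 && d < r0) {
                log_sum += std::log(d / r0);
                ++n;
            }
        }
    }
    assert(n > 0);  // the estimator is undefined without pairs below r0
    return -static_cast<double>(n) / log_sum;
}
```

For points spaced evenly along a line segment, both functions return values close to 1. The double loop needs a number of distance evaluations that grows quadratically with the sample size, which is acceptable for samples of a few hundred to a few thousand points.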

## CODE

All files are included in a single zip file. When unzipping, make sure that the folder structure is preserved.

- `GetDim.cpp`: Plain C++ code for dimensionality estimation. There are two options: either the data is read from a file or it is generated. For a detailed description see the ReadMe file.
- `matlab\GetDim.cpp`: MEX file for use in MATLAB. The argument is simply the data matrix (sparse matrices are allowed). The output is the three estimates of the intrinsic dimension. For a detailed description see the ReadMe file.
- `matlab\GenerateManifoldData.cpp`: MEX file for generating manifold data as used in the paper (and more). For a detailed description see the ReadMe file.
- `matlab\Dimension_Exp.m`: MATLAB function for repeating the experiments reported in the paper. This requires the datasets Sinusoid, Sphere, Gauss, Moebius and M12. For a detailed description see the ReadMe file.

## DATASETS

We provide the data exactly as it was used in the paper, so that other methods can be compared on identical samples.

| Dataset | Contents | Download |
| --- | --- | --- |
| Sinusoid | 90 runs of 400, 500 and 600 points. | download:sinusoid |
| Sphere | 90 runs of 600, 800, 1000 and 1200 points in 4, 6, 8 and 10 dimensions. | download:sphere |
| Gaussian | 90 runs of 100, 200, 400 and 800 points in 3, 4, 5 and 6 dimensions. | download:gauss |
| Moebius | 90 runs of 20, 40, 80 and 120 points. | download:moebius |
| M12 | 90 runs of 200, 400, 800 and 1600 points. | download:m12 |

## NOTE

In the paper, the results for the 12-dimensional submanifold of R^72 for 800 points were accidentally exchanged with the ones for 600 points. The correct results for 800 points are 71 - 74 - 77.

## REFERENCES

M. Hein and J.-Y. Audibert, Intrinsic dimensionality estimation of submanifolds in Euclidean space, Proceedings of the 22nd International Conference on Machine Learning (ICML), 289--296. (Eds.) L. de Raedt and S. Wrobel (2005). download:paper