Saarland University, Machine Learning Group, Fak. MI - Mathematik und Informatik, Campus E1 1, 66123 Saarbrücken, Germany     

Machine Learning Group
Department of Mathematics and Computer Science - Saarland University

TEACHING

MACHINE LEARNING

Winter semester 2012/2013

GENERAL INFORMATION

From a broader perspective, machine learning tries to automate the process of the empirical sciences, namely extracting knowledge about natural phenomena from measured data, with the goal of either better understanding the underlying processes or making good predictions. Machine learning methods are therefore widely used in many fields: bioinformatics, computer vision, information retrieval, computational linguistics, robotics, ...

The lecture gives a broad introduction to machine learning methods. After the lecture, students should be able to solve and analyze learning problems.

List of topics (tentative)

  • Reminder of probability theory
  • Maximum Likelihood/Maximum A Posteriori Estimators
  • Bayesian decision theory
  • Linear classification and regression
  • Kernel methods
  • Model selection and evaluation of learning methods
  • Feature selection
  • Nonparametric methods
  • Boosting, Decision trees
  • Neural networks
  • Structured Output
  • Semi-supervised learning
  • Unsupervised learning (Clustering, Independent Component Analysis)
  • Dimensionality Reduction and Manifold Learning
  • Statistical learning theory

Previous knowledge of machine learning is not required. Participants should be familiar with linear algebra, analysis and probability theory at the level of the local 'Mathematics for Computer Scientists I-III' lectures. In particular, attendees should be familiar with

Type: Core lecture (Stammvorlesung), 9 credit points

LECTURE MATERIAL

Lecture notes: PDF. Printing them is not recommended, as the notes will be updated over the semester.

The practical exercises will be in Matlab.

SLIDES AND EXERCISES

17.10. - Introduction Exercise 1 Solution 1
22.10. - Revision: Probability Theory
24.10. - Revision: Probability Theory (2) Exercise 2 Solution 2
29.10. - Bayesian Decision Theory Matlab Decision Boundary Demo
31.10. - Empirical risk minimization Exercise 3 Solution 3
05.11. - Linear Regression
07.11. - Linear Regression (2) Exercise 4 Solution 4 Data
12.11. - Introduction to Optimization
14.11. - Optimization (cont) Exercise 5 Solution 5 Data
19.11. - Linear Classification
21.11. - Linear Classification Exercise 6 Solution 6 Data
26.11. - Kernel Methods
28.11. - Lecture canceled Exercise 7 Solution 7 Data
03.12. - Kernel Methods II
05.12. - Kernel Methods III Exercise 8 Solution 8
10.12. - Evaluation, ROC-Curve, AUC
12.12. - Tests, Model selection Exercise 9 Solution 9 Data
17.12. - Lecture canceled
19.12. - Tests, Model selection continued
07.01. - Feature selection I
09.01. - Feature selection II Exercise 10 Solution 10 Data
14.01. - Boosting
16.01. - Boosting (cont.) Exercise 11 Solution 11 Data
21.01. - Decision Trees, Neural Networks and Nearest Neighbor Methods
23.01. - Semi-supervised Learning
28.01. - SSL (continued)
30.01. - K-Means and Spectral Clustering
04.02. - Hierarchical Clustering
04.02. - Dimensionality Reduction

LITERATURE AND OTHER RESOURCES

The lecture will be partially based on the following books and partially on recent research papers:

  • R.O. Duda, P.E. Hart, and D.G. Stork: Pattern Classification, Wiley, (2000).
  • B. Schoelkopf and A. J. Smola: Learning with Kernels, MIT Press, (2002).
  • J. Shawe-Taylor and N. Cristianini: Kernel Methods for Pattern Analysis, Cambridge University Press, (2004).
  • C. M. Bishop: Pattern Recognition and Machine Learning, Springer, (2006).
  • T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning, Springer, second edition, (2008).
  • L. Devroye, L. Gyoerfi, G. Lugosi: A Probabilistic Theory of Pattern Recognition, Springer, (1996).
  • L. Wasserman: All of Statistics, Springer, (2004).
  • S. Boyd and L. Vandenberghe: Convex Optimization, Cambridge University Press, (2004).

Other resources:

NEWS

Exam Inspection: Monday, 15.4.2013 from 15.00 to 17.00 in E1 1, Room 222.2.

Results of the re-exam and final results: can be found here.

HISPOS registration for mathematics students is now possible!

The results of the competition in exercise 9 can be found here.

Google Group for the Lecture: We have set up a Google group for the lecture. The idea is that discussions, comments and corrections reach all of you more quickly. You can post either directly on the web (Google account required) or via email.

New date for end-term exam: Due to a clash with other lectures, the end-term exam has been moved to 19.2., 14-17.

Download CVX: Link to the CVX Matlab toolbox - here

Correction of Ex. Sheet 1: The expression for the Hessian in 2b) has been corrected.

Linear Algebra Tutorial: A quick reminder of the basic ideas of linear algebra can be found in the tutorial by Mark Schmidt (I did not check it for correctness!). Apart from the LU factorization, it summarizes informally everything that is used in the lecture.

Official registration in HISPOS: You have to register officially for the course here by 4.11.2012. You can deregister up to two weeks before the final exam.

Exercises: The exercise sheets (handwritten) have to be submitted each Wednesday before the lecture. You are allowed to submit in groups of up to three students, who must all belong to the same exercise group. Write the label of your exercise group (A, B, C) together with the names and matriculation numbers of all members of your group on the first sheet.

TIME AND LOCATION

Lecture: Mo, 10-12, and We, 10-12, E1 3, HS II

Exercise groups:

  • Group A, Tue, 14-16 - SR U12, E1 1, Sahely Bhadra
  • Group B, Th, 12-14 - SR 016, E1 3, Anastasia Podosinnikova
  • Group C, Fr, 14-16 - SR 015, E1 3, Pavel Kolev

EXAMS AND GRADING

Exams: End-term: tba, Re-exam: tba

Grading:

  • 50% of the points in the exercises (up to that point) are needed to take part in the exams (end-term/re-exam). To be admitted to the end-term exam and the re-exam, you must also have properly presented a solution at least once in the exercise groups.
  • An exam is passed if you get at least 50% of the points.
  • The grading is based on the better result of the end-term exam and the re-exam.

LECTURER

Prof. Dr. Matthias Hein

Office Hours: Mo, 16-18, Th, 16-18

Organization: Thomas Buehler, tb@cs.uni-sb.de