Saarland University, Machine Learning Group, Fak. MI - Mathematik und Informatik, Campus E1 1, 66123 Saarbrücken, Germany

Machine Learning Group
Department of Mathematics and Computer Science - Saarland University

# TEACHING

## MACHINE LEARNING

Winter semester 2012/2013

### GENERAL INFORMATION

In a broader perspective, machine learning tries to automate the process of the empirical sciences, namely extracting knowledge about natural phenomena from measured data, with the goal of either better understanding the underlying processes or making good predictions. Machine learning methods are therefore widely used in different fields: bioinformatics, computer vision, information retrieval, computational linguistics, robotics, ...

The lecture gives a broad introduction to machine learning methods. After the lecture, students should be able to solve and analyze learning problems.

List of topics (tentative)

• Reminder of probability theory
• Maximum Likelihood/Maximum A Posteriori Estimators
• Bayesian decision theory
• Linear classification and regression
• Kernel methods
• Model selection and evaluation of learning methods
• Feature selection
• Nonparametric methods
• Boosting, Decision trees
• Neural networks
• Structured Output
• Semi-supervised learning
• Unsupervised learning (Clustering, Independent Component Analysis)
• Dimensionality Reduction and Manifold Learning
• Statistical learning theory

Previous knowledge of machine learning is not required. The participants should be familiar with linear algebra, analysis and probability theory on the level of the local 'Mathematics for Computer Scientists I-III' lectures. In particular, attendees should be familiar with

Type: Core lecture (Stammvorlesung), 9 credit points

### LECTURE MATERIAL

Lecture notes: PDF. It is not recommended to print them, as these notes will be updated over the semester.

The practical exercises will be in Matlab.

### SLIDES AND EXERCISES

| Date | Slides | Exercises |
|---|---|---|
| 17.10. | Introduction | Exercise 1, Solution 1 |
| 22.10. | Revision: Probability Theory | |
| 24.10. | Revision: Probability Theory (2) | Exercise 2, Solution 2 |
| 29.10. | Bayesian Decision Theory | Matlab Decision Boundary Demo |
| 31.10. | Empirical risk minimization | Exercise 3, Solution 3 |
| 05.11. | Linear Regression | |
| 07.11. | Linear Regression (2) | Exercise 4, Solution 4, Data |
| 12.11. | Introduction to Optimization | |
| 14.11. | Optimization (cont.) | Exercise 5, Solution 5, Data |
| 19.11. | Linear Classification | |
| 21.11. | Linear Classification | Exercise 6, Solution 6, Data |
| 26.11. | Kernel Methods | |
| 28.11. | Lecture canceled | Exercise 7, Solution 7, Data |
| 03.12. | Kernel Methods II | |
| 05.12. | Kernel Methods III | Exercise 8, Solution 8 |
| 10.12. | Evaluation, ROC-Curve, AUC | |
| 12.12. | Tests, Model selection | Exercise 9, Solution 9, Data |
| 17.12. | Lecture canceled | |
| 19.12. | Tests, Model selection continued | |
| 07.01. | Feature selection I | |
| 09.01. | Feature selection II | Exercise 10, Solution 10, Data |
| 14.01. | Boosting | |
| 16.01. | Boosting (cont.) | Exercise 11, Solution 11, Data |
| 21.01. | Decision Trees, Neural Networks and Nearest Neighbor Methods | |
| 23.01. | Semi-supervised Learning | |
| 28.01. | SSL (continued) | |
| 30.01. | K-Means and Spectral Clustering | |
| 04.02. | Hierarchical Clustering | |
| 04.02. | Dimensionality Reduction | |

### LITERATURE AND OTHER RESOURCES

The lecture will be partially based on the following books and partially on recent research papers:

• R.O. Duda, P.E. Hart, and D.G. Stork: Pattern Classification, Wiley, (2000).
• B. Schoelkopf and A. J. Smola: Learning with Kernels, MIT Press, (2002).
• J. Shawe-Taylor and N. Cristianini: Kernel Methods for Pattern Analysis, Cambridge University Press, (2004).
• C. M. Bishop: Pattern Recognition and Machine Learning, Springer, (2006).
• T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning, Springer, second edition, (2008).
• L. Devroye, L. Gyoerfi, G. Lugosi: A Probabilistic Theory of Pattern Recognition, Springer, (1996).
• L. Wasserman: All of Statistics, Springer, (2004).
• S. Boyd and L. Vandenberghe: Convex Optimization, Cambridge University Press, (2004).

Other resources:

### NEWS

Exam Inspection: Monday, 15.4.2013 from 15.00 to 17.00 in E1 1, Room 222.2.

Results of the re-exam and final results: can be found here

HISPOS registration for mathematics students is now possible!

The results of the competition in exercise 9 can be found here.

Google Group for the Lecture: We have set up a Google group for the lecture. The idea is that discussions and comments/corrections reach all of you more quickly. Posts can be made either directly on the web (Google account required) or via email.

New date for end-term exam: Due to a clash with other lectures, the end-term exam has been moved to 19.2., 14-17.

Correction of Ex. Sheet 1: The expression for the Hessian in 2b) has been corrected.

Linear Algebra Tutorial: A quick reminder of the basic ideas of linear algebra can be found in the tutorial by Mark Schmidt (I did not check it for correctness!). Apart from the LU factorization, it informally summarizes everything that is used in the lecture.

Official registration in HISPOS: You have to register officially for the course here by 4.11.2012. You can deregister until two weeks before the final exam.

Exercises: The exercise sheets (handwritten) have to be submitted each Wednesday before the lecture. You are allowed to submit in groups of up to three students, who must all belong to the same exercise group. Write the label of your exercise group (A, B, C) together with the names and matriculation numbers of all members of your group on the first sheet.

### TIME AND LOCATION

Lecture: Mo, 10-12, and We, 10-12, E1 3, HS II

Exercise groups:

• Group A, Tue, 14-16 - SR U12, E1 1, Sahely Bhadra
• Group B, Th, 12-14 - SR 016, E1 3, Anastasia Podosinnikova
• Group C, Fr, 14-16 - SR 015, E1 3, Pavel Kolev

Exams: End-term: tba, Re-exam: tba

• At least 50% of the points in the exercises (up to that point) are needed to take part in the exams (end-term/re-exam). To be admitted to the end-term and re-exam, you must also have properly presented a solution in the exercise groups at least once.
• An exam is passed if you get at least 50% of the points.
• The grading is based on the best result of the end-term and re-exam.

### LECTURER

Prof. Dr. Matthias Hein

Office Hours: Mo, 16-18, Th, 16-18

Organization: Thomas Buehler, tb@cs.uni-sb.de