Saarland University, Machine Learning Group, Fak. MI - Mathematik und Informatik, Campus E1 1, 66123 Saarbrücken, Germany     

Machine Learning Group
Department of Mathematics and Computer Science - Saarland University

TEACHING

MACHINE LEARNING

Winter semester 2016/2017

LECTURE MATERIAL

Lecture notes: PDF  (update: 09.01.2017). The notes are pretty stable, but new material might be added during the semester.

The practical exercises will be in Matlab.

The Google group of the lecture can be accessed HERE.

SLIDES AND EXERCISES

25.10. - Introduction Exercise 0 Solution 0
28.10. - Recap Probability Exercise 1 Solution 1
01.11. - Public Holiday
04.11. - Lecture canceled, no exercise this week
08.11. - Bayesian Decision Theory Matlab Decision Boundary Demo
10.11. - Bayesian Decision Theory (ctd) Exercise 2 Solution 2
15.11. - Emp. Risk Min/Maximum Likelihood    
18.11. - Linear Regression Exercise 3 Solution 3 Data/Material for Exercise 9
22.11. - Smooth Optimization    
25.11. - Smooth Optimization Exercise 4 Solution 4 Material for Problem 10
29.11. - Smooth Optimization Lasso derivation
02.12. - Linear Classification Exercise 5 Solution 5 Data for Problem 13
06.12. - Lecture canceled (NIPS)
09.12. - Lecture canceled (NIPS)
13.12. - Linear SVM/Kernels
16.12. - Kernel Methods Exercise 6 Solution 6 Data for Problem 15
03.01. - Evaluation, ROC-Curve      
06.01. - AUC, Statistical Tests Exercise 7 Solution 7 Data for Problem 16
10.01. - Confidence Intervals, Model selection      
13.01. - Feature selection I Exercise 8 Solution 8 Data for Problem 19/20
17.01. - Feature selection II      
20.01. - Boosting Exercise 9 Solution 9
24.01. - Decision Trees/Nonparametric Methods      
27.01. - Large Scale Learning Exercise 10 Solution 10
31.01. - Neural Networks aka Deep Learning      
03.02. - Semi-supervised Learning Exercise 11 (Last) Solution 11  
07.02. - K-Means and Spectral Clustering      
10.02. - Hierarchical Clustering      
14.02. - Dimensionality Reduction      
17.02. - Statistical Learning Theory      

TIME AND LOCATION

Lecture:

  • Tu, 16-18, HS 002, E1 3
  • Fr, 10-12, HS 002, E1 3

Exercise Groups:

  • Group A, Mo 14-16, SR 015, E1 3, Monday Group 1
  • Group B, Mo 14-16, SR 016, E1 3, Monday Group 2
  • Group C, We 14-16, SR 206, E1 1, Wednesday Group 1
  • Group D, We 14-16, SR U12, E1 1, Wednesday Group 2

  • Submitting copies of a previous year's solutions counts as plagiarism. The first time this happens, you receive zero points for the entire sheet; if it happens again, you are excluded from the course.

EXAMS AND GRADING

Exam: 3.3., 14.00-17.00, E2 2; Re-exam: 7.4., 14.00-17.00, E2 2

Grading:

  • 50% of the points in the exercises (up to that point) are needed to take part in the exams (end-term/re-exam). To be admitted to the end-term and re-exam, you also need to have properly presented a solution at least once in the exercise groups.
  • An exam is passed if you get at least 50% of the points.
  • The grading is based on the better result of the end-term and re-exam.

LECTURER

Prof. Dr. Matthias Hein

Office Hours: Mo, 16-18, and Th, 16-18

Organization: Antoine Gautier

GENERAL INFORMATION

From a broader perspective, machine learning tries to automate the process of the empirical sciences, namely extracting knowledge about natural phenomena from measured data, with the goal of either better understanding the underlying processes or making good predictions. Machine learning methods are therefore widely used in different fields: bioinformatics, computer vision, information retrieval, computational linguistics, robotics, and others.

The lecture gives a broad introduction to machine learning methods. After the lecture, students should be able to solve and analyze learning problems.

List of topics (tentative)

  • Reminder of probability theory
  • Maximum Likelihood/Maximum A Posteriori Estimators
  • Bayesian decision theory
  • Linear classification and regression
  • Kernel methods
  • Model selection and evaluation of learning methods
  • Feature selection
  • Nonparametric methods
  • Boosting, Decision trees
  • Neural networks
  • Structured Output
  • Semi-supervised learning
  • Unsupervised learning (Clustering, Independent Component Analysis)
  • Dimensionality Reduction and Manifold Learning
  • Statistical learning theory

Previous knowledge of machine learning is not required. Participants should be familiar with linear algebra, analysis and probability theory at the level of the local "Mathematics for Computer Scientists I-III" lectures. In particular, attendees should be familiar with

  • Discrete and continuous probability theory (marginals, conditional probability, random variables, expectation etc.)
    The first three chapters of L. Wasserman: All of Statistics, Springer, (2004) provide the necessary background.
  • Linear algebra (rank, linear systems, eigenvalues, eigenvectors (in particular for symmetric matrices), singular values, determinant)
    A quick reminder of the basic ideas of linear algebra can be found in the tutorial of Mark Schmidt (I did not check it for correctness!). Apart from the LU factorization, it informally summarizes everything that is used in the lecture.
  • Multivariate analysis (integrals, gradient, Hessian, extrema of multivariate functions)

Type: Core lecture (Stammvorlesung), 9 credit points. The course counts as a core lecture in both computer science and mathematics; for example, it can be used as a mathematics lecture if you study computer science with a minor in mathematics.

LITERATURE AND OTHER RESOURCES

The lecture will be partially based on the following books and partially on recent research papers:

  • R. O. Duda, P. E. Hart, and D. G. Stork: Pattern Classification, Wiley, (2000).
  • B. Schoelkopf and A. J. Smola: Learning with Kernels, MIT Press, (2002).
  • J. Shawe-Taylor and N. Cristianini: Kernel Methods for Pattern Analysis, Cambridge University Press, (2004).
  • C. M. Bishop: Pattern Recognition and Machine Learning, Springer, (2006).
  • T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning, Springer, second edition, (2008).
  • L. Devroye, L. Gyoerfi, G. Lugosi: A Probabilistic Theory of Pattern Recognition, Springer, (1996).
  • L. Wasserman: All of Statistics, Springer, (2004).
  • S. Boyd and L. Vandenberghe: Convex Optimization, Cambridge University Press, (2004).

Other resources:

NEWS

Exam Results - Second exam: here (Update 19.04.2017)

Test exam
A test exam can be downloaded here.

Organization of the exam on Friday, 03.03, 14.00-17.00.

  • List of admitted students: PDF
  • If you are admitted and can't register in HISPOS (for example, Erasmus students), send a registration email to glaser@cs.uni-saarland.de. Please include your matriculation number.
  • Location: Günther Hotz Hörsaal
  • Please bring your student identity card; otherwise you will not be admitted to the exam!
  • Please bring paper for the exam.
  • Be there at 14.00 in order to check your name in the list of allowed candidates.
  • It is a closed book exam - no notes, books or pocket calculators are allowed.
  • Mobile phones, tablets, laptops and other electronic devices have to be turned off.

Group Assignment: can be found here. Currently the groups are overbooked, but given the attendance at the Friday lecture, we expect that a large fraction of the registered students will not show up. If this is not the case, we will open a new group. Students who registered later than Friday, 12.00, are not taken into account.

Google Group for the Lecture: We have set up a Google group for the lecture so that discussions, comments and corrections reach all of you more quickly. You need to subscribe to the group to post or view messages. You can subscribe from any email account; if you are not using a Google account, send a mail to subscribe and then send a blank reply to the "join request" mail you receive (do not click the "join this group" button in that mail!). Members can post messages here: post

The group already contains posts about the regulations for retaking the lecture; please subscribe to read them.

IMPORTANT: The lecture counts as a mathematics lecture in the area "applied mathematics". The option "Mathematics" was omitted from the registration system by mistake. We will correct this as soon as possible; apologies for the inconvenience.

DROP OUT: If you have registered but have since decided not to take the course, you would help us tremendously by sending Antoine Gautier a short email, including your matriculation number, stating that you are *not* taking the course.

Exercise 0: For those who want to prepare for the machine learning lecture, I recommend doing Exercise 0 (but really do it and don't just look at the solution!). Based on past experience, there is a high correlation between people who can successfully solve Exercise 0 and those who pass the course. That said, if you cannot solve this exercise, I recommend that you first review or learn the necessary multivariate calculus and linear algebra and then take the course next time. We will talk about Exercise 0 in the first tutorial; you do not have to submit it.