Machine Learning

Wintersemester 2009/2010


News


  • Final Grades

  • Re-exam: Results

  • Exam review (Klausureinsicht): You can inspect your re-exam on Friday, 16.04., 14-18, E1 1, Room 225.

Time & Location


Lecture: Wed, 14.15-16, and Fri, 10.15-12, E1 3, HS III

Exercise groups:
  • Group A, Wednesday, 10-12, Seminar room 15, E1 3, tutor: Martin Slawski
  • Group B, Wednesday, 16-18, Seminar room 16, E1 3, tutor: Thomas Buehler
  • Group C, Thursday, 14-16, Seminar room 3 (216), E2 4, tutor: Radu Curticapean

Exams and Grading


  • Exams:

    • Mid-term: 11.12., 14-18, lecture hall (Hoersaal) 1, E1 3
    • End-term: 12.2., 14-18, lecture hall (Hoersaal) 1, E1 3
    • Re-exam: 29.3.

  • Grading:

    • 50% of the points in the exercises (up to that point) are needed to take part in the exams (mid-term/end-term/re-exam). To be admitted to the end-term and the re-exam, you must also have properly presented a solution once in the exercise groups.
    • An exam is passed if you get at least 50% of the points.
    • The grading is based on the two best results out of the mid-term, end-term and re-exam.
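
The grading rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the representation of results as fractions of the achievable points, and the example numbers are hypothetical. How the two best results are combined into the final grade is not specified here, so the sketch only selects them.

```python
from typing import Dict, List, Tuple

PASS_THRESHOLD = 0.5  # an exam is passed with at least 50% of the points


def is_passed(points: float, max_points: float) -> bool:
    """Return True if at least 50% of the points were obtained."""
    return points >= PASS_THRESHOLD * max_points


def two_best(results: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return the two best (exam, fraction) pairs, best first.

    `results` maps an exam name to the fraction of points obtained
    (0.0 to 1.0); exams that were not taken are simply left out.
    """
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    return ranked[:2]


# Hypothetical example values, not real course data.
example = {"mid-term": 0.45, "end-term": 0.70, "re-exam": 0.60}
print(two_best(example))
```

In this made-up example the failed mid-term is dropped and the end-term and re-exam results are the ones that count.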

Lecturer


Jun.-Prof. Dr. Matthias Hein

Office Hours: Mon, 16-18, and Thu, 16-18

General Information


In a broader perspective, machine learning tries to automate the process of the empirical sciences, namely extracting knowledge about natural phenomena from measured data, with the goal of either better understanding the underlying processes or making good predictions. Machine learning methods are therefore widely used in different fields: bioinformatics, computer vision, information retrieval, computational linguistics, robotics, and others.

The lecture gives a broad introduction to machine learning methods. After the lecture, students should be able to solve and analyze learning problems.

List of topics (tentative)
  • Reminder of probability theory
  • Bayesian decision theory
  • Linear classification and regression
  • Kernel methods
  • Model selection and evaluation of learning methods
  • Feature selection
  • Nonparametric methods
  • Boosting, Decision trees
  • Neural networks
  • Semi-supervised learning
  • Unsupervised learning (Clustering, Independent Component Analysis)
  • Dimensionality Reduction and Manifold Learning
  • Bayesian learning
  • Graphical Models (tentative)
  • Statistical learning theory
Previous knowledge of machine learning and probability theory is useful but not required. The participants should be familiar with the basics of linear algebra and analysis.

Type: Core lecture (Stammvorlesung), 9 credit points

Lecture material


Incremental lecture notes (last update: 27.01.2010).

The practical exercises will be in Matlab.

Slides and Exercises


14.10. - Introduction

          

16.10. - Reminder: Probability Theory

    Exercise 1     Solution 1    

21.10. - Prob. Th. continued/Bayesian Decision Theory

            Decision boundary demo

23.10. - Bayesian Decision Theory

    Exercise 2     Solution 2    

28.10. - Empirical risk minimization

           

30.10. - Linear Regression

    Exercise 3     Solution 3     Data

03.11. - Introduction to Optimization

           

06.11. - Linear Classification

    Exercise 4     Solution 4     Data for Ex. 8

11.11. - Linear Classification (cont.)

           

13.11. - Kernels

    Exercise 5     Solution 5     Data for Ex. 10/11

18.11. - Kernels (RKHS+Representer Th.)

           

20.11. - Kernels on structured objects

    Exercise 6     Solution 6     Data for Exercise 14

25.11. - Kernels on structured objects

           

27.11. - Evaluation, ROC-Curve, AUC

    Exercise 7     Solution 7     Data for Exercise 15 and 16    

02.12. - Evaluation, ROC-Curve, AUC (cont.)

               

04.12. - Class. Comparison, Model selection

    Exercise 8     Solution 8     Data for Exercise 19

09.12. - Lecture canceled

           

11.12. - Midterm exam

           

16.12. - Feature selection I

           

18.12. - Feature selection II

           

06.01. - Boosting

           

08.01. - Decision Trees, Neural Networks and Nearest Neighbor Methods

    Exercise 9     Solution 9     Data for Exercise 20/21

13.01. - Semi-supervised Learning

           

15.01. - K-Means and Spectral Clustering

    Exercise 10     Solution 10    

20.01. - K-Means and Spectral Clustering (cont.)

           

22.01. - Hierarchical Clustering

    (Last) Exercise 11     Solution 11     Data for Sheet 11

27.01. - Dimensionality Reduction

           

03.02. - Statistical Learning Theory I

           


Literature and other resources


  • The lecture will be partially based on the following books and partially on recent research papers:

    • R. O. Duda, P. E. Hart, and D. G. Stork: Pattern Classification, Wiley, (2000).

    • B. Schoelkopf and A. J. Smola: Learning with Kernels, MIT Press, (2002).

    • J. Shawe-Taylor and N. Cristianini: Kernel Methods for Pattern Analysis, Cambridge University Press, (2004).

    • C. M. Bishop: Pattern Recognition and Machine Learning, Springer, (2006).

    • T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning, Springer, second edition, (2008).

    • L. Devroye, L. Gyoerfi, G. Lugosi: A Probabilistic Theory of Pattern Recognition, Springer, (1996).

    • L. Wasserman: All of Statistics, Springer, (2004).

    • S. Boyd and L. Vandenberghe: Convex Optimization, Cambridge University Press, (2004).


  • Other resources:
    • Matlab is available on cip[101-114] and cip[220-238].studcs.uni-sb.de, gpool[01-27].studcs.uni-sb.de
      The path is /usr/local/matlab/bin.
      On the Sun workstations, select Applications/studcsApplications/Matlab in the menu.
      Access from outside should be possible via ssh: ssh -X username@computername.studcs.uni-sb.de

    • Material for Matlab: