This course is not limited to the field of computational biology. It covers topics in Machine Learning that are relevant for computer scientists in general as well as for other scientists involved in data analysis and modelling.
Teacher: Thomas Lengauer
Tutor: Adrian Alexa
Language: English
Lecture: 
Wednesday, 11:00 – 12:45, Building 46, Room 024 (MPI Building). Starting October 19th, 2005. Last lecture on February 15th, 2006.

Tutorial: Friday, 9:15 – 10:45 (biweekly), Building 46, Room 021 (MPI Building)
Office hours: 

This course covers a subject that is relevant for computer scientists
in general as well as for other scientists involved in data analysis
and modelling. It is not limited to the field of computational biology.
The course will be the first part of a two-semester course on Statistical Learning.
The first part (WS 2005/2006) will concentrate on chapters 1–5 and 7–10 of the book
The Elements of Statistical
Learning, Springer 2001; the follow-up course during SS 2006 will continue
with the remaining chapters. In both semesters, there will be two hours of lecture per
week and one hour of tutorial (V2/Ü1); in practice, the tutorial will be held for two
hours every second week.
Both parts of this lecture fulfil the requirements for the curricula of
computer science and bioinformatics as an optional course worth 6 or 4
credit points (Spezialvorlesung, 6 bzw. 4 Leistungspunkte).
The course is aimed at advanced students in math, computer science and general science with a mathematical background. Students should know linear algebra and have basic knowledge of statistics.
You need a cumulative 50% of the points in the homework assignments to be admitted to the written exam. A score of 50% in the exam is then considered a passing grade.
Hastie, Tibshirani, Friedman:
The Elements of
Statistical Learning, Springer 2001. Course participants are
encouraged to acquire this book.
More information on this book, as well as a contents listing, can be found
here.
The tutorial focuses on both the material presented in the lecture
and the homework assignments. Usually, parts of the lecture are briefly
reviewed; the main focus, however, is on the most recent assignment.
Homework assignments will cover theoretical proofs and programming
exercises with roughly equal weight.
The programming language that we use is
R, a language for statistical computing.
It is freely available for Windows and Linux and, as a vectorized programming
language, is ideally suited for the problems we
will encounter. There are also many freely available packages (or libraries) to perform
a variety of classification and regression tasks, or to visualize the results of
statistical analyses in a convenient way.
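
To give a rough idea of what "vectorized" means in practice, here is a minimal sketch in R that fits a simple linear regression in two ways: with the built-in lm() function and by solving the normal equations directly. The data are simulated purely for illustration, and all variable names (x, y, X, beta_hat) are assumptions of this sketch rather than material from the lecture or the assignments.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)

# Least squares fit using R's built-in linear model function
fit <- lm(y ~ x)
coefs <- coef(fit)

# The same fit via the normal equations (X'X) beta = X'y;
# matrix operations act on whole vectors, no explicit loop is needed
X <- cbind(1, x)
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# Vectorized arithmetic: residuals for all observations in one expression
residuals_manual <- y - X %*% beta_hat
rss <- sum(residuals_manual^2)

print(coefs)
print(rss)
```

Both approaches should give the same coefficients up to numerical precision; lm() is usually preferable in practice because it uses a numerically more stable decomposition and also returns diagnostics.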