Core Lecture "Information Retrieval and Data Mining" WS 2011/12
Lecturers
Dr. Martin Theobald and Dr. Pauli Miettinen
Teaching Assistants
Sarath Kumar Kondreddi
Erdal Kuzey
Faraz Makari Manshadi
Niket Tandon
Tomasz Tylenda
Mohamed Yahya
Lectures
- Tuesdays 14–16 and Thursdays 16–18, Building: E1.3, HS-002
News & Announcements
The final results of the exam and reexam can be found here [PDF].
The list of students who have qualified for the final exam (including the number of bonus points) is available here [PDF].
If you do not see your matriculation number in the list but you think you should be on this list,
or if you think your bonus points are wrong, please contact the lecturers.
The results of the third short test can be found here [PDF].
The limits for passing the test and for obtaining a bonus are 8 and 15 points, respectively.
The results of the second short test can be found here [PDF].
The limits for passing the test and for obtaining a bonus are 8 and 15 points, respectively.
The results of the first short test can be found here [PDF].
The limits for passing the test and for obtaining a bonus were 8 and 15 points, respectively.
Schedule & Slides
Assignments
- 1st Assignment, due on Thursday, October 27 [PDF],
solution [PDF]
- 2nd Assignment, due on Thursday, November 3 [PDF],
solution [PDF]
- 3rd Assignment, due on Thursday, November 10 [PDF],
solution [PDF]
- 4th Assignment, due on Thursday, November 18 [PDF],
solution [PDF]
- 5th Assignment, due on Thursday, November 24 [PDF],
solution [PDF]
- 6th Assignment, due on Thursday, December 1 [PDF],
solution [PDF]
- 7th Assignment, due on Thursday, December 8 [PDF],
solution [PDF]
- 8th Assignment, due on Thursday, December 15 [PDF],
solution [PDF]
- 9th Assignment, due on Thursday, January 12, 2012 [PDF],
solution [PDF]
- 10th Assignment, due on Thursday, January 19, 2012 [PDF],
solution [PDF]
- 11th Assignment, due on Thursday, January 27, 2012 [PDF],
solution [PDF]
- 12th Assignment, due on Thursday, February 02, 2012 [PDF],
solution [PDF]
- 13th Assignment, due on Thursday, February 09, 2012 [PDF],
solution [PDF]
Tests & Solutions
- Quiz Sheet, Tuesday, October 18 [PDF], Solution [PDF]
- First Short Test, Thursday, November 17, 2011 Solution [PDF]
- Second Short Test, Tuesday, December 20, 2011 Solution [PDF]
- Third Short Test, Tuesday, January 31, 2012 Solution [PDF]
Tutoring Groups
The assignments to tutoring groups are now available.
(Last change: 2011.11.02 16:02. No more changes are possible.)
Content
The lecture teaches mathematical models and
algorithms that form the basis of search engines for the Web,
intranets, and digital libraries, and for data mining and analysis
tools. Information Retrieval and Data Mining are technologies for
searching, analyzing and automatically organizing text documents,
multi-media documents, and structured or semistructured data.
Prerequisites
Students planning to attend the course should be familiar with basic
models and methods from linear algebra (e.g. singular-value
decomposition), probability theory and statistics (e.g.
Bayesian networks and Markov chains), and combinatorics.
Requirements for Passing the Course
The requirements for passing the course and obtaining credit points are:
- You must convincingly present three correct solutions to the exercises in the tutoring group.
In order to present an exercise in the tutoring group, you must hand in the assignment sheet on the Thursday before the exercises take place and have a correct solution for the exercise you intend to present.
- You must pass at least two out of three short tests that will
be offered during the semester. The tests will be in written form, each with three or
four questions that repeat material from the lecture and the assignments. Each test
will last 45 to 60 minutes and will be on the following dates: November
17, December 20, and January 31.
- You must pass a final exam to be held on February 21 in written form.
Grading System
Your grade will be determined primarily by the final exam. You can earn bonus
points that improve your grade in the ways listed below. One bonus point
corresponds to one third of a mark in the German grading system, so three bonus points
improve your grade from the final exam by a full mark, e.g., from 3.0 (C) to
2.0 (B) or from 2.7 (C+) to 1.7 (B+).
- You can earn one bonus point by presenting a correct solution to one
of the exercises in your tutoring group, in addition to the three presentations
which are mandatory for passing.
- The tests will be assessed with one of the following
coarse-grained grades: very good, pass, failed. Each test that is graded as very good
will earn you one bonus point. You can earn up to 3 bonus points this way.
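The bonus arithmetic described above can be illustrated with a small sketch. It is not an official grading tool. The grade scale in GRADE_STEPS, the rule that a failed exam is not improved, the cap at 1.0, and the treatment of the point limits (8 and 15) as inclusive are all assumptions made for illustration; the function names are hypothetical.

```python
# Assumed German grade steps, from best to worst passing grade.
GRADE_STEPS = [1.0, 1.3, 1.7, 2.0, 2.3, 2.7, 3.0, 3.3, 3.7, 4.0]


def apply_bonus(exam_grade: float, bonus_points: int) -> float:
    """Improve a passing exam grade by one grade step per bonus point."""
    if bonus_points < 0:
        raise ValueError("bonus_points must not be negative")
    if exam_grade not in GRADE_STEPS:
        # Assumption: a grade outside the passing scale (e.g. 5.0) stays as it is.
        return exam_grade
    index = GRADE_STEPS.index(exam_grade)
    # Assumption: the grade cannot become better than 1.0.
    return GRADE_STEPS[max(0, index - bonus_points)]


def classify_test(points: int, pass_limit: int = 8, bonus_limit: int = 15) -> str:
    """Map short-test points to the coarse-grained grades used for the tests."""
    # Assumption: reaching a limit exactly is enough to meet it.
    if points >= bonus_limit:
        return "very good"
    if points >= pass_limit:
        return "pass"
    return "failed"


if __name__ == "__main__":
    print(apply_bonus(3.0, 3))   # example from the text: 3.0 improves to 2.0
    print(apply_bonus(2.7, 3))   # example from the text: 2.7 improves to 1.7
    print(classify_test(16))
```

Keeping the grade steps in an ordered list means a bonus point is simply a move of one position toward the better end, which matches the "one third of a mark" description without any floating-point subtraction.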
Literature
- Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze.
Introduction to Information Retrieval, Cambridge University
Press, 2008. (Website)
- R. Baeza-Yates, R. Ribeiro-Neto.
Modern Information Retrieval: The concepts and technology behind search,
Addison-Wesley, 2010.
- W. Bruce Croft, Donald Metzler, Trevor Strohman.
Search Engines: Information Retrieval in Practice,
Addison-Wesley, 2009.
(Website)
- Mohammed J. Zaki, Wagner Meira Jr.
Fundamentals of Data Mining Algorithms,
manuscript (pdf, requires username and password)
- Pang-Ning Tan, Michael Steinbach, Vipin Kumar.
Introduction to Data Mining,
Addison-Wesley, 2006.
(Website)
Further Reading
- Jiawei Han, Micheline Kamber, Jian Pei. Data Mining: Concepts and
Techniques, 3rd ed., Morgan Kaufmann, 2011. (Website)
- Stefan Büttcher, Charles L. A. Clarke, Gordon V. Cormack.
Information Retrieval: Implementing and Evaluating Search Engines,
MIT Press, 2010 (Website)
- Christopher M. Bishop. Pattern Recognition and Machine Learning,
Springer, 2006.
Background on Statistics and Probability Theory
- Larry Wasserman. All of Statistics, Springer, 2004.
(Website)
- Trevor Hastie, Robert Tibshirani, Jerome Friedman.
The Elements of Statistical Learning, 2nd edition, Springer, 2009.