Core Lecture "Information Retrieval and Data Mining" WS 2009/10
Lecturer: Prof. Dr.-Ing. Gerhard Weikum
Teaching assistants:
Avishek Anand
Rawia Awadallah
Laura Dietz
Shady Elbassuoni
Lizhen Qu
Stephan Seufert
Bilyana Taneva
Yafang Wang
NOTE: Final grades for the course can be found here. Please come to Ms. Schaaf (room 402) to pick up your 'Schein'.
Content:
The lecture teaches mathematical models and
algorithms that form the basis for search engines for the Web,
intranets, and digital libraries and for data mining and analysis
tools. Information Retrieval and Data Mining are technologies for
searching, analyzing and automatically organizing text documents,
multi-media documents, and structured or semistructured data.
Prerequisites:
Students planning to attend the course should be familiar with basic
models and methods from linear algebra (e.g. singular-value
decomposition) as well as probability theory and statistics (e.g.
Bayesian networks and Markov chains).
Requirements for Passing the Course:
The requirements for passing the course and obtaining credit points are:
- You must convincingly present correct solutions to two of
the exercises in the tutoring group.
- You must pass at least two out of three tests that will
be offered during the semester. The tests will be in written form, each with two or
three questions that repeat material from the lecture and the assignments. Each test
will last 45 to 60 minutes and will be on the following dates: November
12, December 10, and January 28.
- You must pass a final exam, in oral or written form (more likely
in oral form, lasting 15-20 minutes).
Grading System:
Your grades will be primarily determined by the final exam. You can earn bonus
points that will improve your grade in the following ways, where one bonus point
corresponds to a third of a mark in the German grading system (so that three bonus points
will improve your grade from the final exam by a full mark, e.g., from 3.0 (C) to
2.0 (B) or from 2.7 (C+) to 1.7 (B+)).
- Each correct solution to one of the exercises that you convincingly
present in your tutoring group, beyond the two mandatory presentations
for passing, earns you one bonus point. You can earn up to 2 bonus points this
way.
- The tests will be assessed with one of the following
coarse-grained grades: very good, ok, failed. Each test that is graded as very good
will earn you one bonus point. You can earn up to 3 bonus points this way.
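The bonus arithmetic above can be illustrated with a small sketch. It is not an official grade calculator. The function names and the list of grade steps are assumptions made for illustration, as are two rules the text does not state: that the bonus applies only to passing exam grades, and that the result cannot go below 1.0.

```python
# Hypothetical sketch of the bonus rules described above.
# Assumed: the usual German grade steps for passing grades, best first.
GRADE_STEPS = [1.0, 1.3, 1.7, 2.0, 2.3, 2.7, 3.0, 3.3, 3.7, 4.0]


def bonus_points(extra_presentations: int, very_good_tests: int) -> int:
    """Total bonus points: at most 2 from presentations beyond the two
    mandatory ones, and at most 3 from tests graded 'very good'."""
    from_presentations = min(max(extra_presentations, 0), 2)
    from_tests = min(max(very_good_tests, 0), 3)
    return from_presentations + from_tests


def improved_grade(exam_grade: float, bonus: int) -> float:
    """Move the final-exam grade up by one third of a mark per bonus point.

    Assumption: only passing grades are improved, and 1.0 is the best
    possible result.
    """
    if exam_grade not in GRADE_STEPS:
        raise ValueError("expected a passing grade such as 2.3 or 3.7")
    index = GRADE_STEPS.index(exam_grade)
    return GRADE_STEPS[max(index - bonus, 0)]


if __name__ == "__main__":
    # The two examples from the text: three bonus points are a full mark.
    print(improved_grade(3.0, 3))
    print(improved_grade(2.7, 3))
```

With one presentation beyond the mandatory two and two tests graded very good, `bonus_points(1, 2)` gives three bonus points, which is the full-mark improvement from the examples in the text.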
Slides:
- The lecture slides will be made available here
during the course of the semester.
Assignments:
- The assignment sheets will be made available here during the course of the semester.
Third Exam:
The results for the third test can be found here. All students with 3.5 points or more have passed the test. Students with 7.5 points or more receive the bonus point. The exam review takes place on February 4 (Thursday) after the lecture in the Rotunda (4th floor, MPI).
Literature
- Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze.
Introduction to Information Retrieval, Cambridge University
Press, 2008. (Website)
- Soumen Chakrabarti. Mining the Web, Morgan Kaufmann,
2002. (Website)
- Jiawei Han, Micheline Kamber. Data Mining - Concepts and
Techniques, Morgan Kaufmann, 2000. (Website)
- Background on Statistics:
Larry Wasserman. All of Statistics, Springer, 2004. (Website)
- Background on Machine Learning:
Richard O. Duda, Peter E. Hart, David G. Stork. Pattern
Classification, 2nd Edition, Wiley & Sons, 2001. (Website)
Further Reading (Advanced Material)
- David A. Grossman, Ophir Frieder: Information Retrieval - Algorithms and Heuristics, Springer, 2004.
- W. Bruce Croft, Donald Metzler, Trevor Strohman: Search Engines - Information Retrieval in Practice, Addison-Wesley, 2010.
(Website)
- Amy N. Langville, Carl D. Meyer: Google's PageRank and Beyond - The Science of Search Engine Rankings, Princeton University Press, 2006.
(Website)
- Bing Liu: Web Data Mining, Springer, 2008.
- Christopher M. Bishop: Pattern Recognition and Machine Learning, Springer, 2006.
- Trevor Hastie, Robert Tibshirani, Jerome Friedman: The Elements of Statistical Learning, Springer, 2001.