Max Planck Institute for Informatics

Data Mining and Matrices, Summer 2013

Lecturers: Rainer Gemulla, Pauli Miettinen


The re-exam will be held on Tuesday, 8 October, from 10 am until noon in room 023, building E1 4. You may bring a printout of the lecture slides and hand-written notes, but nothing else.


Many data mining tasks operate on dyadic data, i.e., data involving two types of entities (e.g., users and products, or objects and attributes); such data can be naturally represented in terms of a matrix. Matrix decompositions, where we (approximately) represent the data matrix as a product of two (or more) factor matrices, can be used to perform many common data mining tasks. In this lecture we explore the use of matrix decompositions for denoising, discovery of latent structure, and visualization, among others. We cover data mining tasks such as prediction, clustering and pattern mining and application areas such as recommender systems and topic modelling.
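To make the idea of approximating a data matrix by a product of two factor matrices concrete, the following is a minimal sketch using a truncated singular value decomposition in NumPy. It uses the small document–term matrix shown on this page (Book 1 to Book 3 against the terms "Data", "Matrix", "Mining"). The helper name low_rank_factors and the choice of rank k = 2 are assumptions made for illustration only; they are not taken from the lecture material.

```python
import numpy as np

# Document-term matrix from this page: rows are Book 1..3,
# columns are the terms "Data", "Matrix", "Mining".
A = np.array([[5.0, 0.0, 3.0],
              [0.0, 0.0, 7.0],
              [4.0, 6.0, 5.0]])


def low_rank_factors(A, k):
    """Return factor matrices (L, R) such that L @ R is the rank-k
    truncated-SVD approximation of A (hypothetical helper)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    L = U[:, :k] * s[:k]  # scale the first k left singular vectors
    R = Vt[:k, :]         # first k right singular vectors (as rows)
    return L, R


L, R = low_rank_factors(A, k=2)
A_approx = L @ R                      # approximate reconstruction of A
error = np.linalg.norm(A - A_approx)  # Frobenius norm of the residual

print("left factor shape:", L.shape)
print("right factor shape:", R.shape)
print("reconstruction error:", error)
```

Here each row of L can be read as a description of a book in terms of two latent factors, and each column of R as a description of a term in terms of the same factors. For the truncated SVD, the Frobenius reconstruction error equals the square root of the sum of the squared discarded singular values, which in this 3-by-3 case is simply the smallest singular value of A.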

          Data   Matrix   Mining
Book 1    5      0        3
Book 2    0      0        7
Book 3    4      6        5

A document–term matrix

Movies: Avatar, The Matrix, Up
Alice:   4, 2
Bob:     3, 2
Charlie: 5, 3

An incomplete rating matrix (each user has rated only two of the three movies)

            Hot Topics in IR   IR &   DM &
Student A   1                  1      0
Student B   1                  1      1
Student C   0                  1      1

A student–course matrix

              Jan.   June   Sept.
Saarbrücken   –1     11     10
Helsinki      –6.5   10.9   8.7
Cape Town     15.7   7.8    8.7

Cities and their average minimum temperatures

List of topics (tentative):
  • Singular value decomposition (SVD)
  • Non-negative matrix factorization (NMF)
  • Semi-discrete decomposition (SDD)
  • Boolean matrix factorization (BMF)
  • Independent component analysis (ICA)
  • Matrix completion
  • Probabilistic matrix factorization
  • Graphs
  • Tensors



You must register in HISPOS. Please also register via e-mail to receive news and updates from us.


Prerequisites: basic knowledge of linear algebra.

Requirements for the certificate

Lecture notes


Suggested reading