Max Planck Institut Informatik

Data Mining and Matrices, Summer 2013

Lecturers: Rainer Gemulla, Pauli Miettinen

News

The re-exam will take place on Tuesday, 8 October, from 10 am until noon in room 023, building E1 4. You may bring a printout of the lecture slides and hand-written notes, but nothing else.

Content

Many data mining tasks operate on dyadic data, i.e., data involving two types of entities (e.g., users and products, or objects and attributes); such data can be naturally represented in terms of a matrix. Matrix decompositions, where we (approximately) represent the data matrix as a product of two (or more) factor matrices, can be used to perform many common data mining tasks. In this lecture we explore the use of matrix decompositions for denoising, discovery of latent structure, and visualization, among others. We cover data mining tasks such as prediction, clustering and pattern mining and application areas such as recommender systems and topic modelling.
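As a rough illustration of the idea of approximating a data matrix by a product of two factor matrices, the following sketch computes a rank-2 approximation of a small 3×3 document-term matrix (values taken from the example on this page) using a truncated singular value decomposition in NumPy. The helper name truncated_svd, the factor names L and R, and the choice k = 2 are assumptions made for this example; SVD is only one of the decompositions listed below, and this is not meant as the lecture's reference implementation.

```python
import numpy as np

# Small document-term matrix (rows: books, columns: terms).
A = np.array([[5.0, 0.0, 3.0],
              [0.0, 0.0, 7.0],
              [4.0, 6.0, 5.0]])


def truncated_svd(A, k):
    """Return factors L (m x k) and R (k x n) such that A is approximately L @ R.

    Hypothetical helper for illustration: keeps the k largest singular values.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    L = U[:, :k] * s[:k]   # left singular vectors scaled by singular values
    R = Vt[:k, :]          # top-k right singular vectors (as rows)
    return L, R


L, R = truncated_svd(A, k=2)
A_hat = L @ R                                 # rank-2 approximation of A
error = np.linalg.norm(A - A_hat, ord="fro")  # Frobenius reconstruction error

print(np.round(A_hat, 2))
print("Frobenius error:", error)
```

For a truncated SVD, the Frobenius error equals the square root of the sum of the squared discarded singular values, so with k = 2 on a 3×3 matrix it is simply the third singular value of A.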

A document–term matrix:

          Data   Matrix   Mining
Book 1       5        0        3
Book 2       0        0        7
Book 3       4        6        5

An incomplete rating matrix (movies: Avatar, The Matrix, Up; each user has rated only two of the three movies):

Alice     4, 2
Bob       3, 2
Charlie   5, 3
 
A student–course matrix:

            Hot Topics in IR   IR & DM   DM & Matrices
Student A                  1         1               0
Student B                  1         1               1
Student C                  0         1               1

Cities and their average minimum temperatures (°C):

              Jan.   June   Sept.
Saarbrücken   -1     11     10
Helsinki      -6.5   10.9   8.7
Cape Town     15.7   7.8    8.7

List of topics (tentative):
  • Singular value decomposition (SVD)
  • Non-negative matrix factorization (NMF)
  • Semi-discrete decomposition (SDD)
  • Boolean matrix decomposition (BMF)
  • Independent component analysis (ICA)
  • Matrix completion
  • Probabilistic matrix factorization
  • Graphs
  • Tensors

Organization

Registration

You must register in HISPOS. Please also register via e-mail to receive news and updates from us.

Prerequisites

Basic knowledge of linear algebra.

Requirements for the certificate

Lecture notes

Assignments

Suggested reading