Seminar "Techniques for Non-Traditional Data Management"
Dr. Thomas Neumann,
Dr. Ralf Schenkel
Organization
- All available topics have been taken.
- The first
meeting took place on
Tuesday, October 21, at 14 c.t. in room 433, building E 1.4. Slides from the meeting
- Regular meetings
are on Tuesdays at 16 c.t. in room 433 (rotunda, 4th floor),
building E 1.4, starting December 2.
- The prerequisite for attending the
seminar is some general knowledge of database systems or information
retrieval. We recommend that participants have successfully completed
the basic course "Informationssysteme", a database systems course, or an
information retrieval course.
- Unfortunately, some VLDB 2008 papers are not yet freely
available; if you want to have a look at them before signing up for the
seminar, contact us and we will send you the file(s) by email. We checked
that all linked papers are available from the MPI-INF
network (last check: October 21, 2008, 2pm). If you encounter any
problems accessing a paper, please
contact us.
Contents of the Seminar
The seminar discusses very recent scientific
conference papers on hot topics in important fields within the broad
area of data management. Most of these papers are likely to have an impact
on research, applications, and/or commercial systems in the future.
Topics covered include advanced relational problems (top-k processing,
query optimization, and transaction management), data management beyond
the relational model (RDF engines, probabilistic databases, and
semistructured data), new architectures for data management
(distributed systems and novel hardware), and recent developments in
Web information systems (ranking and social networks).
Requirements for the Certificate
- Attend all talks, not just your own. We will keep track of
participation! If you are sick, please let us know in advance by
sending a short email.
- Read your papers and other related literature.
- Prepare a 45-minute talk about your topic that introduces the
matter to your fellow students. This is about twice the length of a
conference talk, so there should be enough time to present some
background information on the topic. Even though there are usually
several
papers listed for each topic, you are not expected to talk in detail
about all of them; in fact, this is usually impossible given the time
limit. Instead, try to pick the most interesting,
challenging or futuristic contribution(s) from at least one of them for
the presentation.
You are very welcome to discuss any potential weaknesses or problems of
the paper(s) in your talk. If you are unsure about what to present, ask
your tutor. Note that, even though the conference slides of some papers
are available on the Web, we expect that you prepare your own slides
(which may be, of course, inspired by the original slides).
You must send
your slides to and discuss them with your tutor by the Friday before
your talk (4pm) at the latest, otherwise your talk will be cancelled
(this is a hard deadline).
- Both the slides and the presentation itself must be in English.
Otherwise, some students would not be able to follow all talks, and
following all talks is one of the main purposes of the seminar. After
each presentation, there will be a discussion in which all fellow
students are encouraged to ask questions. We will keep track of your
participation (i.e., whether you ask questions) and, of course, of the
presenter's answers.
- For each talk, a second student will be preselected as an
opponent. His or her role is to prepare tough questions to challenge
the paper presented in the talk (not the talk itself or the speaker!).
To make life a little easier, the preliminary version of the
slides will be sent to the opponent on the Friday before the talk.
However, as interaction is an important part of science, we expect
every participant to take an active part in the discussions.
- Two weeks after the talk, the presenter and the opponent together have to submit
a short (usually not longer than 5
pages) summary of the topic of the talk. The focus of this report
should be on pointing out strengths
and weaknesses of the approach presented in the paper(s), not just
summarizing the paper(s).
- After your talk, there will be another meeting with your tutor
and Thomas and/or Ralf to give feedback on the talk and the report.
- In summary, your final grade will be influenced by the
following components: your oral presentation, your knowledge of your
topic (your answers to questions after the presentation), the questions
you asked as opponent, your general
participation in the seminar, and your two written reports (one in the
role of presenter, one in the role of opponent).
Schedule
Advanced Relational Problems
Tuesday, November 18, 2008, 16:15: Michael Maurer on Top-k Processing (tutor Thomas Neumann, opponent Torsten Hey)
- Lin Guo, Sihem Amer Yahia, Raghu Ramakrishnan, Jayavel
Shanmugasundaram, Utkarsh Srivastava, Erik Vee: Efficient Top-K Processing over
Query-Dependent Functions. VLDB 2008
- Nilesh Bansal, Sudipto Guha, Nick Koudas: Ad-hoc aggregations of ranked lists in the
presence of hierarchies. SIGMOD 2008, pp. 67-78
- Ming Hua, Jian Pei, Wenjie Zhang, Xuemin Lin: Ranking queries on uncertain data: a
probabilistic threshold approach. SIGMOD 2008, pp. 673-686
- Lei Zou, Lei Chen: Dominant
Graph: An Efficient Indexing Structure to Answer Top-K Queries.
ICDE 2008, pp. 536-545
Tuesday, November 25, 2008, 16:15: Joachim Müller on Query Optimization
and Tuning
(tutor Thomas Neumann, opponent Sarath Kumar Kondreddi)
- Tuesday, December 2, 2008, 16:15: Jörg Schad on Transactions (tutor Ralf Schenkel, opponent Vinay Setty) slides
- David Lomet, Mingsheng Hong, Rimma Nehme, Rui Zhang: Transaction
Time Indexing with Version Compression. VLDB 2008
- Hyun J. Moon, Carlo A. Curino, Alin Deutsch, Chien-Yi Hou,
Carlo Zaniolo: Managing and Querying
Transaction-time Databases under Schema Evolution. VLDB 2008
- Michael J. Cahill, Uwe Röhm, Alan David Fekete: Serializable isolation for snapshot
databases. SIGMOD 2008, pp. 729-738
Beyond the Relational Data Model
- Tuesday, December 9, 2008, 16:15: Stefan Schuh on RDF Engines (tutor Thomas Neumann, opponent Sathess Thirunavukkarasu) slides
Tuesday, January 6, 2009, 16:15: Torsten
Hey on Uncertainty & Probabilistic DB (tutor Martin Theobald, opponent Alekh Jindal)
- Tuesday, January 13, 2009, 16:15: Sarath Kumar Kondreddi on Semistructured
Data and XML (tutor
Ralf Schenkel, opponent Dogan Karaoglan) slides
- Hongzhi Wang, Jianzhong Li, Jizhou Luo, Hong Gao: Hash-based Subgraph Query Processing
Method for Graph-structured XML Documents. VLDB 2008
- Kostas Lillis, Evaggelia Pitoura: Cooperative XPath caching.
SIGMOD 2008, pp. 327-338
New Architectures
- Tuesday, January 20, 2009, 16:15: Vinay Setty on P2P and Distribution (tutor
Klaus Berberich, opponent Alekh Jindal) slides
Tuesday, January 27, 2009, 16:15: Sathess Thirunavukkarasu on Databases on
New Hardware (tutor
Ralf Schenkel, opponent Sarath Kumar Kondreddi)
Web 1.0 and 2.0
- Tuesday, February 3, 2009, 16:15: Alekh Jindal on Web Ranking (tutor Andreas Broschart, opponents Jörg Schad and Sarath Kumar Kondreddi) slides
- Tuesday, February 10, 2009, 16:15: Dogan Karaoglan on Social Networks (tutor Ralf
Schenkel, opponent Stefan Schuh) slides
- Sihem Amer Yahia, Michael Benedikt, Laks V.S. Lakshmanan, Julia
Stoyanovich: Efficient Network-Aware Search in
Collaborative Tagging Sites. VLDB 2008
- Paul
Heymann, Daniel Ramage, Hector Garcia-Molina: Social Tag Prediction. SIGIR
2008
- Ralf Schenkel, Tom Crecelius, Mouna Kacimi, Sebastian Michel,
Thomas Neumann, Josiane X. Parreira, Gerhard Weikum: Efficient Top-k Querying Over Social
Tagging Networks. SIGIR 2008 (just as background info,
do not present this)