Lectures: | Wednesday 10:15-12 (first lecture) or 14:15-16:00 (all remaining lectures), room 029 (ground floor of the MPI-SWS building E1.5) | |||
---|---|---|---|---|
Lecturers: | Paweł Gawrychowski | |||
Textbooks: |
|
|||
Tutorials: | Monday 10:15-12:00 rotunda at 3rd floor (of the MPI building), Monday 14:15-16:00 room 022 (ground floor of the MPI building), Wednesday 10:15-12:00 room 023 (ground floor of the MPI building) | |||
Tutors: | Blaga Davidova and Igor Stassiy | |||
Exam dates: | Exam: between 29.07 and 09.08, see the questions; Re-exam: 26.09 (by appointment, write me an email) |
Don't know what algorithms on strings are? Look here to get some idea.
This will be an introduction to the world of stringology, or algorithms on strings. Strings are one of the basic data types, and the ability to quickly process them is crucial given the massive use of text processing in many applications, especially those connected to computational biology. Moreover, efficient solutions to many problems in this area are full of simple yet beutiful ideas, and one can find many algorithmic jewels there.
The aim of the course will be to present a basic string processing toolkit, including:
While the course will be geared towards the more theoretical aspects, we will try to see how some of the solutions we discussed work in practice. Hence you should have at least a basic familiarity with some programming language.
Date | Topics | Lecture notes | |
---|---|---|---|
Apr 17, 10:15-12:00 | Introduction, basic concepts, Karp-Rabin algorithm. | Slides | |
Apr 24 | Knuth-Morris-Pratt algorithm and some of its extensions. | Problemset 1 Hints for problemset 1 Slides | |
May 1 | No lecture. | ||
May 8 | Porat-Porat algorithm. Longest common subsequence vs edit distance. | Problemset 2 Hints for problemset 2 Slides | |
May 15 | Longest common subsequence in linear space. Edit distance in O(nd) as used in diff. | Problemset 3 Hints for problemset 3 Slides | |
May 22 | Bit packing in LCS. Introduction to suffix arrays. | Problemset 4 Hints for problemset 4 Slides | |
May 29 | Suffix arrays in linear time. | Problemset 5 Hints for problemset 5 Slides | |
June 5 | No lecture. | ||
June 12 | Decreasing the search time to O(m+lgn). Range minimum queries in constant time. | Problemset 6 Hints for problemset 6 Slides | |
June 19 | Pattern matching with don't cares/mismatches via FFT and kangaroo hopping. | Slides | |
June 26 | Introduction to equations on words. | Problemset 7 Hints for problemset 7 Slides | |
July 3 | Lempel-Ziv compression with applications to detecting regularities. | Problemset 8 Hints for problemset 8 Slides | |
July 10 | Burrows-Wheeler transform with applications to indexing. | Problemset 9 Hints for problemset 9 Slides | |
July 17 | Approximating the shortest superstring. Recap. | Slides |
Prerequisites: | An undergraduate course in algorithms is a must. Additionally, some basic familiarity with programming (in any reasonable language) would be welcome. | |||
---|---|---|---|---|
Class Info: | This is a 6-credit-point advanced lecture. There will be one 90 minutes lecture and one 90 minutes tutorial session per week. | |||
Grading policy: | Your final grade will be determined by your performance in homework problems, programming assignments, and final exam:
|