One requirement for passing the course is to complete a programming project. For the duration of the course, teams of 2-3 students will be formed who work together on developing a small application for the Microsoft Kinect camera, a new 3D depth sensor. The Kinect does not only capture video frames, like a standard video camera, but also depth maps, i.e., 2.5D geometry. The camera was originally developed as a game controller, but Microsoft provides an API that gives access to the data captured by the cameras, as well as data that are usually extracted from the camera in a gaming context, such as joint positions. With access to these data, developers can build their own 3D applications that go beyond gaming, e.g., for 3D modeling, interaction, telepresence, and so on. In this lecture, we want to give you the chance to develop such an application on your own. The programming projects will be graded and contribute 40% of the overall grade for the course.
This is an advanced lecture and we will not teach you programming, or how to use the
libraries that are commonly used in vision and/or graphics. We expect that you have programming experience in C++,
and we expect that you have experience with the
respective libraries, e.g., from previous classes in graphics and/or vision. If you lack these skills, you should be prepared to
acquire them on your own.
In MPII room 210, there is our Kinect Lab, where you can work on your projects. The lab features three computers, each equipped with a Kinect. Each computer runs Windows 7 and has the following software / libraries installed that may be relevant to your project:
There are two project options. The first option is to complete the default task described below. Here, the default task is the minimal requirement, and there is an optional advanced task which is meant to be built on top of the default task. Results on the optional advanced task will be evaluated and reflected in the final grade. The second option is to design your own project.
Write software that scans the shape of a static 3D object with a Kinect camera, and then visualizes the model on screen. The Kinect can only capture the geometry of the object from one side. So, to obtain the entire 3D shape, you will have to fuse depth scans taken from different sides. You will have to think about how to align the scans in 3D and how to visualize the aligned point cloud. You may also have to think about the best arrangement of scanner and object: for instance, you could move the object by hand in front of the Kinect, or you could move the Kinect by hand around the object standing on a table. Each of these operation modes has its own challenges. Algorithm-wise, you will have to think about how to align the individual scans, i.e., how to find the rigid body transforms that align them. The lecture will cover this. In the end, you should be able to show the 3D object on screen using a simple graphical interface in which you can rotate/translate the scanned shape.
Candidates for the advanced task include (but are not limited to):
Maybe you have an idea of your own for a project with the Kinect - a simple game, a telepresence application, a teleconferencing system, etc. In that case, we encourage you to follow your own ideas and turn them into your course project. Your own idea will have to be of similar scope and on a similar level of complexity as the default task (3D scanning). Decisions on special projects are made on a case-by-case basis after a thorough discussion between the students and the instructors.
The following are the criteria for a successful project. Each submitted project must contain: