Images convey a broad spectrum of personal information. If such images are shared on social media platforms, this personal information is leaked, which conflicts with the privacy of the depicted persons. Therefore, we aim for automated approaches to redact such private information and thereby protect the privacy of the individual. By conducting a user study, we find that obfuscating the image regions related to the private information preserves privacy while retaining the utility of the images. Moreover, by varying the size of the obfuscated regions, different privacy-utility trade-offs can be achieved. Our findings argue for a "redaction by segmentation" paradigm. Hence, we propose the first sizable dataset of private images "in the wild" annotated with pixel-level and instance-level labels across a broad range of privacy classes. We present the first model for automatic redaction of diverse private information. It is effective at achieving various privacy-utility trade-offs, reaching within 83% of the performance of manual redaction.



    author = {Orekondy, Tribhuvanesh and Fritz, Mario and Schiele, Bernt},
    title = {Connecting Pixels to Privacy and Utility: Automatic Redaction of Private Information in Images},
    booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2018}


We make the dataset available for academic and non-commercial use under a Creative Commons Attribution-NonCommercial 4.0 International License. For images, the original licenses of the authors apply.

Fold     Images    Annotations    Weak Annotations (from Google Cloud Vision API)
Train    link      link           link
Val      link      link           link
Test     link      link           link

Annotations in the dataset are based on a format similar to COCO. As a result, segmentation masks are represented using a run-length encoding (RLE) scheme. These masks can be decoded easily and efficiently using the COCO API.
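To illustrate what decoding an RLE mask involves, here is a minimal pure-Python sketch. It assumes the uncompressed COCO convention: a `size` field holding `[height, width]` and a `counts` list of alternating run lengths that starts with a run of zeros and walks the mask in column-major order. Whether the annotation files in this dataset use exactly these field names is an assumption, and the function name `decode_uncompressed_rle` is hypothetical. Compressed RLE, where `counts` is a string, is not handled here; in practice the `decode` function in `pycocotools.mask` covers that case.

```python
def decode_uncompressed_rle(rle):
    """Decode an uncompressed COCO-style RLE dict into a 2D list of 0/1 values.

    Assumed layout (COCO convention, not confirmed for this dataset):
      rle["size"]   -> [height, width]
      rle["counts"] -> alternating run lengths, starting with zeros,
                       over the mask flattened in column-major order.
    """
    height, width = rle["size"]
    counts = rle["counts"]

    if isinstance(counts, (str, bytes)):
        # Compressed RLE needs the COCO API; this sketch does not cover it.
        raise TypeError("compressed RLE is not supported by this sketch")
    if sum(counts) != height * width:
        raise ValueError("run lengths do not add up to height * width")

    # Expand the runs into a flat column-major sequence of 0s and 1s.
    flat = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value

    # Convert from column-major order to a row-major 2D mask.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]


def mask_area(mask):
    """Count the pixels that belong to the annotated region."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    # Toy 2x3 mask: one zero, two ones, three zeros (column-major).
    example = {"size": [2, 3], "counts": [1, 2, 3]}
    mask = decode_uncompressed_rle(example)
    print(mask)             # [[0, 1, 0], [1, 0, 0]]
    print(mask_area(mask))  # 2
```

The resulting binary mask marks the pixels of one annotated instance, which is the region a redaction method would obfuscate.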


This research was partially supported by the German Research Foundation (DFG CRC 1223).