We are co-organizing the Large-Scale Scene Understanding (LSUN) workshop at the CVPR’17 conference in Honolulu, Hawaii. LSUN comprises several challenges in the context of scene understanding, and we are hosting the saliency prediction challenge for the SALICON dataset. The challenge is designed to evaluate the performance of algorithms that predict visual saliency in natural scene images. The motivation of the challenge is (1) to facilitate the study of attention in context and with non-iconic views, (2) to provide larger-scale human attentional data, and (3) to encourage the development of methods that leverage multiple annotation modalities from Microsoft COCO. Saliency prediction results could, in turn, benefit other tasks such as recognition and captioning, since humans make multiple fixations to understand the visual input in natural scenes. Teams will compete by training their algorithms on the SALICON dataset, and their results will be compared against human behavioral data. We look forward to receiving submissions based on novel and context-aware saliency prediction models.
- Training and validation data available: June 2017
- Challenge system open: July 6, 2017
- Challenge submission deadline: July 19, 2017
- LSUN workshop at CVPR, announcement of winners on stage, public leaderboard going live: July 26, 2017
The current release comprises 10,000 training images and 5,000 validation images with saliency annotations. For the training and validation sets, we provide the color images in JPG format, the image resolution, and the ground truth (including gaze trajectories, fixation points, and saliency maps). The test set of 5,000 images is released without ground truth. All images are selected from the 2014 release of the COCO dataset.

Download Images (3.0G)
The ground-truth saliency annotations include fixations generated from mouse trajectories. In this LSUN’17 release, we replaced the original velocity-based fixation detection algorithm with a new algorithm based on Cluster Fix that resulted in more eye-like fixations. To improve the data quality, isolated fixations with low local density have been excluded. Each image’s ground truth is in a separate MATLAB file: [image_name].mat
The training and validation sets, provided with ground truth, contain the following data fields:
- image: The name of the image.
- resolution: The image resolution, [height, width].
- gaze: The ground truth gaze data from subjects. Each structure corresponds to one subject, which contains:
- location: the image location of the raw gaze points, [x,y].
- timestamp: the timestamp (millisecond) of each raw gaze point.
- fixations: the fixation points, [x,y].
The testing data contains only the image and resolution fields.

Download Fixations (1.4G)
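The field layout above can be explored with a short script. The sketch below first writes a tiny synthetic .mat file mimicking the documented fields (the file and image names here are placeholders, not actual dataset entries), then reads it back with scipy.io:

```python
# Sketch: reading one image's ground-truth .mat file. Field names follow the
# dataset description; the synthetic file below only mimics that layout.
import numpy as np
from scipy.io import loadmat, savemat

# Build a stand-in annotation file with the documented fields.
savemat("example_gt.mat", {
    "image": "COCO_train2014_000000000009.jpg",          # placeholder name
    "resolution": np.array([[480, 640]]),                # [height, width]
    "gaze": {                                            # one subject's record
        "location": np.array([[100, 200], [110, 205]]),  # raw gaze points [x, y]
        "timestamp": np.array([[0, 16]]),                # milliseconds
        "fixations": np.array([[105, 202]]),             # fixation points [x, y]
    },
})

gt = loadmat("example_gt.mat")
image_name = gt["image"][0]                  # MATLAB strings load as char arrays
height, width = gt["resolution"][0]
fixations = gt["gaze"]["fixations"][0, 0]    # one gaze entry per subject
```

In the real files, `gaze` is a struct array with one entry per subject, so the last index selects which subject's fixations to read.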
We also provide the ground-truth fixation maps for the training and validation sets. The fixation maps are grayscale PNG images: [image_name].png

Download Fixation Maps (0.4G)
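The provided maps are ready to use, but for reference, a common way to turn discrete fixation points into a continuous map is to blur a binary fixation image with a Gaussian. The sketch below illustrates this; the blur width `sigma` is an illustrative choice, not the value used to produce the dataset's maps:

```python
# Sketch: building a continuous fixation map from fixation points by Gaussian
# blurring (sigma is illustrative, not the dataset's actual parameter).
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_map(fixations, height, width, sigma=19):
    """fixations: (N, 2) array of [x, y] points in image coordinates."""
    fmap = np.zeros((height, width), dtype=np.float64)
    for x, y in fixations:
        fmap[int(round(y)), int(round(x))] = 1.0   # [x, y] -> row y, column x
    smap = gaussian_filter(fmap, sigma=sigma)
    return smap / smap.max() if smap.max() > 0 else smap

smap = fixation_map(np.array([[320, 240], [100, 80]]), height=480, width=640)
```

Note the coordinate convention: the dataset stores points as [x, y], while NumPy arrays index as [row, column], hence the swap inside the loop.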
To participate in the LSUN’17 saliency prediction challenge, please train and validate your model on the training and validation sets. For each test image, please output the saliency maps in PNG format (i.e., [image_name].png) and compress them in a .zip package.
When zipping the saliency maps, make sure no parent directory is included. For instance, use something like:
cd path_to_maps; zip ../submission.zip *.png
The resulting zip file should contain the PNG files directly at its top level, with no enclosing folder.
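If your pipeline is in Python, the same packaging can be done with the standard zipfile module; passing each file's bare name as `arcname` keeps the archive free of parent directories (the directory and map names below are placeholders):

```python
# Sketch: package saliency maps at the zip root, equivalent to
# `cd path_to_maps; zip ../submission.zip *.png`.
import os
import tempfile
import zipfile

def pack_submission(maps_dir, out_zip):
    """Zip all PNGs in maps_dir so entries sit at the archive root."""
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(os.listdir(maps_dir)):
            if name.endswith(".png"):
                # arcname=name strips the directory prefix from the entry
                zf.write(os.path.join(maps_dir, name), arcname=name)

# Demo with empty placeholder files (real maps would be grayscale PNGs):
maps_dir = tempfile.mkdtemp()
for n in ("map_a.png", "map_b.png"):
    open(os.path.join(maps_dir, n), "wb").close()
out_zip = os.path.join(tempfile.mkdtemp(), "submission.zip")
pack_submission(maps_dir, out_zip)
entries = zipfile.ZipFile(out_zip).namelist()
```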
To submit the results, you will need to create an account on CodaLab and register for the SALICON Saliency Prediction Challenge (LSUN 2017). Open the Participate tab to upload your submission. Once uploaded, your submissions will be evaluated automatically. To maintain consistency of evaluation results, we use the same evaluation tools as the MIT saliency benchmark. The MATLAB code can be obtained from GitHub. We will use four evaluation metrics to rank the submissions: shuffled AUC (sAUC), information gain (IG), normalized scanpath saliency (NSS), and linear correlation coefficient (CC). Other evaluation scores, such as AUC, similarity (SIM), and Kullback–Leibler divergence (KL), will also be computed but not used for ranking.
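For intuition, two of the ranking metrics can be sketched from their standard definitions: NSS is the mean of the z-scored saliency map at fixated pixels, and CC is the Pearson correlation between the predicted and ground-truth maps. This is only a sketch; the official scores come from the MIT benchmark's MATLAB code, which may differ in preprocessing details:

```python
# Sketch of NSS and CC under their standard definitions (the challenge uses
# the MIT saliency benchmark's MATLAB implementation for official scoring).
import numpy as np

def nss(saliency, fixation_mask):
    """Normalized scanpath saliency: mean of the z-scored map at fixations.
    fixation_mask: boolean array, True at fixated pixels."""
    s = (saliency - saliency.mean()) / saliency.std()
    return float(s[fixation_mask].mean())

def cc(map_a, map_b):
    """Linear correlation coefficient between two saliency maps."""
    a = (map_a - map_a.mean()) / map_a.std()
    b = (map_b - map_b.mean()) / map_b.std()
    return float((a * b).mean())

# Demo on a random prediction with two fixated pixels:
rng = np.random.default_rng(0)
pred = rng.random((48, 64))
mask = np.zeros((48, 64), dtype=bool)
mask[10, 20] = mask[30, 40] = True
nss_score, self_cc = nss(pred, mask), cc(pred, pred)
```

A map compared with itself yields CC of 1, which is a quick sanity check when wiring up an evaluation script.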
Participants can also use the previous release of SALICON (Matlab files and saliency maps, used in ’15 and ’16 challenges) to train their models. However, the evaluation will be based on the new data.
The annotations in this dataset belong to the VIP lab and are licensed under a Creative Commons Attribution 4.0 License.