SALICON – DATASET
Eye tracking is commonly used in visual neuroscience and cognitive science to study questions about visual attention and decision making. Computational models that predict where people look have direct applications to a variety of computer vision tasks. Because both the stimuli and the human cognitive processes involved are inherently complex, we envision that larger eye-tracking datasets can advance the understanding of these questions and help models emulate human viewing behavior. The scale of current eye-tracking experiments, however, is limited because accurate gaze tracking requires a customized device. With our novel psychophysical and crowdsourcing paradigm, the SALICON dataset offers a large set of saliency annotations on the popular Microsoft Common Objects in Context (MS COCO) image database. These data complement the task-specific annotations to advance the ultimate goal of visual understanding.
MS COCO is a new large-scale image dataset that highlights non-iconic views and objects in context. It presents a rich set of task-specific annotations for image recognition, segmentation, and captioning. The rich contextual information enables joint studies of image saliency and semantics. For example, by highlighting important objects, our data naturally rank the existing object categories and suggest new categories of interest.
We designed a new mouse-contingent, multi-resolution paradigm based on neurophysiological and psychophysical studies of peripheral vision to simulate the natural viewing behavior of humans. The new paradigm allows a general-purpose mouse to be used instead of an eye tracker to record viewing behavior. The experiment is deployed on Amazon Mechanical Turk to enable large-scale data collection. Aggregating the mouse trajectories from different viewers yields an estimate of the probability distribution of visual attention.
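The aggregation step can be illustrated with a minimal sketch. It assumes that each viewer's trajectory is a list of (x, y) pixel positions, that all samples are weighted equally, and that the pooled counts are smoothed with a Gaussian kernel before normalization. The function names, the `sigma` value, and the smoothing choice are assumptions made for illustration and are not taken from the SALICON pipeline.

```python
import numpy as np


def gaussian_kernel_1d(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def blur_1d(values, kernel):
    """Convolve a 1-D signal with the kernel, keeping the input length."""
    radius = len(kernel) // 2
    full = np.convolve(values, kernel, mode="full")
    return full[radius:radius + len(values)]


def attention_map(trajectories, height, width, sigma=2.0):
    """Pool mouse samples from all viewers into a probability map.

    trajectories: iterable of per-viewer lists of (x, y) pixel positions
    (a hypothetical input format). Samples outside the image are ignored.
    """
    counts = np.zeros((height, width), dtype=np.float64)
    for trajectory in trajectories:
        for x, y in trajectory:
            col, row = int(round(x)), int(round(y))
            if 0 <= row < height and 0 <= col < width:
                counts[row, col] += 1.0

    # Separable Gaussian blur: rows first, then columns.
    kernel = gaussian_kernel_1d(sigma)
    blurred = np.apply_along_axis(blur_1d, 1, counts, kernel)
    blurred = np.apply_along_axis(blur_1d, 0, blurred, kernel)

    total = blurred.sum()
    if total == 0:
        return blurred  # no valid samples: return an all-zero map
    return blurred / total  # entries now sum to 1


# Two hypothetical viewers on a 32 x 32 image.
viewers = [[(10, 12), (11, 12)], [(10, 12)]]
example_map = attention_map(viewers, height=32, width=32)
```

The resulting map sums to 1, so each entry can be read as the estimated probability that attention falls on that pixel under these assumptions.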
The paradigm was validated with both controlled laboratory data and large-scale online data. Comparisons on the OSIE dataset (700 natural images) show that eye tracking and mouse tracking produce highly similar attention maps. Given this similarity, the new method provides a reasonable ground truth for saliency prediction and other computer vision tasks. For saliency benchmarking, model rankings have been shown to be consistent across datasets (OSIE and SALICON) and across multiple scenarios (eye tracking, mouse tracking in a laboratory setting, and mouse tracking through Amazon Mechanical Turk).
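As a rough illustration of the two kinds of comparison described above, the sketch below measures the similarity of two attention maps and the agreement between two model rankings. The text does not state which similarity measure was used, so the Pearson linear correlation coefficient for maps and the Spearman rank correlation for rankings are assumptions chosen for illustration. The function names and the example scores are hypothetical.

```python
import numpy as np


def correlation_coefficient(map_a, map_b):
    """Pearson linear correlation between two equally sized arrays."""
    a = np.asarray(map_a, dtype=np.float64).ravel()
    b = np.asarray(map_b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("inputs must have the same number of elements")
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0:
        return 0.0  # a constant input has no defined correlation
    return float((a * b).sum() / denom)


def rank_agreement(scores_a, scores_b):
    """Spearman rank correlation of two score lists (assumes no ties)."""
    ranks_a = np.argsort(np.argsort(scores_a))
    ranks_b = np.argsort(np.argsort(scores_b))
    return correlation_coefficient(ranks_a, ranks_b)


# Hypothetical maps from two recording methods for the same image.
eye_map = np.array([[0.0, 0.1], [0.3, 0.6]])
mouse_map = np.array([[0.05, 0.1], [0.25, 0.6]])
map_similarity = correlation_coefficient(eye_map, mouse_map)

# Hypothetical scores of three models on two datasets.
ranking_consistency = rank_agreement([0.5, 0.7, 0.9], [0.6, 0.65, 0.8])
```

A value near 1 from `correlation_coefficient` indicates that the two maps agree closely, and a value of 1 from `rank_agreement` means the models are ordered identically on both datasets.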
We plan to provide more annotations for the MS COCO dataset by expanding the database periodically. This first release provides 10,000 training images. The next 10,000, for validation and testing, will be available soon. The test data will be used only for evaluating saliency algorithms on demand on this website.
Apart from the data, we will also offer a MATLAB toolkit to assist with data processing and model evaluation.