SAVAM — Semiautomatic Visual-Attention Modeling

Introduction

Attention maps are useful in many fields, including user interface design, computer graphics, and video processing. Many technologies, algorithms, and filters can be improved with information about the saliency distribution. In the course of this work we created a database of human eye movements captured while observers viewed various videos (static and dynamic scenes, shots from cinema-like films and scientific databases).

Features/Benefits

High quality

Diversity

Please note: although the database contains S3D videos, only the left view was shown to observers.

Data post-processing

To improve the accuracy of the data, we applied several levels of verification and correction.

The test sequence was divided into three five-minute parts. Before each part, we carried out the calibration procedure. The observer followed a target that was placed successively at 13 locations across the screen. Next, we validated the calibration by measuring the error of the gaze position at four points. If the estimated error was greater than 0.3 angular degrees, we restarted the calibration.
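The validation step above can be sketched as follows. This is a minimal illustration, not the procedure's actual implementation: the function names, the pixel size, the viewing distance, and the choice to average the error over the four validation points are all assumptions, and the eye is assumed to sit on the axis through the screen center.

```python
import math

def angular_error_deg(gaze_px, target_px, center_px, px_size_mm, distance_mm):
    """Angle in degrees between the rays from the eye to the gaze and target points."""
    def ray(point):
        # Vector from the eye to a screen point, in millimetres.
        # Assumption: the eye lies on the axis through the screen center.
        return ((point[0] - center_px[0]) * px_size_mm,
                (point[1] - center_px[1]) * px_size_mm,
                distance_mm)

    a, b = ray(gaze_px), ray(target_px)
    dot = sum(x * y for x, y in zip(a, b))
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    cross_norm = math.sqrt(sum(c * c for c in cross))
    return math.degrees(math.atan2(cross_norm, dot))

def needs_recalibration(gaze_points, target_points, center_px,
                        px_size_mm, distance_mm, threshold_deg=0.3):
    """Return True when the mean validation error exceeds the threshold.

    Assumption: the error is averaged over the validation points; the
    source only states that calibration restarts above 0.3 degrees.
    """
    errors = [angular_error_deg(g, t, center_px, px_size_mm, distance_mm)
              for g, t in zip(gaze_points, target_points)]
    return sum(errors) / len(errors) > threshold_deg
```

For example, calling `needs_recalibration` with four measured gaze points and the four validation targets returns `True` when calibration should be repeated. The threshold of 0.3 degrees is the only value taken from the text; the screen geometry has to come from the actual setup.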

To reduce inter-video influence, we inserted a cross-fade through a black frame between adjacent scenes. Additionally, to measure observer fatigue, we placed a special pattern after each three-scene part. We asked observers to track a stimulus, which let us measure the squared tracking error; we defined this error as the fatigue value. In the next step, we improved the accuracy of the estimated gaze position using a transformation obtained by averaging the eye-tracking data recorded on the calibration pattern.
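A rough sketch of these two measurements is given below. The text does not specify how the squared error is aggregated or what form the correcting transformation takes, so this example assumes a mean over samples and a pure translation (the average offset between gaze and stimulus). All function names are hypothetical.

```python
def fatigue_value(gaze_track, stimulus_track):
    """Mean squared distance (in squared pixels) between gaze and stimulus.

    Assumption: the squared tracking error is averaged over all samples.
    """
    if not gaze_track or len(gaze_track) != len(stimulus_track):
        raise ValueError("tracks must be non-empty and of equal length")
    total = sum((gx - sx) ** 2 + (gy - sy) ** 2
                for (gx, gy), (sx, sy) in zip(gaze_track, stimulus_track))
    return total / len(gaze_track)

def mean_offset(gaze_track, stimulus_track):
    """Average displacement that moves gaze samples onto the stimulus.

    Assumption: the correcting transformation is a plain translation.
    """
    n = len(gaze_track)
    dx = sum(sx - gx for (gx, _), (sx, _) in zip(gaze_track, stimulus_track)) / n
    dy = sum(sy - gy for (_, gy), (_, sy) in zip(gaze_track, stimulus_track)) / n
    return dx, dy

def apply_offset(points, offset):
    """Shift every gaze point by the estimated offset."""
    return [(x + offset[0], y + offset[1]) for x, y in points]
```

The offset estimated on the pattern can then be applied to the gaze samples of the neighbouring scenes. A richer model, such as an affine fit, would follow the same structure.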

To understand the influence of an observer’s fatigue on fixations at the end of a sequence, we asked eight observers to view the whole sequence a second time with the scenes appearing in reverse order.

Downloads

ICCP Paper (2017)

Accepted version of the paper: Download

Supplementary materials: final compression examples pdf zip

ICIP Paper (2014)

Accepted version of the paper: Download

Published version of the paper: IEEE link

Saliency-aware video encoder

A fork of the x264 video encoder that accepts custom saliency maps as an additional input to improve the quality of salient objects.
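One common way to use a saliency map in an encoder is to lower the quantizer in salient macroblocks and raise it elsewhere. The sketch below illustrates that general idea only; it is not the code of the fork, and the function name, the 16x16 block size, the linear mapping, and the offset range are assumptions made for illustration.

```python
def macroblock_qp_offsets(saliency, mb_size=16, max_offset=6.0):
    """Map a saliency map to one quantizer offset per macroblock.

    saliency: list of rows with values in [0, 1], one value per pixel.
    Returns rows of offsets: -max_offset for fully salient blocks (finer
    quantization), +max_offset for non-salient blocks (coarser).
    """
    height, width = len(saliency), len(saliency[0])
    offsets = []
    for top in range(0, height, mb_size):
        row = []
        for left in range(0, width, mb_size):
            # Collect the pixels of this block, clipped at the frame border.
            block = [saliency[y][x]
                     for y in range(top, min(top + mb_size, height))
                     for x in range(left, min(left + mb_size, width))]
            mean = sum(block) / len(block)
            # Linear mapping: mean 1.0 -> -max_offset, mean 0.0 -> +max_offset.
            row.append(max_offset * (1.0 - 2.0 * mean))
        offsets.append(row)
    return offsets
```

With this mapping, a block of medium saliency (mean 0.5) keeps the encoder's own quantizer decision, while the most and least salient blocks are pushed in opposite directions.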

View on GitHub

Robust Saliency Map Comparison

A saliency-map comparison method that is invariant to the most common transforms.

The Base of Gaze Map

To download the database, please fill in the request form.
You will receive a download link for all data by e-mail.

Reference

Citation

Y. Gitman, M. Erofeev, D. Vatolin, A. Bolshakov, A. Fedorov. “Semiautomatic Visual-Attention Modeling and Its Application to Video Compression”. 2014 IEEE International Conference on Image Processing (ICIP). Paris, France, pp. 1105-1109.

Bibtex

  @INPROCEEDINGS {
    Gitm1410:Semiautomatic,
    AUTHOR    = "Yury Gitman and Mikhail Erofeev and Dmitriy Vatolin
                 and Andrey Bolshakov and Alexey Fedorov",
    TITLE     = "Semiautomatic {Visual-Attention} Modeling and Its 
                 Application to Video Compression",
    BOOKTITLE = "2014 IEEE International Conference on Image Processing
                 (ICIP) (ICIP 2014)",
    ADDRESS   = "Paris, France",
    PAGES     = "1105-1109",
    DAYS      =  27,
    MONTH     =  oct,
    YEAR      =  2014,
    KEYWORDS  = "Saliency;Visual attention;Eye-tracking;Saliency-aware
                 compression;H.264",
  }

Application to video compression

Proposed method, 1920x1080, 1500 kbps

Acknowledgments

This work was supported by the Intel/Cisco Video Aware Wireless Networking (VAWN) Program. We thank the Institute of Information Transmission Problems for help with eye tracking.

23 Dec 2017