A Simple Model for Subject Behavior in Subjective Experiments

by Zhi Li, et al.

In a subjective experiment to evaluate the perceptual audiovisual quality of multimedia and television services, raw opinion scores offered by subjects are often noisy and unreliable. Recommendations such as ITU-R BT.500, ITU-T P.910 and ITU-T P.913 standardize post-processing procedures to clean up the raw opinion scores, using techniques such as subject outlier rejection and bias removal. In this paper, we analyze the prior standardized techniques to demonstrate their weaknesses. As an alternative, we propose a simple model to account for the two most dominant forms of subject inaccuracy: bias (aka systematic error) and inconsistency (aka random error). We further show that this model can also effectively deal with inattentive subjects who give random scores. We propose to use maximum likelihood estimation (MLE) to jointly estimate the model parameters, and present two numerical solvers: the first based on the Newton-Raphson method, and the second based on alternating projection. We show that the second solver can be considered a generalization of the subject bias removal procedure in ITU-T P.913. We compare the proposed methods with the standardized techniques using real datasets and synthetic simulations, and demonstrate that the proposed methods offer better model-data fit, tighter confidence intervals, better robustness against subject outliers, shorter runtime, the absence of hard-coded parameters and thresholds, and auxiliary information on test subjects. The source code for this work is open-sourced at https://github.com/Netflix/sureal.
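To make the bias/inconsistency idea concrete, the following is a minimal sketch of an alternating-projection style estimator. It rests on assumptions that the abstract does not spell out: each raw score is taken to be the true quality of the stimulus plus a per-subject bias plus zero-mean Gaussian noise scaled by a per-subject inconsistency, and every subject is assumed to rate every stimulus. The function names `simulate_scores` and `alternating_projection` are hypothetical and are not the API of the sureal package.

```python
import random
import statistics


def simulate_scores(true_quality, biases, inconsistencies, seed=0):
    """Draw scores[i][j] = quality_j + bias_i + inconsistency_i * N(0, 1).

    This additive Gaussian form is an assumption made for illustration.
    """
    rng = random.Random(seed)
    return [
        [q + b + s * rng.gauss(0.0, 1.0) for q in true_quality]
        for b, s in zip(biases, inconsistencies)
    ]


def alternating_projection(scores, max_iter=100, tol=1e-8):
    """Estimate per-stimulus quality, per-subject bias and inconsistency.

    scores[i][j] is the score given by subject i to stimulus j.
    """
    n_subj, n_stim = len(scores), len(scores[0])
    # Start from the plain mean opinion score of each stimulus.
    quality = [sum(scores[i][j] for i in range(n_subj)) / n_subj
               for j in range(n_stim)]
    bias = [0.0] * n_subj
    inconsistency = [1.0] * n_subj
    for _ in range(max_iter):
        # Residual of each score with respect to the current quality estimate.
        residuals = [[scores[i][j] - quality[j] for j in range(n_stim)]
                     for i in range(n_subj)]
        # Bias: mean residual per subject; inconsistency: its spread.
        bias = [sum(r) / n_stim for r in residuals]
        inconsistency = [max(statistics.pstdev(r), 1e-6) for r in residuals]
        # Re-estimate quality from bias-corrected scores, down-weighting
        # inconsistent subjects by their inverse variance.
        weights = [1.0 / s ** 2 for s in inconsistency]
        total = sum(weights)
        new_quality = [
            sum(w * (scores[i][j] - bias[i]) for i, w in enumerate(weights)) / total
            for j in range(n_stim)
        ]
        change = max(abs(a - b) for a, b in zip(new_quality, quality))
        quality = new_quality
        if change < tol:
            break
    return quality, bias, inconsistency


# Usage: five careful subjects (one with a large positive bias) and one noisy subject.
rng = random.Random(1)
true_quality = [rng.uniform(1.0, 5.0) for _ in range(40)]
biases = [1.5, 0.0, 0.0, 0.0, 0.0, 0.0]
inconsistencies = [0.3, 0.3, 0.3, 0.3, 0.3, 1.5]
scores = simulate_scores(true_quality, biases, inconsistencies, seed=0)
quality, bias, inconsistency = alternating_projection(scores)
```

In this sketch an inattentive subject shows up as a large estimated inconsistency and is automatically down-weighted, rather than being rejected by a hard threshold. Note that quality and bias are only determined up to a common offset under this assumed model, so the estimated biases are best read relative to one another.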


