No-Reference Video Quality Assessment using Multi-Level Spatially Pooled Features

by Franz Götz-Hahn, et al.

Video Quality Assessment (VQA) methods have been designed with a focus on particular degradation types, usually artificially induced on a small set of reference videos. Hence, most traditional VQA methods underperform on in-the-wild videos. Deep learning approaches have had limited success due to the small size and limited diversity of existing VQA datasets, whether artificially or authentically distorted. We introduce a new in-the-wild VQA dataset that is substantially larger and more diverse: FlickrVid-150k. It consists of a coarsely annotated set of 153,841 videos with 5 quality ratings each, and 1,600 videos with a minimum of 89 ratings each. Additionally, we propose new efficient VQA approaches (MLSP-VQA) relying on multi-level spatially pooled deep features (MLSP). They are extremely well suited to training at scale compared with deep transfer learning approaches. Our best method, MLSP-VQA-FF, improves the Spearman Rank-order Correlation Coefficient (SRCC) on the standard KonVid-1k in-the-wild benchmark dataset to 0.83, surpassing both the best existing deep-learning model (0.8 SRCC) and the best hand-crafted feature-based method (0.78 SRCC). We further investigate how alternative approaches perform under different levels of label noise and different dataset sizes, showing that MLSP-VQA-FF is the best method overall. Finally, we show that MLSP-VQA-FF trained on FlickrVid-150k sets a new state of the art for cross-test performance on KonVid-1k and LIVE-Qualcomm, with SRCCs of 0.79 and 0.58, respectively, showing excellent generalization.
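Since every result above is reported as an SRCC, a minimal sketch of how that metric is computed may help. It is an illustration only, not the authors' evaluation code: the function names `average_ranks` and `srcc` and the sample scores are hypothetical. The sketch ranks the predicted scores and the subjective scores separately (tied values share their mean rank) and then takes the Pearson correlation of the two rank lists.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def srcc(predicted, subjective):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    if len(predicted) != len(subjective) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rp, rs = average_ranks(predicted), average_ranks(subjective)
    n = len(rp)
    mean_p, mean_s = sum(rp) / n, sum(rs) / n
    cov = sum((a - mean_p) * (b - mean_s) for a, b in zip(rp, rs))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_s = sum((b - mean_s) ** 2 for b in rs)
    if var_p == 0 or var_s == 0:
        raise ValueError("SRCC is undefined when one sequence is constant")
    return cov / sqrt(var_p * var_s)


if __name__ == "__main__":
    # Hypothetical model outputs and mean opinion scores for five videos.
    model_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
    opinion_scores = [2.0, 1.0, 4.0, 3.0, 5.0]
    print(round(srcc(model_scores, opinion_scores), 3))
```

Because SRCC depends only on rank order, it rewards a model for ordering videos the way human raters do, regardless of the scale of its raw outputs.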




