Red Carpet to Fight Club: Partially-supervised Domain Transfer for Face Recognition in Violent Videos

by Yunus Can Bilge, et al.

In many real-world problems, there is a large discrepancy between the characteristics of the data used in training and the data encountered in deployment. A prime example is the analysis of aggression videos: in a criminal incident, suspects typically need to be identified from their clean, portrait-like photos rather than from prior video recordings. This results in three major challenges: a large domain discrepancy between violence videos and ID photos, the lack of video examples for most individuals, and limited training data availability. To mimic such scenarios, we formulate a realistic domain-transfer problem, where the goal is to transfer a recognition model trained on clean posed images to the target domain of violent videos, where training videos are available only for a subset of subjects. To this end, we introduce the WildestFaces dataset, tailored to study cross-domain recognition under a variety of adverse conditions. We divide the task of transferring a recognition model from the domain of clean images to that of violent videos into two sub-problems and tackle them using (i) stacked affine transforms for classifier transfer and (ii) attention-driven pooling for temporal adaptation. We additionally formulate a self-attention-based model for domain transfer. We establish a rigorous evaluation protocol for this clean-to-violent recognition task and present a detailed analysis of the proposed dataset and methods. Our experiments highlight the unique challenges introduced by the WildestFaces dataset and the advantages of the proposed approach.




