Attacking Automatic Video Analysis Algorithms: A Case Study of Google Cloud Video Intelligence API

by Hossein Hosseini, et al.

Due to the growth of video data on the Internet, automatic video analysis has gained a lot of attention from academia as well as from companies such as Facebook, Twitter and Google. In this paper, we examine the robustness of video analysis algorithms in adversarial settings. Specifically, we propose targeted attacks on two fundamental classes of video analysis algorithms, namely video classification and shot detection. We show that an adversary can subtly manipulate a video in such a way that a human observer would still perceive the content of the original video, but the video analysis algorithm returns the adversary's desired outputs. We then apply the attacks to the recently released Google Cloud Video Intelligence API. The API takes a video file and returns the video labels (objects within the video), shot changes (scene changes within the video) and shot labels (descriptions of video events over time). Through experiments, we show that the API generates video and shot labels by processing only the first frame of every second of the video. Hence, an adversary can deceive the API into outputting only her desired video and shot labels by periodically inserting an image into the video at the rate of one frame per second. We also show that the pattern of shot changes returned by the API can be mostly recovered by an algorithm that compares the histograms of consecutive frames. Based on this equivalent model, we develop a method for slightly modifying the video frames in order to deceive the API into generating our desired pattern of shot changes. We perform extensive experiments with different videos and show that our attacks are consistently successful across videos with different characteristics. Finally, we propose introducing randomness into video analysis algorithms as a countermeasure to our attacks.
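The two mechanisms named in the abstract, inserting an image once per second and detecting shot changes by comparing histograms of consecutive frames, can be illustrated with a minimal sketch. This is not the authors' implementation and it does not call the Google API. Frames are modelled as flat lists of grayscale pixel values, and the function names, the L1 histogram distance, the bin count and the threshold are all assumptions chosen for illustration; the abstract does not specify them.

```python
from typing import List, Sequence

# Toy representation (assumption): a frame is a flat list of grayscale pixels, 0..255.
Frame = Sequence[int]


def insert_periodic_image(frames: List[Frame], image: Frame, fps: int) -> List[Frame]:
    """Replace the first frame of every second with `image`.

    Mirrors the attack idea in the abstract: if only the first frame of each
    second is processed, these are the only frames the adversary must control.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [image if i % fps == 0 else frame for i, frame in enumerate(frames)]


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Normalised intensity histogram with `bins` equal-width buckets.

    Assumes the frame is non-empty.
    """
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def shot_changes(frames: List[Frame], threshold: float = 0.5, bins: int = 8) -> List[int]:
    """Return indices i where frame i differs strongly from frame i - 1.

    The L1 distance and the threshold are hypothetical choices; the abstract
    only says that histograms of consecutive frames are compared.
    """
    changes: List[int] = []
    if not frames:
        return changes
    prev = histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = histogram(frames[i], bins)
        if sum(abs(a - b) for a, b in zip(prev, cur)) > threshold:
            changes.append(i)
        prev = cur
    return changes


if __name__ == "__main__":
    dark, bright = [10] * 16, [240] * 16
    video = [dark] * 3 + [bright] * 3      # one real scene change at index 3
    print(shot_changes(video))
    attacked = insert_periodic_image(video, bright, fps=3)
    print(shot_changes(attacked))
```

In this toy setting, the inserted frames themselves trigger the histogram detector, which hints at why the label attack and the shot-change attack are treated as separate problems in the abstract. The randomness countermeasure mentioned at the end would correspond to sampling frames at unpredictable positions rather than at `i % fps == 0`.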




Deceiving Google's Cloud Video Intelligence API Built for Summarizing Videos

Despite the rapid progress of the techniques for image classification, v...

Adversarially Robust Frame Sampling with Bounded Irregularities

In recent years, video analysis tools for automatically extracting meani...

Automatic cinematography for 360 video

We describe our method for automatic generation of a visually interestin...

Identifying and Resisting Adversarial Videos Using Temporal Consistency

Video classification is a challenging task in computer vision. Although ...

Patternless Adversarial Attacks on Video Recognition Networks

Deep neural networks for classification of videos, just like image class...

Countering Inconsistent Labelling by Google's Vision API for Rotated Images

Google's Vision API analyses images and provides a variety of output pre...

Vronicle: A System for Producing Videos with Verifiable Provenance

Demonstrating the veracity of videos is a longstanding problem that has ...
