ExSample: Efficient Searches on Video Repositories through Adaptive Sampling
Capturing and processing video is increasingly common as cameras become cheaper to deploy. At the same time, rich video understanding methods have progressed greatly in the last decade. As a result, many organizations now have massive repositories of video data, with applications in mapping, navigation, autonomous driving, and other areas. Because state-of-the-art object detection methods are slow and expensive, our ability to process even simple ad-hoc object search queries ('find 100 traffic lights in dashcam video') over this accumulated data lags far behind our ability to collect it. Processing video at reduced sampling rates is a reasonable default strategy for these types of queries; however, the ideal sampling rate is both data- and query-dependent. We introduce ExSample, a low-cost framework for object search over unindexed video that quickly processes search queries by adapting the amount and location of sampled frames to the particular data and query being processed. ExSample prioritizes the processing of frames in a video repository so that processing is focused on the portions of video most likely to contain objects of interest, and it continually re-prioritizes processing based on feedback from previously processed frames. On large, real-world datasets, ExSample reduces processing time by up to 6x over an efficient random sampling baseline and by several orders of magnitude over state-of-the-art methods that train specialized per-query surrogate models. ExSample is thus a key component in building cost-efficient video data management systems.
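To make the adaptive-sampling idea concrete, the sketch below shows one plausible instantiation of the feedback loop the abstract describes: the video is divided into chunks, a Thompson-sampling-style score (a draw from each chunk's Beta posterior over its hit rate) decides which chunk to sample next, and each detector result updates that chunk's posterior. All names here (`detect`, `exsample_search`) and the Beta-posterior scoring are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import random

def detect(frame_id):
    """Stub for an expensive object detector: returns True if the frame
    contains an object of interest. In practice this is a slow neural
    network invocation; the placeholder hit rate is arbitrary."""
    return random.random() < 0.01

def exsample_search(num_frames, num_chunks, wanted):
    """Collect `wanted` result frames, adaptively focusing detector calls
    on chunks whose sampled frames have yielded results so far."""
    chunk_size = num_frames // num_chunks
    hits = [0] * num_chunks    # frames with results, per chunk
    trials = [0] * num_chunks  # frames processed, per chunk
    unseen = [list(range(c * chunk_size, (c + 1) * chunk_size))
              for c in range(num_chunks)]
    results = []
    while len(results) < wanted and any(unseen):
        # Re-prioritize: sample a score from each chunk's Beta posterior;
        # chunks with better observed hit rates tend to score higher,
        # while barely-sampled chunks still get a chance to be explored.
        scores = [random.betavariate(hits[c] + 1, trials[c] - hits[c] + 1)
                  if unseen[c] else -1.0
                  for c in range(num_chunks)]
        c = max(range(num_chunks), key=lambda i: scores[i])
        frame = unseen[c].pop(random.randrange(len(unseen[c])))
        trials[c] += 1
        if detect(frame):  # the expensive step we are trying to minimize
            hits[c] += 1
            results.append(frame)
    return results

print(exsample_search(num_frames=100_000, num_chunks=100, wanted=25))
```

The key design point is that sampling effort migrates toward productive regions of the repository as evidence accumulates, rather than being fixed in advance at a single data-independent rate.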