Recent DETR-based video grounding models have made the model directly pr...
Aerial-to-ground image synthesis is an emerging and challenging problem ...
Compositional zero-shot learning (CZSL) aims to recognize unseen composi...
Modern data augmentation using a mixture-based technique can regularize ...
Recent progress in deterministic prompt learning has become a promising
...
In this paper, we efficiently transfer the surpassing representation pow...
Temporal Action Localization (TAL) is a challenging task in video
unders...
Online Temporal Action Localization (On-TAL) aims to immediately provide...
Given an untrimmed video and a language query depicting a specific tempo...
Online stereo adaptation tackles the domain shift problem, caused by
dif...
This paper presents Probabilistic Video Contrastive Learning, a
self-sup...
The rise of deep neural networks has led to several breakthroughs for
se...
Shape deformation of targets in SAR image due to random orientation and
...
Over the past few years, image-to-image (I2I) translation methods have b...
This paper addresses the problem of single image de-raining, that is, th...
We address the problem of few-shot semantic segmentation (FSS), which ai...
This manual is intended to provide a detailed description of the DIML/CV...
Video prediction, forecasting the future frames from a sequence of input...
Domain generalization aims to learn a prediction model on multi-domain s...
We propose a novel framework for fine-grained object recognition that le...
We propose a novel cost aggregation network, called Cost Aggregation wit...
This paper presents a novel method, termed Bridge to Answer, to infer co...
We present a novel unsupervised framework for instance-level image-to-im...
In this paper, we address the problem of separating individual speech si...
Stereo matching is one of the most popular techniques to estimate dense ...
Existing techniques to adapt semantic segmentation networks across the s...
The goal of video summarization is to select keyframes that are visually...
Existing techniques to encode spatial invariance within deep convolution...
Convolutional neural networks (CNNs) based approaches for semantic align...
Traditional techniques for emotion recognition have focused on the facia...
The recent advance of monocular depth estimation is largely based on dee...
We present semantic attribute matching networks (SAM-Net) for jointly
es...
We present recurrent transformer networks (RTNs) for obtaining dense
cor...
This paper presents a deep architecture for dense semantic correspondenc...
Techniques for dense semantic correspondence have provided limited abili...
We present a descriptor, called fully convolutional self-similarity (FCS...
Regularization-based image restoration has remained an active research t...
Establishing dense correspondences between multiple images is a fundamen...
Edge-preserving smoothing (EPS) can be formulated as minimizing an objec...
We present a method for jointly predicting a depth map and intrinsic ima...
We present a novel descriptor, called deep self-convolutional activation...