In this paper, we address the problem of referring expression comprehens...
In this paper, we address self-supervised representation learning from h...
Fashion is the way we present ourselves to the world and has become one ...
With the prevalence of RGB-D cameras, multi-modal video data have become...
In this paper, we address unsupervised pose-guided person image generati...
Temporal modeling in videos is a fundamental yet challenging problem in
...
Despite the fact that many 3D human activity benchmarks being proposed, ...
Human action recognition is an important task in computer vision. Extrac...