We design a new family of hybrid CNN-ViT neural networks, named FasterVi...
Video compression is a central feature of the modern internet powering
t...
Conventional CNNs for texture synthesis consist of a sequence of
(de)-co...
Multi-scale inference is commonly used to improve the results of semanti...
Unsupervised landmark learning is the task of learning semantic keypoint...
We propose a novel approach for image segmentation that combines Neural
...
Video-to-video synthesis (vid2vid) aims at converting an input semantic
...
Prediction and interpolation for long-range video data involves the comp...
Learning to synthesize high frame rate videos via interpolation requires...
Most scene graph generators use a two-stage pipeline to detect visual
re...
Semantic segmentation requires large amounts of pixel-wise annotations t...
In this paper, we present a simple yet effective padding scheme that can...
We propose an efficient and interpretable scene graph generator. We cons...
We present an approach for high-resolution video frame prediction by
con...
This article describes the model we built that achieved 1st place in the...
We study the problem of video-to-video synthesis, whose goal is to learn...
Existing deep learning based image inpainting methods use a standard
con...
We present a new method for synthesizing high-resolution photo-realistic...