In this paper we introduce the Temporo-Spatial Vision Transformer (TSViT...
In training machine learning models for land cover semantic segmentation...
This report presents design considerations for automatically generating
...
In this paper we propose a fully-supervised pretraining scheme based on
...
We propose a compact architecture based on fully convolutional neural
ne...
Synthesising 3D facial motion from speech is a crucial problem manifesti...