We introduce the Qwen-VL series, a set of large-scale vision-language mo...
Can we better anticipate an actor's future actions (e.g. mix eggs) by kn...
Recent work has demonstrated the effectiveness of formulating decision m...
As travel demand increases and urban traffic conditions become more comp...
In this work, we explore a scalable way for building a general represent...
We present RND-SCI, a novel framework for compressive hyperspectral imag...
Fine-tuning visual models has been widely shown to yield promising performance on...
We present a simple, efficient, and scalable unfolding network, SAUNet, ...
Quantile regression and conditional density estimation can reveal struct...
Roundoff error problems have occurred frequently in interpolation method...
Fine-grained object retrieval aims to learn discriminative representatio...
We present an approach to efficiently and effectively adapt a masked ima...
In this paper, we present a regression-based pose recognition method usi...
Scene Graph, as a vital tool to bridge the gap between language domain a...
To promote the development of underwater robot picking in sea farms, we ...