Xiaoshuai Sun

research

∙ 09/04/2023

Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models

With ever increasing parameters and computation, vision-language pre-tra...

0 Qiong Wu, et al. ∙

research

∙ 08/31/2023

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation

In 3D Referring Expression Segmentation (3D-RES), the earlier approach a...

0 Changli Wu, et al. ∙

research

∙ 08/11/2023

Continual Face Forgery Detection via Historical Distribution Preserving

Face forgery techniques have advanced rapidly and pose serious security ...

0 Ke Sun, et al. ∙

research

∙ 08/06/2023

Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation

In recent years, 3D representation learning has turned to 2D vision-lang...

0 Haowei Wang, et al. ∙

research

∙ 07/31/2023

Towards General Visual-Linguistic Face Forgery Detection

Deepfakes are realistic face manipulations that can pose serious threats...

0 Ke Sun, et al. ∙

research

∙ 06/30/2023

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

Deep neural networks often suffer from poor generalization due to comple...

0 Peng Mi, et al. ∙

research

∙ 06/01/2023

Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting

Pre-trained language models (PLMs) have played an increasing role in mul...

0 Shubin Huang, et al. ∙

research

∙ 05/24/2023

Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models

Recently, growing interest has been aroused in extending the multimodal ...

0 Gen Luo, et al. ∙

research

∙ 03/28/2023

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance

Text-driven 3D stylization is a complex and crucial task in the fields o...

0 Yiwei Ma, et al. ∙

research

∙ 03/15/2023

Active Teacher for Semi-Supervised Object Detection

In this paper, we study teacher-student learning from the perspective of...

0 Peng Mi, et al. ∙

research

∙ 02/22/2023

Towards End-to-end Semi-supervised Learning for One-stage Object Detection

Semi-supervised object detection (SSOD) is a research hot spot in comput...

0 Gen Luo, et al. ∙

research

∙ 02/16/2023

Towards Efficient Visual Adaption via Structural Re-parameterization

Parameter-efficient transfer learning (PETL) is an emerging research spo...

0 Gen Luo, et al. ∙

research

∙ 02/13/2023

Towards Local Visual Modeling for Image Captioning

In this paper, we study the local visual modeling with grid features for...

0 Yiwei Ma, et al. ∙

research

∙ 01/09/2023

Towards Real-Time Panoptic Narrative Grounding by an End-to-End Grounding Network

Panoptic Narrative Grounding (PNG) is an emerging cross-modal grounding ...

0 Haowei Wang, et al. ∙

research

∙ 10/11/2022

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

Deep neural networks often suffer from poor generalization caused by com...

0 Peng Mi, et al. ∙

research

∙ 07/16/2022

Clover: Towards A Unified Video-Language Alignment and Fusion Model

Building a universal video-language model for solving various video unde...

0 Jingjia Huang, et al. ∙

research

∙ 07/15/2022

X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval

Video-text retrieval has been a crucial and fundamental task in multi-mo...

0 Yiwei Ma, et al. ∙

research

∙ 04/17/2022

What Goes beyond Multi-modal Fusion in One-stage Referring Expression Comprehension: An Empirical Study

Most of the existing work in one-stage referring expression comprehensio...

0 Gen Luo, et al. ∙

research

∙ 04/16/2022

Towards Lightweight Transformer via Group-wise Transformation for Vision-and-Language Tasks

Despite the exciting performance, Transformer is criticized for its exce...

0 Gen Luo, et al. ∙

research

∙ 04/02/2022

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation

Pixel synthesis is a promising research paradigm for image generation, w...

10 Jing He, et al. ∙

research

∙ 03/30/2022

SeqTR: A Simple yet Universal Network for Visual Grounding

In this paper, we propose a simple yet universal network termed SeqTR fo...

4 Chaoyang Zhu, et al. ∙

research

∙ 03/13/2022

Global2Local: A Joint-Hierarchical Attention for Video Captioning

Recently, automatic video captioning has attracted increasing attention,...

0 Chengpeng Dai, et al. ∙

research

∙ 03/12/2022

Differentiated Relevances Embedding for Group-based Referring Expression Comprehension

Referring expression comprehension (REC) aims to locate a certain object...

0 Fuhai Chen, et al. ∙

research

∙ 10/17/2021

Towards Language-guided Visual Recognition via Dynamic Convolutions

In this paper, we are committed to establishing a unified and end-to-en...

0 Gen Luo, et al. ∙

research

∙ 01/16/2021

Dual-Level Collaborative Transformer for Image Captioning

Descriptive region features extracted by object detection networks have ...

11 Yunpeng Luo, et al. ∙

research

∙ 12/13/2020

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network

Transformer-based architectures have shown great success in image captio...

0 Jiayi Ji, et al. ∙

research

∙ 12/01/2020

Fast Class-wise Updating for Online Hashing

Online image hashing has received increasing research attention recently...

0 Mingbao Lin, et al. ∙

research

∙ 03/19/2020

Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

Referring expression comprehension (REC) and segmentation (RES) are two ...

0 Gen Luo, et al. ∙

research

∙ 12/07/2019

A Real-time Global Inference Network for One-stage Referring Expression Comprehension

Referring Expression Comprehension (REC) is an emerging research spot in...

19 Yiyi Zhou, et al. ∙

research

∙ 11/20/2019

SSAH: Semi-supervised Adversarial Deep Hashing with Self-paced Hard Sample Generation

Deep hashing methods have been proved to be effective and efficient for ...

0 Sheng Jin, et al. ∙

research

∙ 10/21/2019

Hadamard Codebook Based Deep Hashing

As an approximate nearest neighbor search technique, hashing has been wi...

0 Shen Chen, et al. ∙

research

∙ 10/18/2019

Toward 3D Object Reconstruction from Stereo Images

Inferring the 3D shape of an object from an RGB image has shown impressi...

39 Hongxun Yao, et al. ∙

research

∙ 10/14/2019

Sketch-Specific Data Augmentation for Freehand Sketch Recognition

Sketch recognition remains a significant challenge due to the limited tr...

9 Ying Zheng, et al. ∙

research

∙ 10/14/2019

Deep Semantic Parsing of Freehand Sketches with Homogeneous Transformation, Soft-Weighted Loss, and Staged Learning

In this paper, we propose a novel deep framework for part-level semantic...

47 Ying Zheng, et al. ∙

research

∙ 10/09/2019

Semantic-aware Image Deblurring

Image deblurring has achieved exciting progress in recent years. However...

6 Fuhai Chen, et al. ∙

research

∙ 08/07/2019

Scene-based Factored Attention for Image Captioning

Image captioning has attracted ever-increasing research attention in the...

0 Chen Shen, et al. ∙

research

∙ 08/06/2019

Semi-Supervised Adversarial Monocular Depth Estimation

In this paper, we address the problem of monocular depth estimation when...

8 Rongrong Ji, et al. ∙

research

∙ 06/04/2019

Information Competing Process for Learning Diversified Representations

Learning representations with diversified information remains an open pr...

0 Jie Hu, et al. ∙

research

∙ 05/31/2019

Supervised Online Hashing via Similarity Distribution Learning

Online hashing has attracted extensive research attention when facing st...

0 Mingbao Lin, et al. ∙

research

∙ 05/11/2019

Hadamard Matrix Guided Online Hashing

Online image hashing has received increasing research attention recently...

0 Mingbao Lin, et al. ∙

research

∙ 01/31/2019

Pix2Vox: Context-aware 3D Reconstruction from Single and Multi-view Images

Recovering the 3D representation of an object from single-view or multi-...

0 Hongxun Yao, et al. ∙

research

∙ 01/29/2019

Towards Optimal Discrete Online Hashing with Balanced Similarity

When facing large-scale image datasets, online hashing serves as a promi...

0 Mingbao Lin, et al. ∙

research

∙ 11/09/2018

Semantic and Contrast-Aware Saliency

In this paper, we proposed an integrated model of semantic-aware and con...

0 Xiaoshuai Sun, et al. ∙

research

∙ 05/08/2018

The Effectiveness of Instance Normalization: a Strong Baseline for Single Image Dehazing

We propose a novel deep neural network architecture for the challenging ...

0 Zheng Xu, et al. ∙

