Si Liu

research

∙ 09/18/2023

Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation

Audio visual segmentation (AVS) aims to segment the sounding objects for...

0 Shaofei Huang, et al. ∙

research

∙ 09/11/2023

ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation

Domain shifts such as sensor type changes and geographical situation var...

0 Bo Zhang, et al. ∙

research

∙ 08/31/2023

Towards Vehicle-to-everything Autonomous Driving: A Survey on Collaborative Perception

Vehicle-to-everything (V2X) autonomous driving opens up a promising dire...

0 Si Liu, et al. ∙

research

∙ 08/20/2023

Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation

Audio-visual navigation is an audio-targeted wayfinding task where a rob...

0 Jinyu Chen, et al. ∙

research

∙ 08/05/2023

DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation

When hearing music, it is natural for people to dance to its rhythm. Aut...

0 Qiaosong Qi, et al. ∙

research

∙ 06/29/2023

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic ...

0 Le Zhuo, et al. ∙

research

∙ 06/18/2023

MARBLE: Music Audio Representation Benchmark for Universal Evaluation

In the era of extensive intersection between art and Artificial Intellig...

4 Ruibin Yuan, et al. ∙

research

∙ 05/26/2023

A Tale of Two Approximations: Tightening Over-Approximation for DNN Robustness Verification via Under-Approximation

The robustness of deep neural networks (DNNs) is crucial to the hosting ...

0 Zhiyi Xue, et al. ∙

research

∙ 04/14/2023

DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection

This paper presents a DETR-based method for cross-domain weakly supervis...

0 Zongheng Tang, et al. ∙

research

∙ 04/09/2023

Sparse Dense Fusion for 3D Object Detection

With the prevalence of multimodal learning, camera-LiDAR fusion has gain...

0 Yulu Gao, et al. ∙

research

∙ 03/21/2023

Boosting Verified Training for Robust Image Classifications via Abstraction

This paper proposes a novel, abstraction-based, certified training metho...

0 Zhaodi Zhang, et al. ∙

research

∙ 03/10/2023

Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection

Open-vocabulary object detection aims to provide object detectors traine...

0 Luting Wang, et al. ∙

research

∙ 03/02/2023

FeatAug-DETR: Enriching One-to-Many Matching for DETRs with Feature Augmentation

One-to-one matching is a crucial design in DETR-like object detection fr...

0 Rongyao Fang, et al. ∙

research

∙ 01/18/2023

Efficient Black-box Checking of Snapshot Isolation in Databases

Snapshot isolation (SI) is a prevalent weak isolation level that avoids ...

0 Kaile Huang, et al. ∙

research

∙ 01/06/2023

Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection

Monocular 3D lane detection is a challenging task due to its lack of dep...

0 Shaofei Huang, et al. ∙

research

∙ 01/06/2023

Object as Query: Equipping Any 2D Object Detector with 3D Detection Ability

3D object detection from multi-view images has drawn much attention over...

0 Zitian Wang, et al. ∙

research

∙ 12/02/2022

Masked Contrastive Pre-Training for Efficient Video-Text Retrieval

We present a simple yet effective end-to-end Video-language Pre-training...

0 Fangxun Shu, et al. ∙

research

∙ 11/29/2022

Analyzing Infrastructure LiDAR Placement with Realistic LiDAR

Recently, Vehicle-to-Everything (V2X) cooperative perception has attracte...

0 Xinyu Cai, et al. ∙

research

∙ 11/21/2022

Video Background Music Generation: Dataset, Method and Evaluation

Music is essential when editing videos, but selecting music manually is ...

0 Le Zhuo, et al. ∙

research

∙ 11/21/2022

DualApp: Tight Over-Approximation for Neural Network Robustness Verification via Under-Approximation

The robustness of neural networks is fundamental to the hosting system's...

0 Yiting Wu, et al. ∙

research

∙ 11/21/2022

BBReach: Tight and Scalable Black-Box Reachability Analysis of Deep Reinforcement Learning Systems

Reachability analysis is a promising technique to automatically prove or...

0 Jiaxu Tian, et al. ∙

research

∙ 10/06/2022

Cross-Modality Domain Adaptation for Freespace Detection: A Simple yet Effective Baseline

As one of the fundamental functions of an autonomous driving system, freesp...

0 Yuanbin Wang, et al. ∙

research

∙ 10/04/2022

Multi-view Human Body Mesh Translator

Existing methods for human mesh recovery mainly focus on single-view fra...

0 Xiangjian Jiang, et al. ∙

research

∙ 08/21/2022

Provably Tightest Linear Approximation for Robustness Verification of Sigmoid-like Neural Networks

The robustness of deep neural networks is crucial to modern AI-enabled s...

9 Zhaodi Zhang, et al. ∙

research

∙ 08/16/2022

PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation

Human pose estimation aims to accurately estimate a wide variety of huma...

0 Wentao Jiang, et al. ∙

research

∙ 08/11/2022

PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding

Panoptic Narrative Grounding (PNG) is an emerging task whose goal is to ...

1 Zihan Ding, et al. ∙

research

∙ 07/19/2022

Target-Driven Structured Transformer Planner for Vision-Language Navigation

Vision-language navigation is the task of directing an embodied agent to...

0 Yusheng Zhao, et al. ∙

research

∙ 07/12/2022

HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors

Conventional knowledge distillation (KD) methods for object detection ma...

0 Luting Wang, et al. ∙

research

∙ 06/08/2022

Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation

Referring video object segmentation aims to predict foreground labels fo...

0 Zihan Ding, et al. ∙

research

∙ 04/20/2022

Reinforced Structured State-Evolution for Vision-Language Navigation

Vision-and-language Navigation (VLN) task requires an embodied agent to ...

0 Jinyu Chen, et al. ∙

research

∙ 04/13/2022

3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection

3D visual grounding aims to locate the referred target object in 3D poin...

1 Junyu Luo, et al. ∙

research

∙ 03/30/2022

TR-MOT: Multi-Object Tracking by Reference

Multi-object Tracking (MOT) generally can be split into two sub-tasks, i...

0 Mingfei Chen, et al. ∙

research

∙ 03/26/2022

GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection

The task of Human-Object Interaction (HOI) detection could be divided in...

0 Yue Liao, et al. ∙

research

∙ 03/15/2022

Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation

In this paper, we present a novel Distribution-Aware Single-stage (DAS) ...

0 Zitian Wang, et al. ∙

research

∙ 11/16/2021

Video Background Music Generation with Controllable Music Transformer

In this work, we address the task of video background music generation. ...

0 Shangzhe Di, et al. ∙

research

∙ 10/12/2021

Improved Pillar with Fine-grained Feature for 3D Object Detection

3D object detection with LiDAR point clouds plays an important role in a...

0 Jiahui Fu, et al. ∙

research

∙ 08/11/2021

Mining the Benefits of Two-stage and One-stage HOI Detection

Two-stage methods have dominated Human-Object Interaction (HOI) detectio...

0 Aixi Zhang, et al. ∙

research

∙ 08/05/2021

TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding

Recently proposed fine-grained 3D visual grounding is an essential and c...

1 Dailan He, et al. ∙

research

∙ 06/08/2021

Discriminative Triad Matching and Reconstruction for Weakly Referring Expression Grounding

In this paper, we are tackling the weakly-supervised referring expressio...

5 Mingjie Sun, et al. ∙

research

∙ 05/26/2021

PSGAN++: Robust Detail-Preserving Makeup Transfer and Removal

In this paper, we address the makeup transfer and removal tasks simultan...

7 Si Liu, et al. ∙

research

∙ 05/24/2021

Human-centric Relation Segmentation: Dataset and Solution

Vision and language understanding techniques have achieved remarkable pr...

3 Si Liu, et al. ∙

research

∙ 05/15/2021

Cross-Modal Progressive Comprehension for Referring Segmentation

Given a natural language expression and an image/video, the goal of refe...

0 Si Liu, et al. ∙

research

∙ 05/14/2021

Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation

Language-queried video actor segmentation aims to predict the pixel-leve...

0 Tianrui Hui, et al. ∙

research

∙ 03/08/2021

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

To address the challenging task of instance-aware human part parsing, a ...

11 Tianfei Zhou, et al. ∙

research

∙ 03/03/2021

General Instance Distillation for Object Detection

In recent years, knowledge distillation has been proved to be an effecti...

0 Xing Dai, et al. ∙

research

∙ 01/20/2021

Video Relation Detection with Trajectory-aware Multi-modal Features

Video relation detection problem refers to the detection of the relation...

0 Wentao Xie, et al. ∙

research

∙ 01/11/2021

ORDNet: Capturing Omni-Range Dependencies for Scene Parsing

Learning to capture dependencies between spatial positions is essential ...

8 Shaofei Huang, et al. ∙

research

∙ 12/07/2020

Confidence-aware Non-repetitive Multimodal Transformers for TextCaps

When describing an image, reading text in the visual scene is crucial to...

0 Zhaokai Wang, et al. ∙

research

∙ 11/10/2020

Human-centric Spatio-Temporal Video Grounding With Visual Transformers

In this work, we introduce a novel task - Human-centric Spatio-Temporal V...

0 Zongheng Tang, et al. ∙

research

∙ 10/01/2020

Linguistic Structure Guided Context Modeling for Referring Image Segmentation

Referring image segmentation aims to predict the foreground mask of the ...

2 Tianrui Hui, et al. ∙
