Ji Zhang

research

∙ 09/14/2023

DePT: Decoupled Prompt Tuning

This work breaks through the Base-New Tradeoff (BNT) dilemma in prompt tu...

0 Ji Zhang, et al. ∙

research

∙ 09/02/2023

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models

Large language models (LLMs) have recently demonstrated remarkable capab...

0 Chenliang Li, et al. ∙

research

∙ 08/29/2023

Evaluation and Analysis of Hallucination in Large Vision-Language Models

Large Vision-Language Models (LVLMs) have recently achieved remarkable s...

0 Junyang Wang, et al. ∙

research

∙ 08/20/2023

From Global to Local: Multi-scale Out-of-distribution Detection

Out-of-distribution (OOD) detection aims to detect "unknown" data whose ...

0 Ji Zhang, et al. ∙

research

∙ 08/07/2023

COPA: Efficient Vision-Language Pre-training Through Collaborative Object- and Patch-Text Alignment

Vision-Language Pre-training (VLP) methods based on object detection enj...

0 Chaoya Jiang, et al. ∙

research

∙ 07/19/2023

CValues: Measuring the Values of Chinese Large Language Models from Safety to Responsibility

With the rapid evolution of large language models (LLMs), there is a gro...

0 Guohai Xu, et al. ∙

research

∙ 07/04/2023

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Document understanding refers to automatically extract, analyze and comp...

0 Jiabo Ye, et al. ∙

research

∙ 06/07/2023

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

To promote the development of Vision-Language Pre-training (VLP) and mul...

0 Haiyang Xu, et al. ∙

research

∙ 05/14/2023

Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering

Existing knowledge-enhanced methods have achieved remarkable results in ...

0 Qianglong Chen, et al. ∙

research

∙ 05/13/2023

AMTSS: An Adaptive Multi-Teacher Single-Student Knowledge Distillation Framework For Multilingual Language Inference

Knowledge distillation is of key importance to launching multilingual pr...

0 Qianglong Chen, et al. ∙

research

∙ 04/27/2023

mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality

Large language models (LLMs) have demonstrated impressive zero-shot abil...

0 Qinghao Ye, et al. ∙

research

∙ 04/25/2023

ContrastMotion: Self-supervised Scene Motion Learning for Large-Scale LiDAR Point Clouds

In this paper, we propose a novel self-supervised motion estimator for L...

0 Xiangze Jia, et al. ∙

research

∙ 04/16/2023

ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human

In this paper, we present ChatPLUG, a Chinese open-domain dialogue syste...

0 Junfeng Tian, et al. ∙

research

∙ 03/11/2023

DETA: Denoised Task Adaptation for Few-Shot Learning

Test-time task adaptation in few-shot learning aims to adapt a pre-train...

0 Ji Zhang, et al. ∙

research

∙ 02/24/2023

Active Velocity Estimation using Light Curtains via Self-Supervised Multi-Armed Bandits

To navigate in an environment safely and autonomously, robots must accur...

0 Siddharth Ancha, et al. ∙

research

∙ 02/06/2023

PASCAL: A Learning-aided Cooperative Bandwidth Control Policy for Hierarchical Storage Systems

Nowadays, the Hierarchical Storage System (HSS) is considered as an idea...

0 Xijun Li, et al. ∙

research

∙ 02/01/2023

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video

Recent years have witnessed a big convergence of language, vision, and m...

0 Haiyang Xu, et al. ∙

research

∙ 01/28/2023

A Closer Look at Few-shot Classification Again

Few-shot classification consists of a training phase where a model is le...

0 Xu Luo, et al. ∙

research

∙ 12/30/2022

HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training

Video-language pre-training has advanced the performance of various down...

0 Qinghao Ye, et al. ∙

research

∙ 11/21/2022

Intelligent Computing: The Latest Advances, Challenges and Future

Computing is a critical driving force in the development of human civili...

16 Shiqiang Zhu, et al. ∙

research

∙ 11/14/2022

Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment

CLIP (Contrastive Language-Image Pre-Training) has shown remarkable zero...

0 Junyang Wang, et al. ∙

research

∙ 09/22/2022

MUI-TARE: Multi-Agent Cooperative Exploration with Unknown Initial Position

Multi-agent exploration of a bounded 3D environment with unknown initial...

0 Jingtian Yan, et al. ∙

research

∙ 09/20/2022

Generating Persuasive Responses to Customer Reviews with Multi-Source Prior Knowledge in E-commerce

Customer reviews usually contain much information about one's online sho...

0 Bo Chen, et al. ∙

research

∙ 09/20/2022

Incorporating Casual Analysis into Diversified and Logical Response Generation

Although the Conditional Variational AutoEncoder (CVAE) model can genera...

0 Jiayi Liu, et al. ∙

research

∙ 09/14/2022

iSimLoc: Visual Global Localization for Previously Unseen Environments with Simulated Images

The visual camera is an attractive device in beyond visual line of sight...

6 Peng Yin, et al. ∙

research

∙ 09/13/2022

Class-Level Logit Perturbation

Features, logits, and labels are the three primary data when a sample pa...

0 Mengyang Li, et al. ∙

research

∙ 08/01/2022

DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning

Although pre-trained language models (PLMs) have achieved state-of-the-a...

0 Qianglong Chen, et al. ∙

research

∙ 07/29/2022

RCA: Ride Comfort-Aware Visual Navigation via Self-Supervised Learning

Under shared autonomy, wheelchair users expect vehicles to provide safe ...

0 Xinjie Yao, et al. ∙

research

∙ 07/20/2022

Scene Recognition with Objectness, Attribute and Category Learning

Scene classification has established itself as a challenging research pr...

0 Ji Zhang, et al. ∙

research

∙ 07/19/2022

ALTO: A Large-Scale Dataset for UAV Visual Place Recognition and Localization

We present the ALTO dataset, a vision-focused dataset for the developmen...

0 Ivan Cisneros, et al. ∙

research

∙ 07/15/2022

X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval

Video-text retrieval has been a crucial and fundamental task in multi-mo...

0 Yiwei Ma, et al. ∙

research

∙ 07/14/2022

AutoMerge: A Framework for Map Assembling and Smoothing in City-scale Environments

We present AutoMerge, a LiDAR data processing framework for assembling a...

7 Peng Yin, et al. ∙

research

∙ 07/11/2022

SHREC'22 Track: Sketch-Based 3D Shape Retrieval in the Wild

Sketch-based 3D shape retrieval (SBSR) is an important yet challenging t...

0 Jie Qin, et al. ∙

research

∙ 05/24/2022

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections

Large-scale pretrained foundation models have been an emerging paradigm ...

0 Chenliang Li, et al. ∙

research

∙ 05/22/2022

ALITA: A Large-scale Incremental Dataset for Long-term Autonomy

For long-term autonomy, most place recognition methods are mainly evalua...

0 Peng Yin, et al. ∙

research

∙ 04/11/2022

MGIMN: Multi-Grained Interactive Matching Network for Few-shot Text Classification

Text classification struggles to generalize to unseen classes with very ...

0 Jianhai Zhang, et al. ∙

research

∙ 03/30/2022

Auto-MLM: Improved Contrastive Learning for Self-supervised Multi-lingual Knowledge Retrieval

Contrastive learning (CL) has become a ubiquitous approach for several n...

21 Wenshen Xu, et al. ∙

research

∙ 03/29/2022

Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding

Visual grounding focuses on establishing fine-grained alignment between ...

0 Jiabo Ye, et al. ∙

research

∙ 03/21/2022

LQoCo: Learning to Optimize Cache Capacity Overloading in Storage Systems

Cache plays an important role to maintain high and stable performance (i...

0 Ji Zhang, et al. ∙

research

∙ 03/08/2022

Deep Multi-Branch Aggregation Network for Real-Time Semantic Segmentation in Street Scenes

Real-time semantic segmentation, which aims to achieve high segmentation...

6 Xi Weng, et al. ∙

research

∙ 11/17/2021

Achieving Human Parity on Visual Question Answering

The Visual Question Answering (VQA) task utilizes both visual image and ...

0 Ming Yan, et al. ∙

research

∙ 10/27/2021

Autonomous Exploration Development Environment and the Planning Algorithms

Autonomous Exploration Development Environment is an open-source reposit...

0 Chao Cao, et al. ∙

research

∙ 10/18/2021

FAR Planner: Fast, Attemptable Route Planner using Dynamic Visibility Update

We present our work on a fast route planner based on visibility graph. T...

0 Fan Yang, et al. ∙

research

∙ 09/22/2021

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Knowledge enhanced pre-trained language models (K-PLMs) are shown to be ...

0 Fu Sun, et al. ∙

research

∙ 09/13/2021

AliMe MKG: A Multi-modal Knowledge Graph for Live-streaming E-commerce

Live streaming is becoming an increasingly popular trend of sales in E-c...

0 Guohai Xu, et al. ∙

research

∙ 08/18/2021

GGP: A Graph-based Grouping Planner for Explicit Control of Long Text Generation

Existing data-driven methods can well handle short text generation. Howe...

0 Xuming Lin, et al. ∙

research

∙ 08/17/2021

SPMoE: Generate Multiple Pattern-Aware Outputs with Sparse Pattern Mixture of Experts

Many generation tasks follow a one-to-many mapping relationship: each in...

0 Shaobo Cui, et al. ∙

research

∙ 08/16/2021

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

Vision-and-language pretraining (VLP) aims to learn generic multimodal r...

0 Yuhao Cui, et al. ∙

research

∙ 05/27/2021

i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent Environmental Conditions

We present a method for localizing a single camera with respect to a poi...

2 Peng Yin, et al. ∙

research

∙ 05/05/2021

AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss

A number of studies point out that current Visual Question Answering (VQ...

0 Yangyang Guo, et al. ∙

