Visual Question Answering (VQA) aims to automatically answer natural lan...
As the rapid progression of practical applications based on Large Langua...
Nowadays, the versatile capabilities of Pre-trained Large Language Model...
Modulo sampling or unlimited sampling has recently drawn a great deal of...
In this paper, we present VideoGen, a text-to-video generation approach,...
Local differential privacy techniques for numerical data typically trans...
Identification and analysis of symmetrical patterns in the natural world...
Data compression algorithms typically rely on identifying repeated seque...
Temporal video grounding (TVG) aims to retrieve the time interval of a l...
In today's competitive and fast-evolving business environment, it is a c...
Detecting adversarial samples that are carefully crafted to fool the mod...
Adversarial training is one of the best-performing methods in improving ...
Deploying pre-trained transformer models like BERT on downstream tasks i...
Language models have the potential to assess mental health using social ...
Current clustering-based Open Relation Extraction (OpenRE) methods usual...
Semantic matching is a mainstream paradigm of zero-shot relation extract...
The existing supervised relation extraction methods have achieved impres...
Multi-modal contrastive learning (MMCL) has recently garnered considerab...
Executing actions in a correlated manner is a common strategy for human ...
Dense retrieval is widely used for entity linking to retrieve entities f...
Prompting methods such as Chain-of-Thought (CoT) have shed new light on ...
Embedding models have shown great power in knowledge graph completion (K...
Pretrained language models have achieved remarkable success in various n...
This paper studies the computational offloading of video action recognit...
Existing models for named entity recognition (NER) are mainly based on l...
To help the visually impaired enjoy movies, automatic movie narrating sy...
Models trained with empirical risk minimization (ERM) are revealed to ea...
The recognition of dataset names is a critical task for automatic inform...
The increasing prevalence of gigapixel resolutions has presented new cha...
Instruction tuning for large language models (LLMs) has gained attention...
Recent studies have shown that dual encoder models trained with the sent...
The principle of continual relation extraction (CRE) involves adapting t...
ELECTRA, the generator-discriminator pre-training framework, has achieve...
Dataset bias, i.e., the over-reliance on dataset-specific literal heuris...
We present the UrbanBIS benchmark for large-scale 3D urban understanding...
The rapid growth of social media has caused tremendous effects on inform...
Recently, intelligent reflecting surface (IRS) and unmanned aerial vehic...
Large language models have unlocked strong multi-task capabilities from ...
Exponential growth in the amount of data generated by the Internet of Th...
This paper discusses the results for the second edition of the Monocular...
Deep learning-based human activity recognition (HAR) methods have shown ...
GPT series models, such as GPT-3, CodeX, InstructGPT, ChatGPT, and so on...
While generative modeling has been ubiquitous in natural language proces...
Large Language Models (LLMs) are popular for their impressive abilities,...
The number of IoT devices is expected to continue its dramatic growth in...
In recent years, contrastive learning has achieved impressive results on sel...
Maximal biclique enumeration is a fundamental problem in bipartite graph...
Visual Simultaneous Localization and Mapping (SLAM) has received signifi...
The GPT-3.5 models have demonstrated impressive performance in various N...
Since reconfigurable intelligent surface (RIS) is considered to be a pas...