Vasudev Lal

research

∙ 06/28/2023

ICSVR: Investigating Compositional and Semantic Understanding in Video Retrieval Models

Video retrieval (VR) involves retrieving the ground truth video from the...

Avinash Madasu, et al.

research

∙ 05/31/2023

ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning

Two-Tower Vision-Language (VL) models have shown promising improvements ...

Xiao Xu, et al.

research

∙ 05/20/2023

Brain encoding models based on multimodal transformers can transfer across language and vision

Encoding models have been used to assess how the human brain represents ...

Jerry Tang, et al.

research

∙ 05/18/2023

LDM3D: Latent Diffusion Model for 3D

This research paper proposes a Latent Diffusion Model for 3D (LDM3D) tha...

Gabriela Ben Melech Stan, et al.

research

∙ 02/10/2023

Is multi-modal vision supervision beneficial to language?

Vision (image and video) - Language (VL) pre-training is the recent popu...

Avinash Madasu, et al.

research

∙ 10/18/2022

Cross-Domain Aspect Extraction using Transformers Augmented with Knowledge Graphs

The extraction of aspect terms is a critical step in fine-grained sentim...

Phillip Howard, et al.

research

∙ 08/24/2022

Improving video retrieval using multilingual knowledge transfer

Video retrieval has seen tremendous progress with the development of vis...

Avinash Madasu, et al.

research

∙ 06/17/2022

Bridge-Tower: Building Bridges Between Encoders in Vision-Language Representation Learning

Vision-Language (VL) models with the Two-Tower architecture have dominat...

Xiao Xu, et al.

research

∙ 03/30/2022

VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers

Breakthroughs in transformer-based models have revolutionized not only t...

Estelle Aflalo, et al.

research

∙ 09/22/2021

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation

Self-supervised vision-and-language pretraining (VLP) aims to learn tran...

Yongfei Liu, et al.

