Paul Hongsuck Seo

research

∙ 03/31/2023

Zero-shot Referring Image Segmentation with Global-Local Context Features

Referring image segmentation (RIS) aims to find a segmentation mask give...

Seonghoon Yu, et al.

research

∙ 03/29/2023

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR

Audiovisual automatic speech recognition (AV-ASR) aims to improve the ro...

Paul Hongsuck Seo, et al.

research

∙ 03/25/2023

IFSeg: Image-free Semantic Segmentation via Vision-Language Model

Vision-language (VL) pre-training has recently gained much attention for...

Sukmin Yun, et al.

research

∙ 03/21/2023

CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation

Existing works on open-vocabulary semantic segmentation have utilized la...

Seokju Cho, et al.

research

∙ 02/27/2023

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

In this work, we introduce Vid2Seq, a multi-modal single-stage dense eve...

Antoine Yang, et al.

research

∙ 11/18/2022

AVATAR submission to the Ego4D AV Transcription Challenge

In this report, we describe our submission to the Ego4D AudioVisual (AV)...

Paul Hongsuck Seo, et al.

research

∙ 06/15/2022

AVATAR: Unconstrained Audiovisual Speech Recognition

Audio-visual automatic speech recognition (AV-ASR) is an extension of AS...

Valentin Gabeur, et al.

research

∙ 04/01/2022

Learning Audio-Video Modalities from Image Captions

A major challenge in text-video and text-audio retrieval is the lack of ...

Arsha Nagrani, et al.

research

∙ 01/20/2022

End-to-end Generative Pretraining for Multimodal Video Captioning

Recent video and language pretraining frameworks lack the ability to gen...

Paul Hongsuck Seo, et al.

research

∙ 12/10/2020

Look Before you Speak: Visually Contextualized Utterances

While most conversational AI systems focus on textual dialogue only, con...

Paul Hongsuck Seo, et al.

research

∙ 11/21/2019

Reinforcing an Image Caption Generator Using Off-Line Human Feedback

Human ratings are currently the most accurate way to assess the quality ...

Paul Hongsuck Seo, et al.

research

∙ 10/03/2019

Regularizing Neural Networks via Stochastic Branch Layers

We introduce a novel stochastic regularization technique for deep neural...

Wonpyo Park, et al.

research

∙ 09/28/2018

Confidence Calibration in Deep Neural Networks through Stochastic Inferences

We propose a generic framework to calibrate accuracy and confidence (sco...

Seonguk Seo, et al.

research

∙ 08/06/2018

CPlaNet: Enhancing Image Geolocalization by Combinatorial Partitioning of Maps

Image geolocalization is the task of identifying the location depicted i...

Paul Hongsuck Seo, et al.

research

∙ 08/06/2018

Attentive Semantic Alignment with Offset-Aware Correlation Kernels

Semantic correspondence is the problem of establishing correspondences a...

Paul Hongsuck Seo, et al.

research

∙ 09/23/2017

Visual Reference Resolution using Attention Memory for Visual Dialog

Visual dialog is a task of answering a series of inter-dependent questio...

Paul Hongsuck Seo, et al.

research

∙ 12/06/2016

MarioQA: Answering Questions by Watching Gameplay Videos

We present a framework to analyze various aspects of models for video qu...

Jonghwan Mun, et al.

research

∙ 06/08/2016

Progressive Attention Networks for Visual Attribute Prediction

We propose a novel attention model that can accurately attend to target ...

Paul Hongsuck Seo, et al.

research

∙ 11/18/2015

Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction

We tackle image question answering (ImageQA) problem by learning a convo...

Hyeonwoo Noh, et al.
