Alexander Hauptmann

research

∙ 09/18/2023

Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin

Recent research in representation learning has shown that hierarchical d...

0 Gabriel Moreira, et al. ∙

research

∙ 03/16/2021

Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models

This paper studies zero-shot cross-lingual transfer of vision-language m...

7 Po-Yao Huang, et al. ∙

research

∙ 10/06/2020

Support-set bottlenecks for video-text representation learning

The dominant paradigm for learning video-text representations – noise co...

1 Mandela Patrick, et al. ∙

research

∙ 08/11/2020

Robust Long-Term Object Tracking via Improved Discriminative Model Prediction

We propose an improved discriminative model prediction method for robust...

16 Seokeon Choi, et al. ∙

research

∙ 07/30/2020

From A Glance to "Gotcha": Interactive Facial Image Retrieval with Progressive Relevance Feedback

Facial image retrieval plays a significant role in forensic investigatio...

9 Xinru Yang, et al. ∙

research

∙ 05/06/2020

Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting

Unsupervised machine translation (MT) has recently achieved impressive r...

1 Po-Yao Huang, et al. ∙

research

∙ 03/12/2020

ZSTAD: Zero-Shot Temporal Activity Detection

An integral part of video analysis and surveillance is temporal activity...

12 Lingling Zhang, et al. ∙

research

∙ 01/29/2020

Gun Source and Muzzle Head Detection

There is a surging need across the world for protection against gun viol...

16 Zhong Zhou, et al. ∙

research

∙ 12/13/2019

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction

This paper studies the problem of predicting the distribution over multi...

32 Junwei Liang, et al. ∙

research

∙ 09/30/2019

Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations

With the aim of promoting and understanding the multilingual version of ...

0 Po-Yao Huang, et al. ∙

research

∙ 09/17/2019

Improving the Learning of Multi-column Convolutional Neural Network for Crowd Counting

Tremendous variation in the scale of people/head size is a critical prob...

1 Zhi-Qi Cheng, et al. ∙

research

∙ 09/16/2019

Learning Spatial Awareness to Improve Crowd Counting

The aim of crowd counting is to estimate the number of people in images ...

32 Zhi-Qi Cheng, et al. ∙

research

∙ 07/11/2019

Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos

Contextual reasoning is essential to understand events in long untrimmed...

0 Shizhe Chen, et al. ∙

research

∙ 06/02/2019

Unsupervised Bilingual Lexicon Induction from Mono-lingual Multimodal Data

Bilingual lexicon induction, translating words from the source language ...

0 Shizhe Chen, et al. ∙

research

∙ 05/26/2019

Technical Report of the Video Event Reconstruction and Analysis (VERA) System - Shooter Localization, Models, Interface, and Beyond

Every minute, hundreds of hours of video are uploaded to social media si...

2 Junwei Liang, et al. ∙

research

∙ 05/26/2019

Technical Report of the DAISY System -- Shooter Localization, Models, Interface, and Beyond

Nowadays a huge number of user-generated videos are uploaded to social m...

2 Junwei Liang, et al. ∙

research

∙ 04/04/2019

ExCL: Extractive Clip Localization Using Natural Language Descriptions

The task of retrieving clips within videos based on a given natural lang...

0 Soham Ghosh, et al. ∙

research

∙ 02/11/2019

Peeking into the Future: Predicting Future Person Activities and Locations in Videos

Deciphering human behaviors to predict their future paths/trajectories a...

9 Junwei Liang, et al. ∙

research

∙ 11/29/2018

Perceiving Physical Equation by Observing Visual Scenarios

Inferring universal laws of the environment is an important ability of h...

1 Siyu Huang, et al. ∙

research

∙ 11/29/2018

Traffic Danger Recognition With Surveillance Cameras Without Training Data

We propose a traffic danger recognition model that works with arbitrary ...

0 Lijun Yu, et al. ∙

research

∙ 09/16/2018

CADP: A Novel Dataset for CCTV Traffic Camera based Accident Analysis

This paper presents a novel dataset for traffic accidents analysis. Our ...

0 Ankit Shah, et al. ∙

research

∙ 09/16/2018

Accident Forecasting in CCTV Traffic Camera Videos

This paper presents a novel dataset for traffic accidents analysis. Our g...

0 Ankit Shah, et al. ∙

research

∙ 09/01/2018

Activity Recognition on a Large Scale in Short Videos - Moments in Time Dataset

Moments capture a huge part of our lives. Accurate recognition of these ...

0 Ankit Shah, et al. ∙

research

∙ 08/22/2018

Stacked Pooling: Improving Crowd Counting by Boosting Scale Invariance

In this work, we explore the cross-scale similarity in crowd counting sc...

1 Siyu Huang, et al. ∙

research

∙ 06/22/2018

RUC+CMU: System Report for Dense Captioning Events in Videos

This notebook paper presents our system in the ActivityNet Dense Caption...

0 Shizhe Chen, et al. ∙

research

∙ 06/05/2018

Focal Visual-Text Attention for Visual Question Answering

Recent insights on language and vision with neural networks have been su...

2 Junwei Liang, et al. ∙

research

∙ 04/19/2018

GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning

A key problem in deep multi-attribute learning is to effectively discove...

1 Siyu Huang, et al. ∙

research

∙ 04/17/2018

Multimodal Co-Training for Selecting Good Examples from Webly Labeled Video

We tackle the problem of learning concept classifiers from videos on the...

2 Ryota Hinami, et al. ∙

research

∙ 08/31/2017

Video Captioning with Guidance of Multimodal Latent Topics

The topic diversity of open-domain videos leads to various vocabularies ...

0 Shizhe Chen, et al. ∙

research

∙ 08/04/2017

MemexQA: Visual Memex Question Answering

This paper proposes a new task, MemexQA: given a collection of photos or...

3 Lu Jiang, et al. ∙

