ConfLab: A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions in the Wild

by Chirag Raman, et al.

Recording the dynamics of unscripted human interactions in the wild is challenging due to the delicate trade-offs between several factors: participant privacy, ecological validity, data fidelity, and logistical overheads. To address these trade-offs, following a 'datasets for the community by the community' ethos, we propose the Conference Living Lab (ConfLab): a new concept for multimodal multisensor data collection of in-the-wild free-standing social conversations. For the first instantiation of ConfLab described here, we organized a real-life professional networking event at a major international conference. Involving 48 conference attendees, the dataset captures a diverse mix of status, acquaintance, and networking motivations. Our capture setup improves upon the data fidelity of prior in-the-wild datasets while retaining privacy sensitivity: 8 videos (1920x1080, 60 fps) from a non-invasive overhead view, and custom wearable sensors with onboard recording of body motion (full 9-axis IMU), privacy-preserving low-frequency audio (1250 Hz), and Bluetooth-based proximity. Additionally, we developed custom solutions for distributed hardware synchronization at acquisition, and for time-efficient continuous annotation of body keypoints and actions at high sampling rates. Our benchmarks showcase some of the open research tasks related to in-the-wild privacy-preserving social data analysis: keypoint detection from overhead camera views, skeleton-based no-audio speaker detection, and F-formation detection.
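The abstract's mention of privacy-preserving low-frequency audio (1250 Hz) can be illustrated with a minimal sketch. This is not the authors' capture pipeline: the source sampling rate (40 kHz) and the block-averaging decimator below are assumptions chosen for illustration. The idea is that at 1250 Hz speech is unintelligible, while coarse energy envelopes useful for tasks like speaking-status detection survive.

```python
def decimate(samples, factor):
    """Block-average decimation: a crude low-pass filter plus
    downsampling. Each output sample is the mean of `factor`
    consecutive input samples; any trailing partial block is dropped."""
    n = len(samples) // factor
    return [sum(samples[i * factor:(i + 1) * factor]) / factor
            for i in range(n)]

# Hypothetical source rate; ConfLab's wearable hardware details differ.
src_rate = 40_000        # assumed capture rate (Hz)
target_rate = 1_250      # privacy-preserving rate from the abstract
factor = src_rate // target_rate  # 32 samples averaged per output sample

one_second = [0.0] * src_rate     # placeholder waveform, 1 s of silence
low = decimate(one_second, factor)
print(len(low))  # 1250 output samples per second of audio
```

A real pipeline would use a proper anti-aliasing filter (e.g. `scipy.signal.decimate`); block averaging is shown only to keep the sketch dependency-free.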

