Michael Ryoo


Featured Co-authors

Sergey Levine, 379 publications
Xi Chen, 293 publications
Chelsea Finn, 167 publications
Federico Tombari, 130 publications
Peng Xu, 128 publications
Fahad Shahbaz Khan, 127 publications
Salman Khan, 105 publications
Yao Lu, 73 publications
Ofir Nachum, 63 publications
Krzysztof Choromanski, 63 publications
Igor Mordatch, 55 publications

research

∙ 07/28/2023

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

We study how vision-language models trained on Internet-scale data can b...

Anthony Brohan, et al.

research

∙ 07/20/2023

Language-based Action Concept Spaces Improve Video Self-Supervised Learning

Recent contrastive language image pre-training has led to learning highl...

Kanchana Ranasinghe, et al.

research

∙ 12/13/2022

RT-1: Robotics Transformer for Real-World Control at Scale

By transferring knowledge from large, diverse, task-agnostic datasets, m...

Anthony Brohan, et al.

research

∙ 04/01/2022

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

Large foundation models can exhibit unique capabilities depending on the...

Andy Zeng, et al.

research

∙ 12/02/2021

Self-supervised Video Transformer

In this paper, we propose self-supervised training for video transformer...

Kanchana Ranasinghe, et al.

research

∙ 04/14/2021

Adaptive Intermediate Representations for Video Understanding

A common strategy to video understanding is to incorporate spatial and m...

Juhana Kangaspunta, et al.

research

∙ 02/05/2018

Musical Chair: Efficient Real-Time Recognition Using Collaborative IoT Devices

The prevalence of Internet of things (IoT) devices and abundance of sens...

Ramyad Hadidi, et al.
