Testing the Depth of ChatGPT's Comprehension via Cross-Modal Tasks Based on ASCII-Art: GPT3.5's Abilities in Regard to Recognizing and Generating ASCII-Art Are Not Totally Lacking

07/28/2023
by David Bayani, et al.

Over the eight months since its release, ChatGPT and its underlying model, GPT3.5, have garnered massive attention due to their potent mix of capability and accessibility. While a niche industry of papers has emerged examining the scope of capabilities these models possess, the information fed to and extracted from these networks has been either natural-language text or stylized, code-like language. Drawing inspiration from the prowess we expect a truly human-level intelligent agent to have across multiple signal modalities, in this work we examine GPT3.5's aptitude for visual tasks, where the inputs feature content provided as ASCII-art without overt distillation into a lingual summary. We conduct experiments analyzing the model's performance on image recognition tasks after various transforms typical in visual settings, trials investigating knowledge of image parts, and tasks covering image generation.


