Towards Understanding the Interplay of Generative Artificial Intelligence and the Internet

by Gonzalo Martinez, et al.

The rapid adoption of generative Artificial Intelligence (AI) tools that can generate realistic images or text, such as DALL-E, MidJourney, or ChatGPT, has put the societal impacts of these technologies at the center of public debate. These tools are possible due to the massive amount of data (text and images) that is publicly available through the Internet. At the same time, these generative AI tools become content creators that are already contributing to the data that is available to train future models. Therefore, future versions of generative AI tools will be trained with a mix of human-created and AI-generated content, causing a potential feedback loop between generative AI and public data repositories. This interaction raises many questions: how will future versions of generative AI tools behave when trained on a mixture of real and AI-generated data? Will they evolve and improve with the new data sets or, on the contrary, will they degrade? Will this evolution introduce biases or reduce diversity in subsequent generations of generative AI tools? What are the societal implications of the possible degradation of these models? Can we mitigate the effects of this feedback loop? In this document, we explore the effect of this interaction and report some initial results using simple diffusion models trained with various image datasets. Our results show that the quality and diversity of the generated images can degrade over time, suggesting that incorporating AI-created data can have undesired effects on future versions of generative models.
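The feedback loop described in the abstract can be illustrated with a deliberately small toy model. The sketch below is not the authors' diffusion-model experiment: it assumes a one-dimensional Gaussian as the "generative model", a standard normal as the human-created data, and hypothetical names and parameters (`simulate_feedback_loop`, `real_fraction`, `n_samples`). Each generation fits the model to its training set, and the next training set mixes fresh real samples with samples drawn from the fitted model. Tracking the fitted standard deviation gives a rough proxy for diversity across generations.

```python
import random
import statistics


def simulate_feedback_loop(generations=10, n_samples=200, real_fraction=0.5, seed=0):
    """Toy feedback loop: fit a Gaussian, then retrain on a real/synthetic mix.

    Returns a list of (mean, stdev) tuples, one per generation.
    All distributions and parameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    real_mu, real_sigma = 0.0, 1.0  # assumed "human-created" data distribution

    # Generation 0 is trained on real data only.
    train = [rng.gauss(real_mu, real_sigma) for _ in range(n_samples)]
    history = []

    for _ in range(generations):
        # "Training": estimate the model parameters from the current data set.
        mu = statistics.fmean(train)
        sigma = statistics.stdev(train)
        history.append((mu, sigma))

        # Next data set: a share of fresh real data plus model-generated samples.
        n_real = round(real_fraction * n_samples)
        real = [rng.gauss(real_mu, real_sigma) for _ in range(n_real)]
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_samples - n_real)]
        train = real + synthetic

    return history


if __name__ == "__main__":
    # Compare a fully synthetic loop with one that keeps half real data.
    for fraction in (0.0, 0.5):
        final_mu, final_sigma = simulate_feedback_loop(real_fraction=fraction)[-1]
        print(f"real_fraction={fraction}: mean={final_mu:.3f}, stdev={final_sigma:.3f}")
```

Setting `real_fraction` to 0.0 models a loop trained only on its own output, while larger values keep the model anchored to the real distribution. How the fitted parameters drift depends on the sample size and the seed, so this sketch is best read as a way to pose the abstract's questions rather than to answer them.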




