On the caveats of AI autophagy

Xiaodan Xing, Fadong Shi, Jiahao Huang, Yinzhe Wu, Yang Nan, Sheng Zhang, Yingying Fang, Michael Roberts, Carola Bibiane Schönlieb, Javier Del Ser, Guang Yang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Generative artificial intelligence (AI) technologies and large models are producing realistic outputs across various domains, such as images, text, speech and music. Creating these advanced generative models requires significant resources, particularly large and high-quality datasets. To minimize training expenses, many algorithm developers use data created by the models themselves as a cost-effective training solution. However, not all synthetic data effectively improve model performance, necessitating a strategic balance in the use of real versus synthetic data to optimize outcomes. Currently, the previously well-controlled integration of real and synthetic data is becoming uncontrollable. The widespread and unregulated dissemination of synthetic data online leads to the contamination of datasets traditionally compiled through web scraping, now mixed with unlabelled synthetic data. This trend, known as the AI autophagy phenomenon, suggests a future where generative AI systems may increasingly consume their own outputs without discernment, raising concerns about model performance, reliability and ethical implications. What will happen if generative AI continuously consumes itself without discernment? What measures can we take to mitigate the potential adverse effects? To address these research questions, this Perspective examines the existing literature, delving into the consequences of AI autophagy, analysing the associated risks and exploring strategies to mitigate its impact. Our aim is to provide a comprehensive perspective on this phenomenon advocating for a balanced approach that promotes the sustainable development of generative AI technologies in the era of large models.

Original language: English
Pages (from-to): 172-180
Number of pages: 9
Journal: Nature Machine Intelligence
Volume: 7
Issue number: 2
Publication status: Published - Feb 2025
