AI Deception: A Survey of Examples, Risks, and Potential Solutions
Simon Goldstein (Australian Catholic University)

September 7, 2023, 3:00pm - 5:00pm
Dianoia Institute of Philosophy, Australian Catholic University

Mercy Lecture Theatre
17 Young Street
Fitzroy 3065

This event is available both online and in person.





Peter Park, Simon Goldstein, Aidan O'Gara, Michael Chen, and Dan Hendrycks, "AI Deception: A Survey of Examples, Risks, and Potential Solutions"

Abstract: This paper argues that current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some goal other than the truth. We first survey empirical examples of AI deception, discussing both general-purpose technologies such as large language models, and special-use AI systems built for specific competitive situations. Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI systems. Finally, we outline three potential solutions to the problems of AI deception: regulatory frameworks should treat deceptive AI systems as high risk, subject to robust risk assessment requirements; policymakers should implement bot-or-not laws; and policymakers should moreover prioritize the funding of technical research to enhance existing techniques to detect AI deception. Policymakers, researchers, and the broader public should work proactively to prevent AI deception from destabilizing the shared foundations of our society.

