Poster in Workshop: Workshop on Machine Learning Safety
Towards Defining Deception in Structural Causal Games
Francis Ward
Abstract:
Deceptive agents are a challenge for the safety, trustworthiness, and cooperation of AI systems. We focus on the problem that agents might deceive in order to achieve their goals. There are a number of existing definitions of deception in the literature on game theory and symbolic AI, but there is no overarching theory of deception for learning agents in games. We introduce a functional definition of deception in structural causal games, grounded in the philosophical literature. We present several examples to establish that our formal definition captures philosophical and commonsense desiderata for deception.