Poster in Workshop: Socially Responsible Language Modelling Research (SoLaR)
Measuring AI Agent Autonomy: Towards a Scalable Approach With Code Inspection
Merlin Stein · Peter Cihon · Gagan Bansal · Sam Manning
Keywords: [ code review ] [ AutoGen ] [ autonomy ] [ assessment ] [ code inspection ] [ AI systems ] [ AI agents ] [ evaluation ]
AI agents are systems that can achieve complex goals autonomously. Assessing an AI agent's level of autonomy is key to understanding its benefits and risks. Current assessments of autonomy largely focus on specific risks and use run-time evaluations. Inspired by practices in other industries, we articulate levels of agent autonomy. We introduce a code-based assessment of autonomy, wherein the orchestration code used to run an AI system is scored according to a taxonomy that assesses attributes of autonomy: impact and oversight. We demonstrate this approach with the AutoGen framework and select applications.
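To illustrate the general idea of a code-based assessment, the sketch below statically inspects a snippet of AutoGen orchestration code and assigns illustrative scores along the two attribute families named in the abstract, oversight and impact. The specific scoring rules (mapping human_input_mode to an oversight level, counting enabled code execution as impact) are assumptions for demonstration only and are not the paper's actual taxonomy or rubric.

```python
# A minimal sketch (not the paper's rubric): statically inspect AutoGen
# orchestration code and assign hypothetical oversight/impact scores based
# on configuration keywords such as human_input_mode and code_execution_config.
import ast

# Hypothetical mapping: more frequent human input -> more oversight.
OVERSIGHT_SCORES = {"ALWAYS": 1, "TERMINATE": 2, "NEVER": 3}


def score_orchestration(source: str) -> dict:
    """Return illustrative autonomy-related scores for orchestration source code."""
    tree = ast.parse(source)
    scores = {"oversight": None, "impact": 0}
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # Oversight attribute: how often a human is consulted.
            if kw.arg == "human_input_mode" and isinstance(kw.value, ast.Constant):
                scores["oversight"] = OVERSIGHT_SCORES.get(kw.value.value)
            # Impact attribute (illustrative): agents permitted to execute code.
            if kw.arg == "code_execution_config" and not (
                isinstance(kw.value, ast.Constant) and kw.value.value is False
            ):
                scores["impact"] += 1
    return scores


example = '''
import autogen
assistant = autogen.AssistantAgent("assistant")
user_proxy = autogen.UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "tasks"},
)
user_proxy.initiate_chat(assistant, message="Summarize the latest results.")
'''

print(score_orchestration(example))  # e.g. {'oversight': 3, 'impact': 1}
```

Because the inspection runs on the orchestration code itself rather than on agent behavior at run time, it can in principle be applied at scale across repositories, which is the scalability motivation stated in the title.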