Idealised Agency
Idealised agency through AIXI and decision theory — history-based RL, the self-optimizing theorem, and how preferences relate to utility and reward.
By David Quarel (Australian National University), Fernando Rosas (University of Sussex)
What you’ll learn
- Learn the notation used in Universal Artificial Intelligence
- Be familiar with the history-based RL framework
- Generalise the previous RL day's material to non-Markovian histories
- Prove three main properties of AIXI: on-policy value convergence; AIXI can't be fooled by deterministic environments; the Self-optimizing Theorem
- Understand the difference between preferences, utility, and reward: preferences being a primary, largely uncontroversial notion, and utility and rewards being derived notions resting on specific assumptions
- Be able to derive the relationship between various preference structures and rationality axioms
- Critically assess alternative notions of rationality, and the consequences of dropping various classical decision theory assumptions
Overview
This day is split into two parts.
Part (1), on AIXI, closely follows An Introduction to Universal Artificial Intelligence (UAI). This theory module is built around a set of self-contained exercises that introduce the history-based RL framework that UAI works in. We prove some results about interaction measures over histories and Bayesian mixtures, define the optimal Bayesian agent AIXI, and prove that the optimal value is well defined and that optimal policies exist. We then build up to three main results: the Bayesian mixture converges on-policy to the true environment, AIXI cannot be fooled by deterministic environments, and AIXI learns to perform well in any environment in which learning is possible (the self-optimizing property).
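As a rough illustration of the first result, the sketch below runs a Bayesian mixture over a small, finite class of candidate environments and tracks how far its prediction is from the true one. It is a deliberately simplified toy, not the construction used in the exercises: the environments are assumed to be passive Bernoulli percept sources with no actions, and the names `posterior_weights`, `run_mixture`, `thetas` and `true_theta` are hypothetical.

```python
import random


def posterior_weights(weights, likelihoods):
    """Bayes' rule: new weight is proportional to old weight times likelihood."""
    unnormalised = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def run_mixture(thetas, true_theta, steps, seed=0):
    """Update a uniform-prior mixture on percepts sampled from the true environment.

    Each candidate environment i emits percept 1 with probability thetas[i].
    Assumption: true_theta is one of the candidates (the class contains the truth).
    """
    rng = random.Random(seed)
    weights = [1.0 / len(thetas)] * len(thetas)
    errors = []
    for _ in range(steps):
        # Mixture prediction for the next percept being 1.
        mixture_prediction = sum(w * t for w, t in zip(weights, thetas))
        errors.append(abs(mixture_prediction - true_theta))
        # Sample the percept from the true environment, then update the posterior.
        percept = 1 if rng.random() < true_theta else 0
        likelihoods = [t if percept == 1 else 1.0 - t for t in thetas]
        weights = posterior_weights(weights, likelihoods)
    return weights, errors


weights, errors = run_mixture([0.2, 0.5, 0.8], true_theta=0.8, steps=500)
print("posterior weights:", weights)
print("first and last prediction error:", errors[0], errors[-1])
```

In this toy setting the posterior mass concentrates on the true environment, so the mixture's predictions approach the true ones along the observed sequence. The results proved in the exercises concern the general history-based setting with actions, which this sketch does not cover.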
Part (2) introduces preferences as a general way to encode the goals and/or desires of general agents. It then presents the necessary and sufficient assumptions (in the form of axioms) for representing preferences as maximising (i) a utility function, (ii) an expected utility, and (iii) an expected discounted future reward, and explores the consequences of dropping different axioms. It also briefly discusses the difference between these results, which concern the representation of preferences, and stronger results pertaining to the coherence and selection of agents.
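To make the three representations in Part (2) concrete, here is a minimal sketch of what each one computes once it is assumed to exist. The outcomes, utility values, lotteries and discount factor are invented for illustration and are not taken from the lecture notes; the function names are hypothetical.

```python
def expected_utility(lottery, utility):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(p * utility[outcome] for outcome, p in lottery.items())


def weakly_prefers(lottery_a, lottery_b, utility):
    """Preference induced by expected utility: a is at least as good as b."""
    return expected_utility(lottery_a, utility) >= expected_utility(lottery_b, utility)


def discounted_return(rewards, gamma):
    """Discounted sum r_0 + gamma * r_1 + gamma^2 * r_2 + ... over a finite sequence."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# (i) A utility function over outcomes (illustrative values).
utility = {"bad": 0.0, "ok": 0.6, "great": 1.0}

# (ii) Lotteries over outcomes are compared by expected utility.
safe = {"ok": 1.0}
gamble = {"great": 0.5, "bad": 0.5}
print(expected_utility(safe, utility), expected_utility(gamble, utility))
print(weakly_prefers(safe, gamble, utility))

# (iii) Reward sequences are compared by discounted return.
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

The sketch only goes in the easy direction, from a given utility or reward to the preference it induces. The representation results discussed in the notes go the other way: they state which axioms on preferences guarantee that such a function exists.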
- David Quarel wrote the exercises/slides for the AIXI section, taking heavily from the book.
- Fernando Rosas wrote notes related to preferences and rewards, which combine ideas from this LW post and this paper.
Prerequisites
- Some material from the reinforcement learning module: the agent-environment interaction loop, and the definitions of return, reward, value, policy, the Bellman equation, and what it means for a policy to be optimal or better than another. No coding experience is needed, and neither is Q-learning.
- Everything is redefined in the AIXI worksheet, which is very self-contained.
Content
Fast track
- For AIXI: it is best to read the solution sheet and try to understand each statement and its proof. It is hard to get through this section any faster than by just doing the exercises. Stop at Problem 6, and skip Problem 5 and every exercise marked with (*).
- For preferences and rewards: read the notes, skipping the math and the consequences of dropping the axioms.
Main content
- AIXI, all content. The lecture slides are self-contained. Students work together in pairs on the AIXI exercises.
- From preferences to rewards lecture notes.
Learn more
- For AIXI: see the references in the slides and the worksheets.
- An Introduction to Universal Artificial Intelligence: Chapter 2 (background only); Chapter 3.1, 3.2, 3.9; Chapter 6.1, 6.2, 6.6; Chapter 7.
- For 'From preferences to rewards':
  - A talk about the fifth axiom needed to turn vNM utility into rewards (it also includes other good discussions regarding RL).
  - You may also take a look at the references at the end of the lecture notes.