Iliad Intensive Curriculum

Overview

This module collects what is useful to know before engaging with the materials. It points to writings that inform a background worldview (why AI matters, and safety risks) and then lists the technical prerequisites: deep learning, linear algebra, calculus, probability & statistics, information theory, and some theoretical computer science. General mathematical maturity is also very valuable.

We put a star (*) and boldface on content that we think is particularly important to understand. We are aware that our participants have different backgrounds, that this is a lot of material, and that it may not be feasible to prepare all of it!

Background worldview and assumptions

The references on background worldview and assumptions are very informative for understanding the motivation behind the course. However, they are less important for understanding its technical content.
Note that this section is on the speculative side: Working on AI alignment is important precisely because of assumptions and arguments about the future of AI. We can’t know the future of AI, and so all of this is inherently uncertain.

Why AI matters

Here, we simply argue that AI should concern us now, irrespective of any worldview on whether the outcomes are likely to be good or bad. Essentially, the claim is that the impact of AI might be enormous, potentially quite soon.

Intelligence gives rise to power, which may transform the world radically:
- Cognitive Superpowers* by Nick Bostrom argues for the position that intelligence can give rise to immense power. This power can then reshape the world radically, in the same way that human intelligence has shaped the world.
- Machines of Loving Grace by Dario Amodei details the effects on biology and health, economics, and other areas of life that he expects from powerful AI shortly after it is developed.
Timelines to human-level intelligence may be short:
- Measuring AI ability to complete long tasks* shows that the complexity of tasks that AI can accomplish doubles every few months (where “complexity” is measured as the time it takes humans to accomplish those tasks).
- Technical trends driving AI progress from Bluedot’s AI strategy course
- Metaculus Forecast of general AI systems
- Thousands of AI Authors on the Future of AI
- Forecasting transformative AI with biological anchors
Once sufficiently high AI capabilities are reached, an intelligence explosion may follow, amplifying the first two concerns:
- Notably, an appendix to If Anyone Builds It Everyone Dies* argues that a slow take-off from human-level to vastly superhuman AI, even one that includes many warning shots, may not in itself be helpful, which calls into question the importance of considering the effects of an intelligence explosion.
- Will AI R&D Automation Cause a Software Intelligence Explosion?
- Intelligence Explosion in Bluedot’s AGI strategy course
- AI 2027 Takeoff Forecast
- Intelligence Explosion Microeconomics

AI misalignment

Having established that the impact of AI might soon be enormous, we now specifically turn to the risks. We start by discussing AI misalignment.
One operationalization of AI misalignment is the concern that AI systems may not do what their developers want them to do, with potentially catastrophic outcomes for very advanced AI systems.

“Building AI Safely is hard” in Bluedot’s AI Alignment course*
Scalable Oversight: Read sections 1 and 2 to get a general overview of the problem of supervising AI systems that are smarter than the overseers.

Non-misalignment AI safety concerns

We now briefly discuss a spectrum of safety concerns that manifest even if we know how to steer AI systems effectively toward a given set of goals.

Individual people may misuse AI in catastrophic ways:
- Sections 2.1-2.3 in An Overview of Catastrophic AI Risks* argue that AI enables catastrophic misuse, for example through bioterrorism, unleashing AI agents, and persuasive AIs. Misuse risk is particularly relevant to our course since it can also manifest as a misalignment concern: an AI that assists human users in carrying out such misuse is often misaligned with its developer.
AI can give rise to global totalitarianism:
- Section 2.4 of the same paper argues that AI could enable a concentration of power, leading to global totalitarianism in the worst case.
We may get gradually disempowered even if there is alignment:
- Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development argues that humans may be gradually disempowered, potentially leading to catastrophic outcomes, even if the alignment problem is technically solved.

→ You may also find it useful to read the risk decomposition from the international report on safe AI

Agent Foundations Background

In the Iliad Intensive, we will also have sections on agent foundations, where we discuss AI from a more “idealized” perspective: we take intelligence, rationality, or optimization processes to a theoretical limit and analyze the consequences. This viewpoint also attempts to describe more formally, in mathematical terms, what agents and goals are.
Useful readings:

Technical prerequisites

Engineering

Bring your laptop*: Some days involve coding.
Take a look at the engineering prerequisites in the ARENA materials.* Most relevant:
- Python
- PyTorch
- Basic coding skills
- Einops and einsum for basic tensor operations
Have access to an LLM that can help you, ideally on a paid plan. For coding specifically, Claude via Claude Code and GPT via Codex are popular choices.

Deep Learning

Understand all of the following*:
- Loss functions, including the cross-entropy loss and squared error.
- Backpropagation
- (Stochastic) gradient descent (SGD)
- Examples of neural networks and components:
  - ReLU, Softmax activation functions
  - Multi-layer perceptrons
  - The most successful neural network architecture is the transformer. Understand the inputs and outputs of a transformer, and how it is trained via next-token prediction and supervised finetuning.
→ The neural network section in ARENA’s prerequisites teaches much of this!
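
As a quick self-check for the loss functions and activation functions listed above, here is a minimal sketch in plain Python (deliberately without PyTorch). The function names and the example logits are illustrative choices, not taken from the ARENA materials.

```python
import math

def relu(z):
    """ReLU activation: pass positive inputs through, clamp negatives to zero."""
    return max(0.0, z)

def softmax(logits):
    """Map a list of real-valued logits to a probability distribution."""
    m = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Cross-entropy loss for one example: -log of the probability of the true class."""
    return -math.log(probs[target_index])

def squared_error(prediction, target):
    """Squared error for one scalar prediction."""
    return (prediction - target) ** 2

logits = [2.0, 1.0, 0.1]        # hypothetical network outputs for 3 classes
probs = softmax(logits)          # non-negative and sums to 1
loss = cross_entropy(probs, 0)   # small when class 0 gets high probability
```

In next-token prediction, the same cross-entropy loss is applied at every position, with the vocabulary playing the role of the classes.
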
Understand all of the following concepts*:
- Architecture, weights, parameterization, activations;
- Training set, validation set, test set;
- Hyperparameters;
- The concept of an optimizer (SGD is an example; other examples are Adam or RMSProp);
- Overfitting, underfitting.
→ An LLM of your choice can probably explain all these concepts well!
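
To make the notions of weights, hyperparameters, and an optimizer concrete, the following is a minimal sketch of stochastic gradient descent fitting a single weight. The toy data, the learning rate, and the name `sgd_fit` are assumptions made for illustration only.

```python
import random

def sgd_fit(data, lr=0.05, epochs=200, seed=0):
    """Fit y ≈ w * x by SGD on the squared error (w * x - y)^2.

    lr and epochs are hyperparameters; w is the only trainable weight.
    """
    rng = random.Random(seed)
    data = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)                 # visit examples in random order
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # derivative of the loss w.r.t. w
            w -= lr * grad                # gradient descent update
    return w

# Toy training set generated from the "true" rule y = 2x.
train = [(x, 2.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
w = sgd_fit(train)
```

Optimizers such as Adam or RMSProp replace the line `w -= lr * grad` with an update that also depends on statistics of past gradients.
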
Gain a basic understanding of the loss landscape and training dynamics:
- Evan Hubinger’s talk on AGI safety* is an introduction to safety problems based on a modern, intuitive understanding of deep learning. The talk introduces many basic intuitions about training dynamics and the loss landscape, including an intuition for the parameter-function map.
- Momentum
- Scaling laws
- You Are What You Eat: Motivation behind singular learning theory and developmental interpretability for AI Safety
Reinforcement learning:
- Reinforcement Learning from Human Feedback*: heavily used finetuning method for frontier models
- The notions of a reward function in a Markov decision process (MDP), a policy, and the return
- Value functions, Bellman equations, and optimal policies
→ Sutton and Barto’s book on reinforcement learning is an excellent introduction to general RL.
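
As an illustration of value functions, the Bellman optimality equation, and optimal policies, here is a minimal value-iteration sketch on a hypothetical two-state MDP. The dynamics, rewards, and discount factor are invented for this example and do not come from Sutton and Barto.

```python
GAMMA = 0.9                  # discount factor (assumed for this toy example)
STATES = [0, 1]
ACTIONS = ["stay", "move"]

def step(state, action):
    """Deterministic toy dynamics: return (next_state, reward)."""
    next_state = state if action == "stay" else 1 - state
    reward = 1.0 if (state == 1 and action == "stay") else 0.0
    return next_state, reward

def q_value(values, state, action):
    """One-step lookahead: immediate reward plus discounted value of the next state."""
    next_state, reward = step(state, action)
    return reward + GAMMA * values[next_state]

def value_iteration(tol=1e-10):
    """Repeatedly apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {s: max(q_value(values, s, a) for a in ACTIONS) for s in STATES}
        if max(abs(new_values[s] - values[s]) for s in STATES) < tol:
            return new_values
        values = new_values

values = value_iteration()
# The optimal policy acts greedily with respect to the optimal values.
policy = {s: max(ACTIONS, key=lambda a: q_value(values, s, a)) for s in STATES}
```

Here the optimal policy moves from state 0 to state 1 and then stays there, collecting reward 1 per step, so the optimal values approach 9 and 10.
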

Linear Algebra

Make sure you understand all of the following*:

Vectors, matrices, rank, null spaces, rank-nullity theorem, orthogonality, invertibility;
Positive definiteness, eigenvalues, spectral decomposition;
Singular values, singular value decomposition (SVD).

→ Many of these topics are covered in the linear algebra prerequisites in the ARENA material.
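
For reference, the central statements behind the list above can be summarized as follows. This is a compact reminder rather than a full treatment: we assume A is a real m × n matrix, S is a real symmetric n × n matrix, Q, U, and V are orthogonal, Σ is an m × n matrix that is zero off the diagonal and has non-negative diagonal entries, and eigenvalues and singular values are sorted in decreasing order.

```latex
% Rank-nullity theorem for A in R^{m x n}
\operatorname{rank}(A) + \dim \ker(A) = n

% Spectral decomposition of a symmetric matrix S
S = Q \Lambda Q^{\top}, \qquad
\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)

% Positive definiteness in terms of eigenvalues
S \text{ is positive definite} \iff \lambda_i > 0 \text{ for all } i

% Singular value decomposition and singular values, i = 1, ..., min(m, n)
A = U \Sigma V^{\top}, \qquad
\sigma_i(A) = \sqrt{\lambda_i\!\left(A^{\top} A\right)}
```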

Calculus

Get comfortable with all of the following*:

Computing basic limits, derivatives, and integrals;
Partial and directional derivatives, gradients, Jacobians, and the chain rule in multiple dimensions;
Hessian, second-order Taylor expansion and remainder;
Integration: Multivariate integrals, volume in R^d, change of variables;
Understand O-notation and o-notation.

→ See ARENA’s section on calculus prerequisites, which teaches much of this content.
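
As a reminder of how several of these items fit together, the multivariate chain rule and the second-order Taylor expansion can be written as below. We assume that f and g are differentiable maps between Euclidean spaces for the chain rule, and that f maps R^d to R and is twice continuously differentiable for the Taylor expansion; J denotes the Jacobian.

```latex
% Chain rule: the Jacobian of a composition is the product of Jacobians
J_{g \circ f}(x) = J_g\big(f(x)\big)\, J_f(x)

% Second-order Taylor expansion with gradient, Hessian, and little-o remainder
f(x + h) = f(x) + \nabla f(x)^{\top} h
         + \tfrac{1}{2}\, h^{\top} \nabla^2 f(x)\, h
         + o\!\left(\lVert h \rVert^{2}\right)
\quad \text{as } h \to 0
```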

For one module, the implicit function theorem will be relevant.

Probability & Statistics

Understand all of the following*:

Basic probability theory, notation for conditional probabilities and joint probabilities (or densities), Bayes rule, probability simplex;
Expectation, variance, moments, independence, the law of large numbers;
Multivariate normal distributions.

→ Again, you may take a look at ARENA’s prerequisites on probability and statistics.*
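
To check your understanding of conditional probabilities and Bayes rule, here is a minimal sketch for a finite set of hypotheses. The coin example and all of its numbers are made up for illustration.

```python
def posterior(prior, likelihood):
    """Bayes rule over a finite hypothesis set.

    prior[h] is P(h); likelihood[h] is P(data | h).
    Returns P(h | data) for every hypothesis h.
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}  # P(h, data)
    evidence = sum(joint.values())                        # P(data)
    return {h: joint[h] / evidence for h in joint}

# Hypothetical setup: a coin is either fair or lands heads with probability 0.9.
prior = {"fair": 0.5, "biased": 0.5}
# Observed data: three heads in a row.
likelihood = {"fair": 0.5 ** 3, "biased": 0.9 ** 3}
post = posterior(prior, likelihood)
```

The posterior is a point on the probability simplex: its entries are non-negative and sum to one.
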

The following concepts are also useful to know:

Bayesian statistics: the concepts of likelihood, posterior distribution, partition function, and Bayesian free energy; see here, Chapter 1.
Bayesian networks
Causality – a Brief Introduction
Markov chains, row-stochastic matrices, hidden Markov models (HMMs)
Measure theory
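
For the item on Markov chains and row-stochastic matrices above, the following minimal sketch may help. It propagates a distribution through a hypothetical two-state chain until it settles at the stationary distribution; the transition probabilities are invented for this example.

```python
def step_distribution(dist, P):
    """One step of the chain: multiply the row vector dist by the matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Row-stochastic transition matrix: P[i][j] = P(next = j | current = i),
# so every row sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]

dist = [1.0, 0.0]  # start in state 0 with certainty
for _ in range(200):
    dist = step_distribution(dist, P)
# dist now approximates the stationary distribution pi, which satisfies pi P = pi.
```

For this particular matrix, solving pi P = pi by hand gives pi = (5/6, 1/6), which the iteration approaches.
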

Information theory

Take a look at ARENA’s recommendations for information theory* and gain an intuitive understanding of entropy, mutual information, Kullback-Leibler (KL) divergence, and cross-entropy.
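
A minimal sketch of these four quantities for finite distributions, measured in bits, is given below. The function names are illustrative, and the code assumes that q assigns positive probability wherever p does.

```python
import math

def entropy(p):
    """H(p) = -sum_i p_i log2 p_i, with the convention 0 log 0 = 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i log2 q_i."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i log2(p_i / q_i)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_information(joint):
    """I(X; Y) as the KL divergence between the joint and the product of marginals.

    joint[x][y] is the joint probability P(X = x, Y = y).
    """
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

p = [0.5, 0.25, 0.25]
q = [0.4, 0.4, 0.2]
```

A useful identity to verify with this code is H(p, q) = H(p) + KL(p || q), which explains why minimizing the cross-entropy loss in q is the same as minimizing the KL divergence from p to q.
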

Furthermore, it may be useful to understand the following concepts from information theory:

Lossless compression:
- Uniquely decodable codes
- Shannon-Fano code
- Shannon’s source coding theorem
Communication over noisy channels:
- Channel capacity
- Channel coding theorem
Lossy compression: Rate-distortion theory

→ Elements of Information Theory by Cover and Thomas introduces all of these concepts.
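
To connect lossless compression to entropy, here is a minimal sketch that assigns each symbol a codeword length of ⌈log2(1/p)⌉ bits (often called Shannon code lengths) and checks the resulting bounds numerically. It only computes lengths and does not construct the codewords; the function name and the example distribution are illustrative.

```python
import math

def shannon_code_lengths(p):
    """Codeword length ceil(log2(1 / p_i)) for each symbol probability p_i > 0."""
    return [math.ceil(-math.log2(pi)) for pi in p]

def summarize(p):
    """Return (lengths, Kraft sum, expected length, entropy in bits)."""
    lengths = shannon_code_lengths(p)
    kraft_sum = sum(2.0 ** -l for l in lengths)
    expected_length = sum(pi * l for pi, l in zip(p, lengths))
    entropy = -sum(pi * math.log2(pi) for pi in p)
    return lengths, kraft_sum, expected_length, entropy

lengths, kraft_sum, expected_length, entropy = summarize([0.5, 0.25, 0.125, 0.125])
```

Because these lengths satisfy the Kraft inequality (the Kraft sum is at most 1), a prefix code with these lengths exists, and its expected length L satisfies H ≤ L < H + 1, where H is the entropy of the source.
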

Theoretical computer science

A classical source that covers most of the following topics is Sipser’s Introduction to the Theory of Computation. For some topics, like Solomonoff induction, we link separate texts.

Computability Theory
- Turing machines*
- Church-Turing thesis*: All algorithms can be represented with a Turing machine. That is, Turing machines are a universal model of computation.
  - This is used to avoid constructing Turing machines explicitly: Whenever we can describe an algorithm, we can simply claim the existence of a corresponding Turing machine.
- Kolmogorov complexity, also called descriptive complexity in Sipser’s book.*
- An Intuitive Explanation of Solomonoff Induction*
  - You should understand that Solomonoff induction is a universal learning algorithm that considers all computable hypotheses and weighs them by a simplicity prior. Also, it is optimal in some technical sense as long as the true universe is computable, too.
  - For a more technical and precise introduction, see here.
- Non-deterministic Turing machines
Computational Complexity Theory
- Basic complexity classes
  - P
  - NP
  - PSPACE
- Reduction. In Sipser’s book, this can be understood by reading:
  - Chapter 5.3: Mapping reducibility
  - Chapter 7.4 on NP-completeness discusses polynomial-time reductions

Formal logic is not covered sufficiently in Sipser’s book. Instead, look at:

Chapter 2 in The Logic of Provability

Miscellaneous

Statistical mechanics: For some sections on physics-inspired deep learning theory and natural abstractions it can be helpful to have a basic understanding of statistical mechanics.