Modular, general-purpose generative architectures for intelligent behaviour.
This page outlines the architecture and inference procedures used in our neuro-inspired active inference agent, developed as part of our work on general-purpose intelligent systems. The agent is based on predictive coding, variational message passing, and the free energy principle.
The agent is structured as a 7-module generative model, with components handling perception, dynamics, action selection, and learning. The latent state evolves according to:
\[ x_{t+1} = f(x_t, a_t) + \omega_t, \quad y_t = g(x_t) + \epsilon_t \]
where \( x_t \) is the hidden state, \( a_t \) the action, \( y_t \) the sensory input, and \( \omega_t, \epsilon_t \) are noise terms.
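The generative process can be simulated directly. The following is a minimal sketch assuming linear choices for \( f \) and \( g \) and Gaussian noise; the matrices `A`, `B`, `C` and the noise scales are illustrative, not part of the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative linear choices for f and g; the page does not specify them.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])     # state transition
B = np.array([[0.0],
              [0.1]])          # effect of the action on the state
C = np.array([[1.0, 0.0]])     # observation map

def f(x, a):
    return A @ x + B @ a

def g(x):
    return C @ x

def step(x, a, sigma_w=0.01, sigma_e=0.05):
    """One step of the generative process:
    x_{t+1} = f(x_t, a_t) + w_t,   y_t = g(x_t) + e_t."""
    y = g(x) + sigma_e * rng.standard_normal(1)
    x_next = f(x, a) + sigma_w * rng.standard_normal(x.shape)
    return x_next, y

x = np.zeros(2)
for t in range(5):
    x, y = step(x, np.array([1.0]))
```

Any nonlinear \( f \) and \( g \) (e.g. small neural networks) drop into the same interface.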
The agent selects actions and updates beliefs to minimise variational free energy:
\[ \mathcal{F}(q) = \mathbb{E}_q [\log q(x) - \log p(x, y)] \]
Action is selected by planning over expected free energy (EFE):
\[ G(a) = \mathbb{E}_q \left[ D_{KL}[q(x) \| p(x)] - \mathbb{E}_{q(x)}[\log p(y \mid x)] \right] \]
Each timestep consists of: (1) receiving a new observation \( y_t \); (2) updating the posterior belief over the hidden state \( x_t \) by minimising \( \mathcal{F} \); (3) scoring candidate actions by their expected free energy \( G(a) \); and (4) executing the action with the lowest \( G \).
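For discrete hidden states and observations, both quantities reduce to sums over categorical distributions. The sketch below uses the common risk-plus-ambiguity decomposition of the EFE; the `likelihood` matrix, the prior, and the two candidate beliefs are toy assumptions, not the agent's actual model:

```python
import numpy as np

def kl(q, p):
    """KL divergence between two discrete distributions."""
    return float(np.sum(q * (np.log(q) - np.log(p))))

def variational_free_energy(q, p_joint):
    """F(q) = E_q[log q - log p(state, observed data)].
    `p_joint` is p(state, y = observed) as a vector over states."""
    return float(np.sum(q * (np.log(q) - np.log(p_joint))))

def expected_free_energy(q, prior, likelihood):
    """One common discrete form of G(a): risk + ambiguity.
    `likelihood[o, s]` = p(observation o | state s)."""
    risk = kl(q, prior)  # divergence from prior / preferred states
    # ambiguity: expected entropy of the observation likelihood under q
    ambiguity = -float(np.sum(q * np.sum(likelihood * np.log(likelihood), axis=0)))
    return risk + ambiguity

# toy example: two states, two observations, one predicted belief per action
likelihood = np.array([[0.9, 0.2],
                       [0.1, 0.8]])          # columns: p(o | s)
prior = np.array([0.5, 0.5])                 # preferred state distribution
beliefs = {"a1": np.array([0.8, 0.2]),       # predicted q(state | action)
           "a2": np.array([0.5, 0.5])}
scores = {a: expected_free_energy(q, prior, likelihood) for a, q in beliefs.items()}
best_action = min(scores, key=scores.get)
```

Selecting `min(scores, key=scores.get)` is the "act to minimise expected free energy" step of the loop.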
The Pong agent plays a 2D paddle game using an active inference loop. It observes a partial and noisy visual input of the ball and paddle positions, and uses belief updates and action selection to intercept the ball.
The agent maintains a low-rank Laplace posterior over the state variables and updates its beliefs online during gameplay.
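One way such a belief could be represented is a Gaussian whose precision is a diagonal plus a low-rank factor, updated with preconditioned gradient steps. This is a sketch under that assumption; the parameterisation \( P = D + UU^\top \) and all names below are illustrative, not the project's actual code:

```python
import numpy as np

class LowRankLaplace:
    """Sketch of a low-rank Laplace posterior q(x) = N(mu, P^{-1})
    with precision P = D + U U^T (D diagonal, U of rank r)."""

    def __init__(self, dim, rank, seed=0):
        rng = np.random.default_rng(seed)
        self.mu = np.zeros(dim)                       # posterior mean
        self.d = np.ones(dim)                         # diagonal of D
        self.U = 0.1 * rng.standard_normal((dim, rank))

    def solve(self, v):
        """Compute P^{-1} v via the Woodbury identity:
        only a rank x rank linear system is solved, never dim x dim."""
        Dinv_v = v / self.d
        Dinv_U = self.U / self.d[:, None]
        S = np.eye(self.U.shape[1]) + self.U.T @ Dinv_U
        return Dinv_v - Dinv_U @ np.linalg.solve(S, self.U.T @ Dinv_v)

    def step(self, grad, lr=0.5):
        """Online belief update: preconditioned descent on the free energy,
        mu <- mu - lr * P^{-1} grad."""
        self.mu = self.mu - lr * self.solve(grad)

belief = LowRankLaplace(dim=16, rank=3)
grad = np.ones(16)   # placeholder free-energy gradient w.r.t. the mean
belief.step(grad)
```

The Woodbury solve keeps each online update at O(dim · rank²) cost, which is what makes per-frame belief updates during gameplay cheap.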
This agent navigates a 2D or 3D environment (a gridworld or a room maze) toward a target using visual or proprioceptive cues. Its belief update and action selection loop couples online updates to the posterior over latent states with forward rollouts of the dynamics model.
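The rollout part of the loop can be sketched as a short receding-horizon search: enumerate action sequences, roll the believed dynamics forward, and keep the sequence whose final state best matches the target. The gridworld dynamics and the squared-distance score (a stand-in for expected free energy) are assumptions for illustration:

```python
from itertools import product

# Illustrative deterministic gridworld dynamics.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def rollout(state, seq):
    """Roll the dynamics model forward through a sequence of actions."""
    x, y = state
    for a in seq:
        dx, dy = ACTIONS[a]
        x, y = x + dx, y + dy
    return (x, y)

def plan(state, target, horizon=3):
    """Exhaustive rollout over all action sequences of length `horizon`;
    squared distance to the target stands in for expected free energy."""
    best_seq, best_cost = None, float("inf")
    for seq in product(ACTIONS, repeat=horizon):
        fx, fy = rollout(state, seq)
        cost = (fx - target[0]) ** 2 + (fy - target[1]) ** 2
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq

best = plan((0, 0), (2, 0), horizon=2)
```

In the full loop only the first action of the best sequence is executed; the posterior is then updated from the next observation and planning repeats.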