Interactive demo
This demo shows a simple active inference agent whose behaviour emerges from the negotiation of multiple concurrent objectives rather than a single controller. Safety, goal-seeking, exploration, and energy maintenance each run in parallel and propose candidate actions, which a polyphonic coordination layer then integrates into a single negotiated move.
Tip: if the video doesn’t autoplay on mobile, tap once — it’s playsinline.
Each voice represents a distinct objective with its own short-horizon evaluation of possible futures. Rather than collapsing everything into a single fixed reward, the agent maintains these pressures explicitly and integrates them online.
At each step the agent updates beliefs about hidden structure in the world, rolls imagined futures forward using an internal transition model, and selects actions that minimise an expected cost or surprise functional.
What makes this demo different is that these computations are distributed across several concurrent objectives rather than folded into one monolithic controller.
Per voice: rollout → score imagined futures → convert to \(q_k(a)\) → integrate into the negotiated policy.
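The per-voice pipeline above can be sketched as follows. This is a minimal illustration, not the demo's actual code: the `transition` and `cost` callables stand in for the internal transition model and one voice's cost functional, and the softmax conversion, fixed horizon, and temperature are all assumptions made for the sketch.

```python
import numpy as np

def voice_action_distribution(state, actions, transition, cost,
                              horizon=3, temperature=1.0):
    """One voice's pipeline: roll each candidate action forward with the
    transition model, accumulate this voice's cost over the imagined
    future, then soften the scores into an action distribution q_k(a)."""
    scores = np.empty(len(actions))
    for i, a in enumerate(actions):
        s, total = state, 0.0
        for _ in range(horizon):
            s = transition(s, a)   # imagined next state
            total += cost(s)       # this voice's running cost
        scores[i] = total
    # lower imagined cost -> higher probability (softmax over negative cost)
    logits = -scores / temperature
    logits -= logits.max()         # numerical stability
    q = np.exp(logits)
    return q / q.sum()
```

A safety voice would plug in a threat-weighted `cost`, a goal voice a distance-to-goal cost, and so on; each voice runs this same pipeline with its own functional.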
Each voice proposes an action distribution \(q_k(a)\). A coordination layer then assigns mixture weights \(\pi_k\) according to current context, such as local threat, distance to goal, uncertainty, and battery urgency. The final negotiated action posterior is:
\[ q(a) \;=\; \sum_{k=1}^{K}\pi_k\,q_k(a) \]
In other words, the chosen action is a weighted synthesis of several simultaneously active objectives.
The weights evolve online rather than being fixed in advance, allowing the agent to shift smoothly between caution, goal pursuit, exploration, and homeostatic regulation as the situation changes.
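In code, the negotiated posterior is just a context-weighted mixture of the per-voice distributions. A minimal sketch, with the source of the weights (some function of threat, goal distance, uncertainty, battery urgency) left abstract and the function name hypothetical:

```python
import numpy as np

def negotiated_policy(voice_dists, context_weights):
    """Blend per-voice action distributions q_k(a) with mixture weights
    pi_k: q(a) = sum_k pi_k * q_k(a). Weights are renormalised so they
    form a distribution even if the context scores are unnormalised."""
    pi = np.asarray(context_weights, dtype=float)
    pi = pi / pi.sum()
    Q = np.asarray(voice_dists, dtype=float)  # shape (K, num_actions)
    return pi @ Q                             # weighted synthesis q(a)
```

Because the weights are recomputed at every step, a spike in local threat can shift mass toward the safety voice without any change to the voices themselves.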
Each voice evaluates short imagined futures using a cost functional that captures its own priorities. In active inference terms, this resembles expected free energy in that it trades off pragmatic value against information-seeking pressure.
\[ G(\pi) \approx \underbrace{\mathbb{E}[\text{risk / cost}]}_{\text{pragmatic}} \;-\; \underbrace{\mathbb{E}[\text{information gain}]}_{\text{epistemic}} \]
Different voices place different emphasis on these terms, giving each one a distinct behavioural tendency.
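One concrete reading of that trade-off, taking epistemic value as the expected reduction in belief entropy, can be sketched as below. Both this choice of information-gain measure and the `epistemic_weight` knob are illustrative assumptions, not the demo's exact functional.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete belief, ignoring zero-mass states."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_free_energy(expected_cost, prior_belief, posterior_belief,
                         epistemic_weight=1.0):
    """G ~ pragmatic cost minus weighted epistemic value, mirroring
    G(pi) = E[risk/cost] - E[information gain]. Information gain is
    taken here as the drop in belief entropy under the imagined policy."""
    info_gain = entropy(prior_belief) - entropy(posterior_belief)
    return expected_cost - epistemic_weight * info_gain
```

Raising `epistemic_weight` yields a more exploratory voice; lowering it, a more narrowly pragmatic one, which is one simple way to give each voice its distinct behavioural tendency.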
The exact implementation here is intentionally simple. The point of the demo is not formal completeness, but to show how multiple active objectives can be kept explicit and negotiated online.
Real agents do not optimise a single objective in isolation. They balance safety, goals, uncertainty, energy, and other constraints across multiple time-scales. Polyphonic intelligence treats this multiplicity as a core architectural feature rather than an afterthought.
The result is behaviour that is often more interpretable, more adaptable to changing context, and less brittle than systems built around a single dominant controller. Because each voice remains visible, the agent’s decisions can be inspected in terms of which pressures were active and how they were resolved.