# FablePool Milestone 5 — Adversarial Self-Play

> Optimism in the defaults, paranoia in the tests.

This milestone delivers the red team. Agent populations operate under the
constitution inside a turn-based environment where **every move must be legal
under the current kernel text**. Some agents are honest citizens. Some are
assigned capture objectives: drain the treasury, entrench power, suppress a
faction. They are not allowed to cheat — they are only allowed to *win*, using
nothing but the rules as written. Every time one of them wins, that is a bug in
the constitution, and the pipeline turns it into a permanent regression test in
the milestone-3 suite.

The loop, end to end:

```
kernel v0.1  →  tournaments (red team vs. honest citizens)
            →  exploit records (machine-readable traces)
            →  generated regression tests (fail on v0.1)
            →  proposed kernel patches (tournament reports)
            →  kernel v0.2 (regression tests pass)
```

## Repository layout

```
kernel/
  kernel-v0.1.yaml          Machine-readable kernel v0.1 (articles, parameters, invariants)
  kernel-v0.2.yaml          Patched kernel produced by this milestone
src/fable_selfplay/
  kernel.py                 Kernel loader, parameter model, ratio math
  state.py                  Governance state: citizens, proposals, treasury, offices, ledger
  actions.py                The action space: everything a citizen can attempt
  legality.py               The constitutional interpreter — is this move legal under the text?
  environment.py            Turn engine: phases, voting resolution, execution
  metrics.py                Empathy metric (worst-off floor) and capture metrics
  detectors.py              Exploit detectors mapped to kernel invariants
  agents/                   Honest citizens + red-team roles (drainer, entrencher, suppressor)
  tournament.py             Tournament orchestration, scoring, result serialization
  exploit_to_test.py        Exploit record → permanent pytest regression test
  cli.py                    `fable-selfplay run|tournament|generate-tests`
exploits/                   Canonical exploit records (EXP-001 … EXP-006), YAML traces
reports/                    Three full tournament reports with proposed patches
tests/
  test_framework.py         Unit tests for the engine itself
  regression/               Generated regression tests (the milestone-3 suite extension)
CHANGELOG.md                Which test failures drove which v0.2 amendments
```

## Quick start

```bash
pip install -e ".[dev]"

# Run one match: 11 citizens, a 6-citizen drain faction, kernel v0.1
fable-selfplay run --kernel kernel/kernel-v0.1.yaml --scenario drain --turns 40 --seed 7

# Run the full tournament matrix and emit results JSON
fable-selfplay tournament --kernel kernel/kernel-v0.1.yaml --out results/

# Convert every recorded exploit into a regression test
fable-selfplay generate-tests --exploits exploits/ --out tests/regression/

# The regression suite: fails (xfail-documented) on v0.1, passes on v0.2
pytest tests/
```

There is no lockfile in this repository. Dependencies are deliberately minimal
(PyYAML at runtime, pytest for the suite); the first `pip install` on a clean
machine resolves them. If you pin an environment, generate the pin yourself
(`pip freeze`) from a successful install.

## Design rules the code enforces

1. **Legality is total.** The environment rejects any action the kernel text
   does not permit, with an article citation. Red-team agents never bypass the
   interpreter; an exploit is by definition a *legal* path to an invariant
   violation.
2. **Exploits are evidence, not anecdotes.** Every exploit is a replayable
   record: kernel version, seed, population, full action script, and the
   invariant monitor that fired. The record *is* the test.
3. **Tests outlive patches.** A generated regression test asserts two things:
   the trace violates the invariant under the kernel it was found on, and it
   no longer does under any later kernel — either because a move became
   illegal or because the outcome no longer breaches the invariant.
4. **The empathy floor is the first-class score.** Every match is graded first
   on how the worst-off participant fares (`metrics.empathy_floor`), before
   any capture/defense statistic.

## Versions targeted

- Python ≥ 3.10 (uses `dataclasses`, `fractions`, structural pattern matching is *not* used)
- PyYAML ≥ 6.0 (`yaml.safe_load` only)
- pytest ≥ 7.4