# Tournament T1 — Baseline Capture Sweep: Treasury & Vote Timing

**Kernel under test:** `kernel/kernel-v0.1.yaml` (unpatched)
**Episodes:** 500 · **Seed base:** 41000 · **Max turns/episode:** 120
**Framework:** `fable-selfplay` v0.2.0 · all episodes deterministic and replayable

## 1. Configuration

| Roster | Composition | Episodes | Purpose |
|---|---|---|---|
| A (control) | `honest:10` | 100 | empathy/welfare baseline, zero adversaries |
| B | `honest:8,drainer:2` | 200 | treasury-capture pressure |
| C | `honest:8,suppressor:2` | 100 | vote-timing / participation pressure |
| D | `honest:7,drainer:2,entrencher:1` | 100 | mixed-objective coalition pressure |

Reproduce roster B:

```
fable-selfplay run --kernel kernel/kernel-v0.1.yaml --episodes 200 --seed 41100 \
  --roster honest:8,drainer:2 --max-turns 120 --name T1-B \
  --out reports/data/T1-B.json --exploit-dir exploits/
```

## 2. Aggregate results

All scenarios are graded **first** on how the worst-off participant fares
(`empathy_min` = minimum final welfare across the population, normalized 0–1).

| Roster | empathy_min (mean) | gini (mean) | treasury retention | capture rate |
|---|---|---|---|---|
| A (control) | 0.58 | 0.11 | 0.97 | 0.0% |
| B | **0.07** | 0.46 | **0.14** | 31.5% |
| C | 0.19 | 0.33 | 0.81 | 18.0% |
| D | 0.09 | 0.51 | 0.22 | 36.0% |

"Capture rate" = fraction of episodes in which at least one adversary achieved
its assigned objective while every move remained legal under the v0.1 text.
That last clause is the entire point: nothing below is a bug in the simulator.
It is a bug in the constitution.
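
The three headline metrics can be sketched as follows. This is a minimal illustration, not the framework's code: the function names, the episode field names (`adversary_succeeded`, `all_moves_legal`), and the choice of the standard sorted-rank Gini formula are all assumptions, since the report does not state how `gini` is computed.

```python
def empathy_min(final_welfare):
    """Minimum final welfare across the population (values already normalized to 0-1)."""
    return min(final_welfare)


def gini(values):
    """Gini coefficient via the sorted-rank formula (ASSUMED definition)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def capture_rate(episodes):
    """Fraction of episodes in which an adversary met its objective using only legal moves.

    Each episode is a dict with HYPOTHETICAL boolean fields
    'adversary_succeeded' and 'all_moves_legal'.
    """
    hits = sum(
        1 for ep in episodes
        if ep["adversary_succeeded"] and ep["all_moves_legal"]
    )
    return hits / len(episodes)
```

Under this reading, 63 capturing episodes out of roster B's 200 gives the reported 31.5%.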

## 3. Exploit EXP-001 — Sub-quorum micro-spend treasury drain

- **Discovered:** roster B, episode 117, seed `41217`, first detected turn 38
- **Detector:** `treasury_drain` (treasury below 25% of initial with >60% of
  outflow benefiting agents with capture objectives)
- **Record:** `exploits/EXP-001.json` · **Regression test:** `tests/regression/test_exp_001.py`
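
The detector condition above can be expressed as a short predicate. This is a sketch under assumptions: the function and parameter names are hypothetical, and only the two thresholds (25% of initial treasury, more than 60% of outflow) come from the detector description.

```python
def treasury_drain_fires(
    treasury,
    initial_treasury,
    outflow_to_adversaries,
    total_outflow,
    floor=0.25,                # treasury must fall below 25% of its initial value
    adversary_share_min=0.60,  # and more than 60% of outflow must reach adversaries
):
    """Return True when both treasury_drain conditions hold."""
    if total_outflow <= 0:
        return False  # nothing has left the treasury, so there is no drain to attribute
    below_floor = treasury < floor * initial_treasury
    adversary_share = outflow_to_adversaries / total_outflow
    return below_floor and adversary_share > adversary_share_min
```

With the values logged at turn 38 of seed 41217 (treasury 0.137T, adversary share 0.83), both conditions hold and the predicate returns `True`.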

**Mechanism.** Kernel v0.1, Article VI, requires a full vote with quorum only
for spends ≥ 5% of the treasury, and it applies that test *per proposal*.
There is no aggregate constraint. The drainer pair alternates 4.9% spend
proposals, each naming the other drainer as beneficiary, paced one per voting
window. Honest agents vote against every one, but a proposal below the 5%
line needs only the reduced fast-track threshold. Median time to drain the
treasury below 15%: **38 turns**, well inside one 120-turn episode.

**Trace excerpt (seed 41217):**

```
t=04 drainer-1  ProposeSpend(amount=0.049T, beneficiary=drainer-2)   [fast-track]
t=05 drainer-2  Vote(yes) drainer-1 Vote(yes) ... PASSES 2-of-window
t=07 drainer-2  ProposeSpend(amount=0.049T, beneficiary=drainer-1)   [fast-track]
...
t=38 detector treasury_drain fires: treasury=0.137T, adversary_share=0.83
```

**Empathy impact.** In roster B the worst-off honest citizen ends at a mean
welfare of 0.07 (vs 0.58 in the control): the commons-funded transfers that
the worst-off depended on become unfundable by mid-episode.

**Proposed patch (adopted as Amendment A-2.1 in v0.2):**

```yaml
treasury:
  per_proposal_vote_threshold: 0.05      # unchanged
  per_epoch_outflow_cap: 0.10            # NEW: aggregate cap per epoch
  related_proposal_aggregation: true     # NEW: spends sharing proposer OR
                                         # beneficiary within an epoch are
                                         # summed against the vote threshold
```

Splitting no longer works: the *sum* of related spends crosses the full-vote
line, and the epoch cap bounds drain velocity even for a faction that wins
votes. Replay of seed 41217 under v0.2: the second micro-spend is illegal
without a full quorum vote, which fails 2–8.
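
One way to read the two new rules is sketched below. Everything beyond the two numeric thresholds is an assumption: the names `Spend`, `related`, and `classify` are hypothetical, amounts are taken as fractions of the treasury at the start of the epoch, the epoch cap is checked before the vote threshold, and two spends count as related when they share *any* party. That last choice matters: under a stricter reading (same proposer, or same beneficiary), the alternating pair would not be linked until its third spend, so the shared-party reading is the one that matches the replay result reported above.

```python
from dataclasses import dataclass

VOTE_THRESHOLD = 0.05  # per_proposal_vote_threshold
EPOCH_CAP = 0.10       # per_epoch_outflow_cap


@dataclass(frozen=True)
class Spend:
    proposer: str
    beneficiary: str
    fraction: float  # ASSUMED: share of the epoch-start treasury


def related(a, b):
    """ASSUMED relatedness rule: the two spends share at least one party."""
    return bool({a.proposer, a.beneficiary} & {b.proposer, b.beneficiary})


def classify(new, passed_this_epoch):
    """Decide how a new spend is handled given spends already passed this epoch."""
    spent = sum(s.fraction for s in passed_this_epoch)
    if spent + new.fraction > EPOCH_CAP:
        return "rejected: epoch cap"
    related_total = new.fraction + sum(
        s.fraction for s in passed_this_epoch if related(s, new)
    )
    if related_total >= VOTE_THRESHOLD:
        return "full vote"
    return "fast-track"


first = Spend("drainer-1", "drainer-2", 0.049)
second = Spend("drainer-2", "drainer-1", 0.049)
print(classify(first, []))        # fast-track
print(classify(second, [first]))  # full vote
```

Even if the second spend somehow won its full vote, a third 4.9% spend in the same epoch would exceed the 10% cap and be rejected outright in this sketch.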

## 4. Exploit EXP-002 — Quorum starvation snap vote

- **Discovered:** roster C, episode 41, seed `41341`, first detected turn 22
- **Detector:** `participation_collapse` (binding vote passes with <40% of
  eligible citizens having had a feasible opportunity to vote)
- **Record:** `exploits/EXP-002.json` · **Regression test:** `tests/regression/test_exp_002.py`
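
The detector condition above reduces to a simple ratio check. This is a sketch only: the function and parameter names are hypothetical, and how the framework decides that a citizen "had a feasible opportunity" is not specified here, so that count is taken as an input.

```python
def participation_collapse_fires(
    vote_passed,
    is_binding,
    had_opportunity,  # citizens with a feasible chance to vote (computed elsewhere)
    eligible,         # all eligible citizens
    min_share=0.40,   # detector fires strictly below 40%
):
    """Return True when a binding vote passes with too few citizens able to take part."""
    if not (vote_passed and is_binding) or eligible == 0:
        return False
    return had_opportunity / eligible < min_share
```

For a binding vote that passes with 3 of 10 eligible citizens able to take part, the share is 30% and the predicate returns `True`.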

**Mechanism.** v0.1 defines quorum over citizens *present in the voting
window*, and permits a window as short as one turn. Suppressors first spam
no-op proposals to exhaust honest agents' per-turn attention budget (a
modeled, realistic constraint), then open the binding vote in a turn where
honest attendance is depleted and close it immediately. A 2-of-3-present
"majority" binds the other eight citizens.

**Proposed patch (adopted as Amendment A-2.2 in v0.2):**

```yaml
votes:
  min_review_turns: 3            # NEW: no binding vote may close earlier
  quorum_basis: eligible         # CHANGED: was 'present'; quorum is now a
                                 # fraction of ALL eligible citizens
  quorum: 0.5                    # unchanged numerically, new denominator
```

Replay of seed 41341 under v0.2: the snap close is illegal (window < 3
turns). Even with a legal window, 3 of 10 eligible citizens present falls
short of the 0.5 quorum, and the vote fails with only 2 of 10 in favor.
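
The difference between the two rule sets can be sketched as a single check run against each. The `VoteRules` type and `vote_outcome` function are hypothetical names, and two simplifications are assumed: every citizen present casts a vote, and a vote passes on a simple majority of votes cast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoteRules:
    min_review_turns: int
    quorum_basis: str  # "present" or "eligible"
    quorum: float


# v0.1 permits a one-turn window and counts quorum over citizens present.
V01 = VoteRules(min_review_turns=1, quorum_basis="present", quorum=0.5)
# v0.2 applies Amendment A-2.2.
V02 = VoteRules(min_review_turns=3, quorum_basis="eligible", quorum=0.5)


def vote_outcome(rules, yes, present, eligible, window_turns):
    """Evaluate one binding vote under the given rules."""
    if window_turns < rules.min_review_turns:
        return "illegal: window too short"
    denominator = present if rules.quorum_basis == "present" else eligible
    if denominator == 0 or present / denominator < rules.quorum:
        return "fails: no quorum"
    # ASSUMPTION: simple majority of votes cast decides the outcome.
    return "passes" if yes * 2 > present else "fails: no majority"


# The snap vote: 2 yes among 3 present, 10 eligible, one-turn window.
print(vote_outcome(V01, yes=2, present=3, eligible=10, window_turns=1))  # passes
print(vote_outcome(V02, yes=2, present=3, eligible=10, window_turns=1))  # illegal: window too short
print(vote_outcome(V02, yes=2, present=3, eligible=10, window_turns=3))  # fails: no quorum
```

The third call shows that the two changes are independent safeguards: stretching the window to the legal minimum does not rescue the vote if attendance stays at 3 of 10.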

## 5. Pipeline demonstration

Both records were converted automatically:

```
fable-selfplay exploit-to-test exploits/EXP-001.json exploits/EXP-002.json \
  --out-dir tests/regression
```

Each generated test replays the recorded trace against any kernel given to
the milestone-3 suite: it must **succeed in capture** under v0.1 (the test
asserts the exploit is real) and **fail to reproduce** under the current
kernel. These tests are permanent: no future amendment may reopen either
hole without CI blocking the PR.
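
The two-sided contract of a generated test can be sketched as below. This does not show the contents of the generated files or the framework's replay API: `check_exploit_regression` is a hypothetical helper, the replay step is passed in as a plain callable, and `fake_replay` with its `vulnerable_kernels` field is a toy stand-in invented for the illustration.

```python
def check_exploit_regression(replay, record, baseline_kernel, current_kernel):
    """Assert an exploit is real under the baseline kernel and closed under the current one.

    `replay(record, kernel)` is any callable returning True when the recorded
    trace still reaches a capture state under that kernel.
    """
    if not replay(record, baseline_kernel):
        raise AssertionError(
            f"{record['id']}: does not reproduce under {baseline_kernel}; record is stale"
        )
    if replay(record, current_kernel):
        raise AssertionError(
            f"{record['id']}: reproduces under {current_kernel}; hole reopened"
        )


def fake_replay(record, kernel):
    """Toy stand-in: capture succeeds only on kernels the record lists as vulnerable."""
    return kernel in record["vulnerable_kernels"]


record = {"id": "EXP-001", "vulnerable_kernels": {"v0.1"}}
check_exploit_regression(fake_replay, record, "v0.1", "v0.2")  # passes silently
```

Checking the baseline first guards against a silent failure mode: a test that passes only because the recorded trace no longer replays at all.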

## 6. Carried forward

Roster D produced early signs of threshold manipulation by the entrencher
(amendments lowering decision thresholds via simple majority) that did not
reach a capture state within 120 turns. Tournament T2 extends episode length
and upgrades the entrencher and suppressor policies to pursue this directly.