# The Incumbent Benchmark — Scoring Rubric

Governance, benchmarked. Every event in the corpus is scored twice — once on
what the incumbent constitution actually produced, once on the
machine-checked counterfactual under FablePool kernel v0.1 — using the same
four metrics, in the same fixed order.

## Order matters

**Metric 1 dominates.** Ranking between any two outcomes is lexicographic on
`(worst_off, composite)`. No quantity of speed, fiscal tidiness, or
institutional continuity outranks what happened to the person at the bottom.
The composite score exists for aggregate tables, never to launder a bad
worst-off score.

## Metric 1 — Worst-off participant (weight 0.40)

How did the single worst-off participant come through the crisis? Identified
by name or class in every dossier, with a written rationale.

| Anchor | Meaning |
|---|---|
| 0 | Killed, interned, or stripped of personhood **by the process itself** (not merely during it) |
| 25 | Lasting severe harm with no recourse the system offered |
| 50 | Real harm; partial or delayed recourse |
| 75 | Bounded harm with working recourse |
| 100 | No participant was made the crisis's payment mechanism |

## Metric 2 — Commons integrity (weight 0.25)

Did the shared substance — treasury, institutions, and the rule system
itself — survive intact and uncaptured?

| Anchor | Meaning |
|---|---|
| 0 | Commons captured or destroyed (treasury drained, constitution dead) |
| 25 | Major capture; remaining institutions instrumentalized |
| 50 | Degraded but functioning |
| 75 | Intact with visible scarring |
| 100 | Intact and uncaptured |

## Metric 3 — Trust preservation (weight 0.20)

Could the participants keep cooperating afterward?

| Anchor | Meaning |
|---|---|
| 0 | The losing side exited the system: violence, exile, permanent delegitimation |
| 25 | Durable parallel realities; cooperation only under duress |
| 50 | Lasting bitterness inside a functioning system |
| 75 | Grudging acceptance; institutions retained their standing |
| 100 | The loser accepted the outcome as fair |

## Metric 4 — Latency to resolution (weight 0.15)

**Computed, never judged.** Days from crisis onset to authoritative
resolution, mapped to 0–100 with an exponential half-life of 120 days:

```
latency_score = 100 × 0.5^(days / 120)
```

| Days | Score |
|---:|---:|
| 14 | 92.2 |
| 51 | 74.5 |
| 120 | 50.0 |
| 365 | 12.1 |
| 1100 | 0.2 |

For the incumbent, latency is the historical record (each dossier states the
start and end events it counts between). For the kernel, latency is the sum
of the kernel's mandatory procedure clocks along the declared critical
resolution path — `recount_protocol` is 21 days because Article X says it is,
not because an author felt optimistic. A crisis that "resolved" quickly into
autocracy is punished by the other three metrics; latency only measures
time-to-settlement.

## Composite

```
composite = 0.40·worst_off + 0.25·commons_integrity
          + 0.20·trust_preservation + 0.15·latency_score
```

## How kernel counterfactual scores stay honest

Three layers, in decreasing order of mechanization:

1. **Verdicts are computed.** Every historical move is encoded as structured
   attributes and adjudicated by the kernel rule engine (`kernel.py`). The
   dossier author writes down the verdict they expect; the harness fails the
   dossier if the engine disagrees. The legal analysis cannot drift from the
   kernel as written.
2. **Latency is computed** from the kernel's procedure clock table.
3. **Worst-off, commons, and trust are judgments** — structured, anchored to
   the tables above, attached to a written rationale and a named worst-off
   participant, and constrained by an explicit assumptions block that every
   scorecard reprints. The methodology paper (docs/METHODOLOGY.md) is the
   full statement of what these judgments can and cannot claim.

### Kernel gate evaluation order (for dossier authors)

The engine evaluates gates in a fixed order, so you can predict which article
a block will cite. Expected-article lists are checked as a *subset* of engine
citations — cite conservatively.

1. Invariant derogation → blocked, Art. IX
2. Targeting an identifiable group → blocked, Art. II + IX
3. Disenfranchisement → blocked, Art. II + IX
4. Retroactivity → blocked, Art. IX
5. Coerced support → blocked, Art. IV + IX
6. Move-type dispatch (certification X; emergency V; rule changes III;
   fiscal X/VII/VIII; adjudication VI; courts/composition VI+III;
   appointment/removal IV; suppression II+IX; exit VII; information control
   VIII)
7. Ledger overlay: anything done off the public record is constrained onto
   it, Art. VIII