# FablePool Interoperability Guide

This document is for engineers building a **second implementation** of the
FablePool protocol that interoperates with this reference implementation:
same wire format, same log semantics, same sync behavior, same capability
rules.

## 0. Normative sources

Where this prose and the test vectors disagree, **the vectors win**:

- `conformance/vectors/canonicalization.json` — input documents and their
  exact canonical byte serializations (hex) and content hashes.
- `conformance/vectors/signing.json` — fixed Ed25519 keys, messages,
  signatures, and key/author identifier derivations.
- `conformance/vectors/op_scenarios.json` — full op sequences with expected
  accept/reject verdicts and expected derived state, covering chain
  validation, refutation cascade, sync convergence, and capability
  grant/revoke behavior.

`conformance/run_conformance.py` executes all three against this reference
implementation; `conformance/README.md` explains how to point the same
vectors at your own code. **Your implementation conforms when it reproduces
every vector byte-for-byte and verdict-for-verdict.**

## 1. Data model in one paragraph

All state is a set of **ops**. Each op is authored by exactly one key, carries
a per-author sequence number that increases by exactly 1 with each op (§4)
and a hash pointer to the author's previous op (forming one append-only chain
per author), and is signed by that key over its canonical bytes with the
signature field absent (§3). The global state of a node is the
union of all valid ops it has ingested; **derived state is a deterministic
function of that set**, so two nodes with the same op set always agree.

## 2. Canonicalization

Ops are JSON. Before hashing or signing, an op is serialized to **canonical
bytes**, JCS-style (RFC 8785 intent):

- UTF-8 encoding, no insignificant whitespace.
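- Object keys sorted lexicographically. RFC 8785 compares keys as sequences
  of UTF-16 code units; the key-ordering cases in `canonicalization.json` pin
  the exact order.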
- No duplicate keys; duplicate keys make a document invalid.
- Strings serialized with minimal escaping (only what JSON requires).
- Numbers in protocol fields are integers; implementations MUST NOT emit
  non-integer numbers in op envelopes. Fractional quantities used by the
  protocol (e.g. confidence) are carried as scaled integers or strings as
  defined by the claim body schema exercised in the vectors.
- `null`, `true`, `false` serialized literally.

The **content identifier** of any document is the SHA-256 of its canonical
bytes. An op's id is the content identifier of the op with its signature
field removed (the signature signs the id's preimage; it cannot sign itself).

Validate your canonicalizer against every case in
`canonicalization.json` — including the Unicode, nesting, and key-ordering
edge cases — before anything else. Every other guarantee depends on
byte-exact agreement here.
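
The rules above can be approximated with Python's standard `json` and `hashlib` modules. This is a minimal sketch, not the reference canonicalizer: `sort_keys=True` orders keys by Unicode code point, which can differ from UTF-16 code-unit order for keys containing characters outside the Basic Multilingual Plane, and rejecting every float anywhere in the document is an assumption that goes slightly beyond the integer rule as stated. The helper names and the hex encoding of the hash are hypothetical; the vectors remain the authority.

```python
import hashlib
import json


def _reject_duplicate_keys(pairs):
    """object_pairs_hook that fails on repeated keys instead of keeping the last."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate key in JSON object")
    return dict(pairs)


def parse_strict(text):
    """Parse JSON text, treating duplicate keys as an invalid document."""
    return json.loads(text, object_pairs_hook=_reject_duplicate_keys)


def _reject_floats(value):
    # Assumption: no float may appear anywhere in a canonicalized document.
    if isinstance(value, float):
        raise ValueError("non-integer number in document")
    if isinstance(value, dict):
        for item in value.values():
            _reject_floats(item)
    elif isinstance(value, list):
        for item in value:
            _reject_floats(item)


def canonical_bytes(doc):
    """Serialize with sorted keys, no whitespace, minimal escaping, UTF-8."""
    _reject_floats(doc)
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def content_id(doc):
    """SHA-256 of the canonical bytes, hex-encoded (encoding is an assumption)."""
    return hashlib.sha256(canonical_bytes(doc)).hexdigest()
```

Running each vector's input through `parse_strict` and comparing `canonical_bytes(...).hex()` with the expected hex is a quick way to find where a sketch like this diverges.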

## 3. Keys, identifiers, signatures

- Signature scheme: **Ed25519** (RFC 8032). The reference implementation uses
  the `cryptography` library's `Ed25519PrivateKey` / `Ed25519PublicKey`.
- An **author identifier** is derived from the Ed25519 public key as
  exercised in `signing.json` (hash-of-public-key form; reproduce the vector
  derivations exactly).
- The signature is computed over the canonical bytes of the op **with the
  signature field absent**, then attached. Verification: detach the
  signature, re-canonicalize, verify against the author's public key.
- Implementations MUST reject ops whose author identifier does not match the
  presented public key, whose signature fails, or whose canonical bytes
  (signature field removed, per §2) do not hash to the claimed op id.
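
The three rejection checks can be sketched as follows. To keep the example free of third-party dependencies, the Ed25519 primitive is injected as a callable `verify_sig(public_key, signature, message) -> bool`; with the `cryptography` library, such a callable would typically wrap `Ed25519PublicKey.from_public_bytes(public_key).verify(signature, message)`, which raises `InvalidSignature` on failure. The field names `sig` and `author`, the hex signature encoding, and the plain SHA-256 author-id derivation are all assumptions made for illustration; `signing.json` and `op_scenarios.json` fix the real ones.

```python
import hashlib
import json

SIG_FIELD = "sig"        # hypothetical field name
AUTHOR_FIELD = "author"  # hypothetical field name


def canonical_bytes(doc):
    # Simplified canonicalizer; see §2 for the full rules.
    return json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def derive_author_id(public_key):
    # Assumption: hex SHA-256 of the raw public key bytes.
    return hashlib.sha256(public_key).hexdigest()


def check_op(op, claimed_id, public_key, verify_sig):
    """Return "accept" or a "reject: ..." verdict for one op."""
    # Detach the signature and re-canonicalize to get the signed preimage.
    unsigned = {k: v for k, v in op.items() if k != SIG_FIELD}
    preimage = canonical_bytes(unsigned)

    if op.get(AUTHOR_FIELD) != derive_author_id(public_key):
        return "reject: author id does not match public key"
    if hashlib.sha256(preimage).hexdigest() != claimed_id:
        return "reject: op id mismatch"
    try:
        signature = bytes.fromhex(op[SIG_FIELD])
    except (KeyError, TypeError, ValueError):
        return "reject: missing or malformed signature"
    if not verify_sig(public_key, signature, preimage):
        return "reject: bad signature"
    return "accept"
```

Because the op id and the signature share one preimage, a single canonicalization pass serves both checks.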

## 4. The op envelope and op kinds

Every op carries, at minimum: a protocol version, an op kind, the author
identifier, the per-author sequence number, the hash of the author's previous
op (absent/null for the first), an author-asserted timestamp, a kind-specific
body, and the signature. Exact field names and shapes are fixed by
`op_scenarios.json`.

Op kinds and their semantics:

| Kind | Semantics |
|------|-----------|
| **evidence** | Immutable imported fact. Body records adapter id, source identifier, and payload. Evidence is never shipped to delegates. |
| **claim** | Derived statement: subject, predicate, value, confidence, the **rule id** that produced it, and the list of op ids it derives from (evidence and/or prior claims). These references are the edges of the derivation graph. |
| **refute** | Marks a target claim invalid, with an optional reason. Authored by the user. |
| **correct** | Refutes a target claim and asserts a replacement value in one atomic op. |
| **cap-grant** | Capability grant: delegate public key / author id, scope (claim-predicate allowlist), optional expiry. |
| **cap-revoke** | References a grant op id and terminates it. |
| **ack** | Signed acknowledgment by a delegate that it has ingested a specific op (used for verifiable revocation, §7). |

**Validation rules (MUST):**

1. Signature and op-id integrity per §3.
2. Per-author chain integrity: sequence numbers strictly increase by 1;
   the previous-hash pointer matches the author's prior accepted op. Ops that
   fork an author's chain are rejected (the reference implementation rejects
   the later-arriving branch and surfaces the fork as an integrity alert —
   the scenarios in `op_scenarios.json` pin the exact behavior).
3. Referential validity at *application* time, not ingest time: a claim
   referencing an op the node hasn't seen yet is stored but held
   **pending** until its references arrive (sync can deliver out of order
   across authors; never within an author's chain).
4. Unknown op kinds: store and re-gossip (forward compatibility), but never
   derive from them and never count them toward derived state.
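
Rule 2 can be illustrated with a small per-author chain tracker. This is a sketch under stated assumptions: the first op of a chain is taken to have sequence number 1 and a null previous-hash, a fork is detected as an unseen op id at an already-occupied sequence number, and the verdict strings and the `AuthorChains` class are hypothetical. The fork scenarios in `op_scenarios.json` define the behavior to match.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorChains:
    heads: dict = field(default_factory=dict)   # author -> (seq, op_id)
    seen: set = field(default_factory=set)      # accepted op ids (dedupe)
    alerts: list = field(default_factory=list)  # recorded fork attempts

    def ingest(self, op_id, author, seq, prev):
        """Check one op against its author's chain and return a verdict."""
        if op_id in self.seen:
            return "duplicate"
        head = self.heads.get(author)
        # Assumption: chains start at seq 1 with no previous-hash pointer.
        expected_seq = 1 if head is None else head[0] + 1
        expected_prev = None if head is None else head[1]
        if seq == expected_seq and prev == expected_prev:
            self.heads[author] = (seq, op_id)
            self.seen.add(op_id)
            return "accept"
        if head is not None and seq <= head[0]:
            # A different op at an occupied position: later branch loses.
            self.alerts.append((author, seq, op_id))
            return "reject: fork"
        return "reject: gap or bad previous-hash"
```

Signature checks (rule 1) and pending references (rule 3) are deliberately left out so the chain logic stays visible.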

## 5. Derivation, provenance, confidence, cascade

Derived state is computed by a deterministic reduce over the op set:

1. Collect all valid evidence ops.
2. Apply derivation rules (each rule has a stable id) to produce claims; each
   claim records its rule id and input op ids — this is its **provenance**.
3. Confidence is assigned by the rule from the strength/recency/agreement of
   its inputs and is monotonically non-increasing along derivation depth
   (a claim is never more confident than its weakest input).
4. A **refute** op invalidates its target claim and, transitively, every
   claim whose provenance includes an invalidated claim (**cascade
   invalidation**). Re-derivation then runs over the surviving inputs;
   because rules are deterministic, all nodes converge on the same
   post-correction state.
5. A **correct** op is a refute plus a user-asserted claim (rule id =
   user-assertion) with maximal confidence; downstream rules may re-derive
   from the corrected value.

The cascade scenarios in `op_scenarios.json` define expected invalidation
sets exactly; reproduce them.
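
Cascade invalidation (step 4) is a reachability computation over the derivation graph. The sketch below assumes the graph is already available as a mapping from claim id to the op ids it derives from; the function name and input shape are hypothetical, and re-derivation over the surviving inputs is not shown.

```python
from collections import defaultdict, deque


def cascade_invalidate(derives_from, refuted):
    """Return the refuted claims plus every claim that transitively derives from them.

    derives_from: claim id -> iterable of op ids in its provenance.
    refuted: iterable of claim ids targeted by refute/correct ops.
    """
    # Reverse the provenance edges: input op id -> claims that depend on it.
    dependents = defaultdict(set)
    for claim_id, inputs in derives_from.items():
        for input_id in inputs:
            dependents[input_id].add(claim_id)

    invalid = set(refuted)
    queue = deque(invalid)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in invalid:
                invalid.add(dependent)
                queue.append(dependent)
    return invalid
```

For example, with `c2` derived from `c1` and `c3` derived from `c2` and evidence `e2`, refuting `c1` invalidates `c1`, `c2`, and `c3`, while a claim derived only from `e2` survives.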

## 6. Sync protocol and conflict handling

Sync is a transport-agnostic, two-message-per-round reconciliation:

1. **Heads exchange.** Each peer sends a map of `author id → (latest seq,
   latest op id)` for every author chain it knows.
2. **Delta fetch.** Each peer requests, per author, all ops above its own
   high-water mark, and validates each ingested op per §4 (signatures,
   chains, dedupe by op id).
3. Repeat until heads match. The reference `fablepool/sync.py` +
   `fablepool/transport.py` implement this over an in-process/file transport;
   any transport that delivers the same messages interoperates.
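
The round structure above can be sketched with two in-memory logs. Everything here is illustrative: a log is modeled as a dict from author id to an ordered list of `(seq, op id)` pairs, the function names are hypothetical, and the per-op validation of §4 is reduced to a plain append. It is not the message format used by `fablepool/sync.py`.

```python
def heads(log):
    """Map author id -> (latest seq, latest op id) for every non-empty chain."""
    return {author: chain[-1] for author, chain in log.items() if chain}


def delta_request(local_heads, remote_heads):
    """Per author, the local high-water mark above which ops are wanted."""
    want = {}
    for author, (remote_seq, _) in remote_heads.items():
        local_seq = local_heads.get(author, (0, None))[0]
        if remote_seq > local_seq:
            want[author] = local_seq
    return want


def serve_delta(log, want):
    """Answer a delta request with all ops above each requested mark."""
    return {author: [op for op in log.get(author, []) if op[0] > mark]
            for author, mark in want.items()}


def ingest(log, delta):
    # A real node validates every op per §4 here; this sketch only appends.
    for author, ops in delta.items():
        log.setdefault(author, []).extend(ops)


def sync(a, b, max_rounds=10):
    """Run rounds until heads match; return the number of rounds used."""
    for rounds_done in range(max_rounds):
        heads_a, heads_b = heads(a), heads(b)
        if heads_a == heads_b:
            return rounds_done
        # Compute both deltas from pre-round state, then apply them.
        delta_for_a = serve_delta(b, delta_request(heads_a, heads_b))
        delta_for_b = serve_delta(a, delta_request(heads_b, heads_a))
        ingest(a, delta_for_a)
        ingest(b, delta_for_b)
    raise RuntimeError("heads did not converge")
```

Two peers whose heads show the same sequence number with different op ids never converge in this sketch; that is the forked-chain case handled by validation rule 2 in §4.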

**Convergence guarantee:** state is a deterministic function of the op set
(§5), so once op sets are equal, derived state is equal. The test
`tests/test_sync_convergence.py` and the sync scenarios in the vectors pin
this down, including interleaved and repeated partial syncs.

**Conflict handling.** "Conflicts" are concurrent ops about the same claim
subject/predicate from different devices of the same user. Resolution is
deterministic and order-independent:

- A refutation or correction always dominates the claims it targets,
  regardless of arrival order or timestamp.
- Among concurrent *assertions* of the same subject/predicate, the winner is
  chosen by a total order: user-asserted beats derived; then higher
  confidence; then later author timestamp; then lexicographically greater op
  id as the final tiebreak. Losers remain in the log (auditable) but do not
  contribute to derived state.
- Nothing is ever merged destructively; resolution is a pure function any
  node recomputes identically.
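
The total order among concurrent assertions maps naturally onto a tuple sort key. In this sketch the field names (`rule`, `confidence`, `ts`, `id`) are hypothetical, confidence is assumed to be a scaled integer, and op ids are assumed to be equal-length lowercase hex strings so that string comparison matches the intended lexicographic order. Refutation dominance is assumed to have been applied before this step.

```python
def assertion_rank(claim):
    """Sort key implementing the tiebreak chain; a greater tuple wins."""
    return (
        claim["rule"] == "user-assertion",  # user-asserted beats derived
        claim["confidence"],                # then higher confidence
        claim["ts"],                        # then later author timestamp
        claim["id"],                        # then greater op id
    )


def pick_winner(assertions):
    """Choose the winning assertion; the result does not depend on input order."""
    return max(assertions, key=assertion_rank)
```

Because op ids are unique, no two distinct assertions share a rank, so `pick_winner` returns the same claim however the candidates are ordered on arrival.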

## 7. Capabilities, redaction, and verifiable revocation

**Grant.** A `cap-grant` op signed by the user names the delegate's key and a
scope: an allowlist of claim predicates (e.g. `fitness/*`). Optional expiry
timestamp.

**Serving a capability.** When a delegate syncs, the serving node filters:

- Only **claim** ops whose predicate is in scope (plus the grant chain the
  delegate needs to validate its own access).
- **Evidence is never sent.** Each in-scope claim is delivered in a
  **redacted projection**: its provenance references are replaced by the
  referenced ops' content hashes (opaque commitments). The delegate can
  verify the claim's signature and later audit provenance if the user chooses
  to reveal it, but learns nothing about the evidence from the slice itself.
- Redacted projections are themselves signed by the serving node so the
  delegate can prove what it was given.
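
The scope filter can be sketched as below. The matching rule is an assumption: a pattern ending in `/*` is treated as a prefix match and any other pattern as an exact match. The field names (`kind`, `predicate`, `scope`, `expiry`) and the choice to treat a grant as expired once `now` reaches `expiry` are also hypothetical. Redaction, signing of projections, and delivery of the grant chain are not shown.

```python
def predicate_in_scope(predicate, scope):
    """True if the predicate matches any pattern in the allowlist."""
    for pattern in scope:
        if pattern.endswith("/*"):
            # "fitness/*" matches anything starting with "fitness/".
            if predicate.startswith(pattern[:-1]):
                return True
        elif predicate == pattern:
            return True
    return False


def grant_active(grant, now):
    """A grant without an expiry never expires on its own."""
    expiry = grant.get("expiry")
    return expiry is None or now < expiry


def ops_for_delegate(ops, grant, now):
    """Select the claim ops a delegate may receive; evidence is never included."""
    if not grant_active(grant, now):
        return []
    return [op for op in ops
            if op["kind"] == "claim"
            and predicate_in_scope(op["predicate"], grant["scope"])]
```

Filtering on `kind == "claim"` before checking the predicate keeps evidence and every other op kind out of the slice even if such an op happened to carry a predicate-like field.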

**Revoke.** A `cap-revoke` op references the grant. From the moment a user
node ingests the revocation:

1. It MUST refuse to serve the capability (heads exchange with the delegate
   includes the revocation; delta requests under the revoked grant are
   answered with the revocation op and nothing else).
2. A **conforming delegate** MUST, upon ingesting the revocation: stop
   serving or acting on the revoked scope, mark its copies revoked, and emit
   a signed **ack** op naming the revocation's op id.
3. The user's nodes ingest the ack. Revocation is then **verifiably honored**:
   the user holds the delegate's signature over the revocation id. A delegate
   that never acks is visibly non-compliant — that is the enforceable
   guarantee. (The protocol cannot erase bytes from a defector's disk; see
   non-goals.)
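
On the user's side, checking which revocations have been verifiably honored amounts to matching acks to revocations. This sketch assumes every op has already passed the signature checks of §3 and uses hypothetical field names (`id`, `kind`, `author`, `delegate`, `grant`, `target`); an ack counts only if it is authored by the delegate named in the revoked grant.

```python
def revocation_status(ops):
    """Map each cap-revoke op id to True once the right delegate has acked it."""
    grants = {op["id"]: op for op in ops if op["kind"] == "cap-grant"}
    revokes = {op["id"]: op for op in ops if op["kind"] == "cap-revoke"}
    honored = {revoke_id: False for revoke_id in revokes}

    for op in ops:
        if op["kind"] != "ack" or op["target"] not in revokes:
            continue
        # Follow revoke -> grant to find which delegate must sign the ack.
        grant = grants.get(revokes[op["target"]]["grant"])
        if grant is not None and op["author"] == grant["delegate"]:
            honored[op["target"]] = True
    return honored
```

Entries that stay `False` are the visibly non-compliant delegates described in step 3.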

`tests/test_capability_revocation.py` and the capability scenarios in
`op_scenarios.json` pin the exact filtering, redaction, refusal, and ack
behaviors.

## 8. Export and portability

A node's entire state is its op log. An export is the ordered op set; an
import ingests that set with full validation (§4). Any conforming
implementation can adopt another
node's export wholesale. There is no hidden state: derived claims, indexes,
and capability tables are all recomputable from the log.

## 9. Versioning and extension rules

- Every op carries a protocol version. Implementations MUST reject ops whose
  major version is higher than the one they implement, and MUST accept (store,
  gossip, ignore semantically) unknown op kinds and unknown *non-critical*
  body fields within the current major version.
- New derivation rules are additive: a rule id never changes meaning. A new
  implementation may ship extra rules; cross-implementation convergence of
  *derived* state requires agreeing on the rule set, which is why rule ids
  appear in claim provenance — disagreements are detectable and attributable.
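
The version and unknown-kind rules can be combined into a single triage step. This is a sketch under assumptions: the version is taken to be a `"major.minor"` string in a hypothetical `version` field, the supported major version is set to 1 purely for illustration, and the verdict names are invented. Handling of lower major versions and of critical body fields is not specified in this section, so the sketch does not distinguish them.

```python
SUPPORTED_MAJOR = 1  # illustrative value, not taken from the protocol
KNOWN_KINDS = {"evidence", "claim", "refute", "correct",
               "cap-grant", "cap-revoke", "ack"}


def triage(op):
    """Decide how an incoming op is treated before any kind-specific logic."""
    # Assumption: version looks like "1.3"; only the major part matters here.
    major = int(str(op["version"]).split(".")[0])
    if major > SUPPORTED_MAJOR:
        return "reject"
    if op["kind"] not in KNOWN_KINDS:
        # Forward compatibility: keep and re-gossip, never derive from it.
        return "store-and-gossip"
    return "apply"
```

Keeping this check ahead of kind-specific parsing means an op from a newer minor version with an unfamiliar kind is still stored and forwarded rather than dropped.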

## 10. Conformance procedure for a second implementation

1. Pass every case in `canonicalization.json` (byte-exact).
2. Pass every case in `signing.json` (signatures and identifier derivations).
3. Replay every scenario in `op_scenarios.json`: feed the listed ops in the
   listed order, assert each accept/reject verdict, then assert the expected
   derived state, sync outcomes, and capability behavior.
4. Cross-test live: sync your node against this reference node over the file
   transport in both directions; both must converge, and a grant issued by
   one must be honored and revocable by the other.

`conformance/README.md` documents the runner's report format so results can
be published alongside your implementation.