# FablePool Interoperability Guide

This document is for engineers building a **second implementation** of the FablePool protocol that interoperates with this reference implementation: same wire format, same log semantics, same sync behavior, same capability rules.

## 0. Normative sources

Where this prose and the test vectors disagree, **the vectors win**:

- `conformance/vectors/canonicalization.json`: input documents and their exact canonical byte serializations (hex) and content hashes.
- `conformance/vectors/signing.json`: fixed Ed25519 keys, messages, signatures, and key/author identifier derivations.
- `conformance/vectors/op_scenarios.json`: full op sequences with expected accept/reject verdicts and expected derived state, covering chain validation, refutation cascade, sync convergence, and capability grant/revoke behavior.

`conformance/run_conformance.py` executes all three against this reference implementation; `conformance/README.md` explains how to point the same vectors at your own code. **Your implementation conforms when it reproduces every vector byte-for-byte and verdict-for-verdict.**

## 1. Data model in one paragraph

All state is a set of **ops**. Each op is authored by exactly one key, carries a monotonically increasing per-author sequence number and a hash pointer to the author's previous op (forming one append-only chain per author), and is signed by that key over its canonical bytes. The global state of a node is the union of all valid ops it has ingested; **derived state is a deterministic function of that set**, so two nodes with the same op set always agree.

## 2. Canonicalization

Ops are JSON. Before hashing or signing, an op is serialized to **canonical bytes**, JCS-style (RFC 8785 intent):

- UTF-8 encoding, no insignificant whitespace.
- Object keys sorted lexicographically.
- No duplicate keys; duplicate keys make a document invalid.
- Strings serialized with minimal escaping (only what JSON requires).
- Numbers in protocol fields are integers; implementations MUST NOT emit non-integer numbers in op envelopes. Fractional quantities used by the protocol (e.g. confidence) are carried as scaled integers or strings as defined by the claim body schema exercised in the vectors.
- `null`, `true`, `false` serialized literally.

The **content identifier** of any document is the SHA-256 of its canonical bytes. An op's id is the content identifier of the op with its signature field removed (the signature signs the id's preimage; it cannot sign itself).

Validate your canonicalizer against every case in `canonicalization.json`, including the Unicode, nesting, and key-ordering edge cases, before anything else. Every other guarantee depends on byte-exact agreement here.

## 3. Keys, identifiers, signatures

- Signature scheme: **Ed25519** (RFC 8032). The reference implementation uses the `cryptography` library's `Ed25519PrivateKey` / `Ed25519PublicKey`.
- An **author identifier** is derived from the Ed25519 public key as exercised in `signing.json` (hash-of-public-key form; reproduce the vector derivations exactly).
- The signature is computed over the canonical bytes of the op **with the signature field absent**, then attached. Verification: detach the signature, re-canonicalize, verify against the author's public key.
- Implementations MUST reject ops whose author identifier does not match the presented public key, whose signature fails, or whose canonical bytes do not hash to the claimed op id.

## 4. The op envelope and op kinds

Every op carries, at minimum: a protocol version, an op kind, the author identifier, the per-author sequence number, the hash of the author's previous op (absent/null for the first), an author-asserted timestamp, a kind-specific body, and the signature. Exact field names and shapes are fixed by `op_scenarios.json`.

Op kinds and their semantics:

| Kind | Semantics |
|------|-----------|
| **evidence** | Immutable imported fact. Body records adapter id, source identifier, and payload. Evidence is never shipped to delegates. |
| **claim** | Derived statement: subject, predicate, value, confidence, the **rule id** that produced it, and the list of op ids it derives from (evidence and/or prior claims). These references are the edges of the derivation graph. |
| **refute** | Marks a target claim invalid, with an optional reason. Authored by the user. |
| **correct** | Refutes a target claim and asserts a replacement value in one atomic op. |
| **cap-grant** | Capability grant: delegate public key / author id, scope (claim-predicate allowlist), optional expiry. |
| **cap-revoke** | References a grant op id and terminates it. |
| **ack** | Signed acknowledgment by a delegate that it has ingested a specific op (used for verifiable revocation, §7). |

**Validation rules (MUST):**

1. Signature and op-id integrity per §3.
2. Per-author chain integrity: sequence numbers strictly increase by 1; the previous-hash pointer matches the author's prior accepted op. Ops that fork an author's chain are rejected (the reference implementation rejects the later-arriving branch and surfaces the fork as an integrity alert; the scenarios in `op_scenarios.json` pin the exact behavior).
3. Referential validity at *application* time, not ingest time: a claim referencing an op the node hasn't seen yet is stored but held **pending** until its references arrive (sync can deliver out of order across authors; never within an author's chain).
4. Unknown op kinds: store and re-gossip (forward compatibility), but never derive from them and never count them toward derived state.

## 5. Derivation, provenance, confidence, cascade

Derived state is computed by a deterministic reduce over the op set:

1. Collect all valid evidence ops.
2. Apply derivation rules (each rule has a stable id) to produce claims; each claim records its rule id and input op ids; this is its **provenance**.
3. Confidence is assigned by the rule from the strength/recency/agreement of its inputs and is monotonically non-increasing along derivation depth (a claim is never more confident than its weakest input).
4. A **refute** op invalidates its target claim and, transitively, every claim whose provenance includes an invalidated claim (**cascade invalidation**). Re-derivation then runs over the surviving inputs; because rules are deterministic, all nodes converge on the same post-correction state.
5. A **correct** op is a refute plus a user-asserted claim (rule id = user-assertion) with maximal confidence; downstream rules may re-derive from the corrected value.

The cascade scenarios in `op_scenarios.json` define expected invalidation sets exactly; reproduce them.

## 6. Sync protocol and conflict handling

Sync is a transport-agnostic, two-message-per-round reconciliation:

1. **Heads exchange.** Each peer sends a map of `author id → (latest seq, latest op id)` for every author chain it knows.
2. **Delta fetch.** Each peer requests, per author, all ops above its own high-water mark, and validates each ingested op per §4 (signatures, chains, dedupe by op id).
3. Repeat until heads match.

The reference `fablepool/sync.py` + `fablepool/transport.py` implement this over an in-process/file transport; any transport that delivers the same messages interoperates.

**Convergence guarantee:** state is a deterministic function of the op set (§5), so once op sets are equal, derived state is equal. The test `tests/test_sync_convergence.py` and the sync scenarios in the vectors pin this down, including interleaved and repeated partial syncs.

**Conflict handling.** "Conflicts" are concurrent ops about the same claim subject/predicate from different devices of the same user. Resolution is deterministic and order-independent:

- A refutation or correction always dominates the claims it targets, regardless of arrival order or timestamp.
- Among concurrent *assertions* of the same subject/predicate, the winner is chosen by a total order: user-asserted beats derived; then higher confidence; then later author timestamp; then lexicographically greater op id as the final tiebreak. Losers remain in the log (auditable) but do not contribute to derived state.
- Nothing is ever merged destructively; resolution is a pure function any node recomputes identically.

## 7. Capabilities, redaction, and verifiable revocation

**Grant.** A `cap-grant` op signed by the user names the delegate's key and a scope: an allowlist of claim predicates (e.g. `fitness/*`). Optional expiry timestamp.

**Serving a capability.** When a delegate syncs, the serving node filters:

- Only **claim** ops whose predicate is in scope (plus the grant chain the delegate needs to validate its own access).
- **Evidence is never sent.** Each in-scope claim is delivered in a **redacted projection**: its provenance references are replaced by the referenced ops' content hashes (opaque commitments). The delegate can verify the claim's signature and later audit provenance if the user chooses to reveal it, but learns nothing about the evidence from the slice itself.
- Redacted projections are themselves signed by the serving node so the delegate can prove what it was given.

**Revoke.** A `cap-revoke` op references the grant. From the moment a user node ingests the revocation:

1. It MUST refuse to serve the capability (heads exchange with the delegate includes the revocation; delta requests under the revoked grant are answered with the revocation op and nothing else).
2. A **conforming delegate** MUST, upon ingesting the revocation: stop serving or acting on the revoked scope, mark its copies revoked, and emit a signed **ack** op naming the revocation's op id.
3. The user's nodes ingest the ack. Revocation is then **verifiably honored**: the user holds the delegate's signature over the revocation id.
A delegate that never acks is visibly non-compliant; that is the enforceable guarantee. (The protocol cannot erase bytes from a defector's disk; see non-goals.)

`tests/test_capability_revocation.py` and the capability scenarios in `op_scenarios.json` pin the exact filtering, redaction, refusal, and ack behaviors.

## 8. Export and portability

A node's entire state is its op log. Export = the ordered op set; import = ingest with full validation. Any conforming implementation can adopt another node's export wholesale. There is no hidden state: derived claims, indexes, and capability tables are all recomputable from the log.

## 9. Versioning and extension rules

- Every op carries a protocol version. Implementations MUST reject ops with a higher major version and MUST accept (store, gossip, ignore semantically) unknown op kinds and unknown *non-critical* body fields within the current major version.
- New derivation rules are additive: a rule id never changes meaning. A new implementation may ship extra rules; cross-implementation convergence of *derived* state requires agreeing on the rule set, which is why rule ids appear in claim provenance: disagreements are detectable and attributable.

## 10. Conformance procedure for a second implementation

1. Pass every case in `canonicalization.json` (byte-exact).
2. Pass every case in `signing.json` (signatures and identifier derivations).
3. Replay every scenario in `op_scenarios.json`: feed the listed ops in the listed order, assert each accept/reject verdict, then assert the expected derived state, sync outcomes, and capability behavior.
4. Cross-test live: sync your node against this reference node over the file transport in both directions; both must converge, and a grant issued by one must be honored and revocable by the other.

`conformance/README.md` documents the runner's report format so results can be published alongside your implementation.
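
As a starting point for step 1 of the conformance procedure, the following is a minimal Python sketch of the §2 canonicalization rules and the §3 op-id rule. It is not the reference implementation. It makes several assumptions that the vectors may contradict: keys are sorted by UTF-16 code units (the RFC 8785 ordering), content identifiers are rendered as lowercase hex, and the signature field is named `sig` (a hypothetical name; the real field names are fixed by `op_scenarios.json`). All function names are illustrative.

```python
import hashlib
import json

SIGNATURE_FIELD = "sig"  # hypothetical name; op_scenarios.json fixes the real one


def _reject_duplicates(pairs):
    """object_pairs_hook that fails on duplicate keys instead of keeping the last."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError(f"duplicate key: {key!r}")
        obj[key] = value
    return obj


def parse_strict(text):
    """Parse JSON, treating duplicate object keys as an invalid document."""
    return json.loads(text, object_pairs_hook=_reject_duplicates)


def _encode(value):
    # Check None/True/False by identity first: bool is a subclass of int.
    if value is None:
        return "null"
    if value is True:
        return "true"
    if value is False:
        return "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # With ensure_ascii=False, json.dumps escapes only quotes,
        # backslashes, and control characters.
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, list):
        return "[" + ",".join(_encode(item) for item in value) + "]"
    if isinstance(value, dict):
        if not all(isinstance(k, str) for k in value):
            raise TypeError("object keys must be strings")
        # Assumption: RFC 8785 ordering, i.e. by UTF-16 code units.
        keys = sorted(value, key=lambda k: k.encode("utf-16-be"))
        members = (_encode(k) + ":" + _encode(value[k]) for k in keys)
        return "{" + ",".join(members) + "}"
    # Floats and any other type are rejected: protocol numbers are integers.
    raise TypeError(f"not allowed in a canonical document: {type(value).__name__}")


def canonical_bytes(doc):
    """Serialize a parsed document to its canonical UTF-8 bytes."""
    return _encode(doc).encode("utf-8")


def content_id(doc):
    """SHA-256 of the canonical bytes, as lowercase hex (encoding is an assumption)."""
    return hashlib.sha256(canonical_bytes(doc)).hexdigest()


def op_id(op):
    """Content identifier of the op with its signature field removed."""
    unsigned = {k: v for k, v in op.items() if k != SIGNATURE_FIELD}
    return content_id(unsigned)
```

With this sketch, `canonical_bytes(parse_strict('{"b": 1, "a": [true, null, "x"]}'))` yields `b'{"a":[true,null,"x"],"b":1}'`, and `op_id` returns the same value whether or not the `sig` field is present. Treat any disagreement with `canonicalization.json` as a bug in the sketch, since the vectors are normative.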