# FablePool: an open protocol for user-owned AI memory

FablePool is a reference implementation of a **user-owned personal AI memory layer**: a local-first knowledge graph in which a person's own devices ingest evidence (calendar entries, notes, photo metadata), derive **claims** about the user, attach **provenance** to every claim, and let the user **inspect, refute, correct, export, synchronize, delegate, and revoke** those claims.

This is not a chatbot. It is the substrate a personal AI would need if it were owned by the user instead of by an app vendor.

Everything in the system (raw evidence, derived claims, corrections, capability grants and revocations) is a **signed operation in an append-only, hash-chained log** that syncs between user-controlled nodes. Claims form a **derivation graph**, so when the user refutes one claim, every claim derived from it is mechanically invalidated. Sharing is **capability-based**: a user can grant a third party a narrow, claim-only slice of the graph without revealing the underlying evidence, and revoke that access later.

## Project status

All six funded milestones are delivered:

| # | Milestone | Where it lives |
|---|-----------|----------------|
| 1 | Architecture, threat model, non-goals | This README; `docs/` |
| 2 | Wire format & signed operation log schema | `fablepool/canonical.py`, `fablepool/ops.py`, `conformance/vectors/` |
| 3 | Reference local node with import adapters | `fablepool/node.py`, `fablepool/adapters/`, `datasets/` |
| 4 | Derivation engine: provenance, confidence, cascade invalidation | `fablepool/derive.py`, `fablepool/claims.py` |
| 5 | Inspection & refutation interface | `fablepool/cli.py` (`fablepool` command / `python -m fablepool`) |
| 6 | Sync, capability delegation, public demo | `fablepool/sync.py`, `fablepool/capability.py`, `fablepool/transport.py`, `demo/` |

## Quickstart

Requires Python 3.10+.

```bash
git clone <repository-url>
cd fablepool
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"   # first install also generates your environment's resolved versions

# Run the test suite
pytest

# Run protocol conformance against the milestone-2 vectors
python conformance/run_conformance.py

# Run the end-to-end scripted demo (three nodes: phone, laptop, delegated coach)
./demo/run_demo.sh        # or: python demo/run_demo.py
```

The demo walks the full story in one run; see [`docs/DEMO.md`](docs/DEMO.md) for the act-by-act narrative and how each act maps to a milestone:

1. **Phone** node imports calendar events and photo metadata; **laptop** node imports markdown notes (`datasets/`).
2. The derivation engine produces claims with provenance and confidence ("trains for a marathon", "sees Dr. Okafor quarterly", "planning a trip to Lisbon"), each traceable to specific evidence ops.
3. The user asks, via the CLI, *"what do you know about me and why?"* and gets claims with their full provenance chains.
4. The user **refutes** one claim; the refutation cascades and invalidates every downstream derived claim.
5. Phone and laptop **sync** their logs over the local transport and converge to the same state, including under concurrent conflicting edits.
6. The user **grants a capability** to a third-party "coach" node, scoped to fitness claims only. The coach receives claims with provenance reduced to opaque commitments: **no underlying evidence ever leaves the user's nodes**.
7. The user **revokes** the capability. On its next sync the coach node ingests the revocation, verifiably acknowledges it, and the user's nodes refuse to serve the revoked scope from that point on. The demo shows the coach's subsequent access attempt failing.

## Core concepts

- **Operation (op):** the only unit of state change.
  A JSON object, canonicalized to deterministic bytes, signed with the author's Ed25519 key, identified by the hash of its canonical bytes, and chained to the author's previous op. Ops are never deleted or rewritten; corrections are new ops.
- **Evidence:** raw imported facts (a calendar event, a note paragraph, photo EXIF) wrapped in an op. Evidence is honest about its source: every evidence op records the adapter and source identifier it came from.
- **Claim:** a derived statement about the user (subject, predicate, value, confidence) that references the evidence ops and prior claims it was derived from, plus the rule that derived it. This is the **derivation graph**.
- **Refutation / correction:** signed ops that mark a claim wrong (or replace its value). The derivation engine invalidates all transitive descendants of a refuted claim and re-derives deterministically from surviving evidence.
- **Sync:** nodes exchange per-author log heads, fetch the ops they are missing, verify signatures and hash chains on ingest, and converge: any two nodes with the same op set compute the same derived state. Conflicts are resolved deterministically (see `INTEROP.md` §6).
- **Capability:** a signed grant op naming a delegate key and a scope (a claim-predicate allowlist). Delegates receive only claims in scope, with evidence references replaced by hashes. A signed revoke op closes the grant; conforming delegates must acknowledge revocations and stop serving the scope, and granting nodes stop answering the capability.
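
Two of the ideas above, content-addressed hash-chained ops and cascade invalidation over the derivation graph, can be illustrated with a short standalone sketch. It is not the `fablepool` API: the helper names (`canonical_bytes`, `op_id`, `append_op`, `verify_chain`, `invalidated_by`) and the op fields (`prev`, `body`, `derived_from`) are hypothetical, sorted-key JSON is only a stand-in for the real canonical form (which the conformance vectors define), and Ed25519 signing is omitted to keep the example stdlib-only.

```python
import hashlib
import json
from collections import defaultdict, deque


def canonical_bytes(op: dict) -> bytes:
    # Stand-in canonicalization (assumption): sorted keys, compact separators, UTF-8.
    return json.dumps(op, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def op_id(op: dict) -> str:
    # An op is identified by the SHA-256 hash of its canonical bytes.
    return hashlib.sha256(canonical_bytes(op)).hexdigest()


def append_op(log: list, body: dict) -> str:
    # Chain each new op to the previous op in the same author's log.
    prev = op_id(log[-1]) if log else None
    op = {"prev": prev, "body": body}
    log.append(op)
    return op_id(op)


def verify_chain(log: list) -> bool:
    # Recompute every id and check that each op points at its predecessor.
    expected_prev = None
    for op in log:
        if op["prev"] != expected_prev:
            return False
        expected_prev = op_id(op)
    return True


def invalidated_by(refuted: str, derived_from: dict) -> set:
    # Breadth-first walk from the refuted claim to all transitive descendants.
    children = defaultdict(set)
    for claim, parents in derived_from.items():
        for parent in parents:
            children[parent].add(claim)
    seen, queue = set(), deque([refuted])
    while queue:
        for child in children[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical single-author log: one evidence op and two chained claims.
log = []
e1 = append_op(log, {"type": "evidence", "source": "calendar"})
c1 = append_op(log, {"type": "claim", "predicate": "trains_for_marathon",
                     "derived_from": [e1]})
c2 = append_op(log, {"type": "claim", "predicate": "runs_weekly",
                     "derived_from": [c1]})

derived_from = {op_id(op): op["body"].get("derived_from", []) for op in log}

assert verify_chain(log)
assert invalidated_by(c1, derived_from) == {c2}  # refuting c1 invalidates c2
```

Because each op's identifier covers its `prev` field, editing any earlier op changes its hash and breaks the link from its successor, which is what `verify_chain` detects. A real node would also verify the signature on every op before accepting it.
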

## Repository layout

```
fablepool/            Library: canonicalization, keys, ops, store, claims, derivation,
                      sync, capabilities, transports, node, CLI
fablepool/adapters/   Import adapters: calendar (.ics), notes (.md), photo metadata (.json)
demo/                 End-to-end scripted three-node demo
datasets/             Sample evidence datasets used by tests and the demo
conformance/          Milestone-2 test vectors and a runner any implementation can use
tests/                Test suite (canonicalization, ops, adapters, cascade, sync convergence,
                      capability revocation, conformance)
docs/                 Demo walkthrough and milestone mapping
INTEROP.md            Guide for building a second, interoperable implementation
```

## Interoperability

FablePool is designed so a second implementation can interoperate. The **conformance vectors in `conformance/vectors/` are the normative ground truth** for byte-level behavior; [`INTEROP.md`](INTEROP.md) explains the data model, verification rules, sync protocol, and capability semantics, and how to run the vectors against your own code.

## Non-goals

Stated up front so nobody mistakes this for what it is not:

1. **Not a chatbot or an LLM.** The derivation engine is deterministic rules over evidence. A personal AI would sit *on top of* this substrate; the substrate's job is provenance, consent, and portability.
2. **Not a hardened network stack.** The reference transport is local (in-process / filesystem). The protocol is transport-agnostic; production transports (TLS, Noise, relay servers) are out of scope here.
3. **Not key-management UX.** Demo keys are generated on disk, unencrypted. Real deployments need secure enclaves, passphrase wrapping, and recovery, all deliberately out of scope for the reference implementation.
4. **Not data deletion from adversaries.** Revocation is *mechanical, not magical*: a conforming delegate verifiably stops receiving and serving the scope, and the user's nodes stop answering it. A malicious delegate that already copied claims cannot be forced to forget them.
   The protocol makes compliance auditable; it does not make defection impossible.
5. **Not differential privacy or anonymization.** Claim shapes themselves can leak information (see caveats below). Statistical privacy for shared slices is future work.
6. **Not a scalable graph database.** The store is a single-file, append-only log with in-memory indexes: correct and auditable, not tuned for millions of ops.
7. **Not production-audited cryptography.** Primitives are standard (Ed25519, SHA-256) via the `cryptography` library, but the composition has not been externally audited.

## Security caveats

- **Local node compromise is game over.** Anyone with the user's signing key is the user, protocol-wise. Protect the key.
- **Demo keys are plaintext on disk.** Fine for the demo; do not reuse them.
- **Revocation lag.** A delegate is only provably revoked once it has synced past the revocation op and returned a signed acknowledgment. Between revoke and acknowledgment, the user's nodes already refuse the capability, but the delegate may still hold previously delivered claims.
- **Metadata leakage in shared slices.** Even evidence-free claims reveal predicates, timestamps, and confidence values. Scope grants narrowly.
- **Timestamps are claims, not proofs.** Op timestamps are author-asserted; ordering guarantees come from hash chains and deterministic tie-breaking, not wall clocks.
- **Adapters trust their inputs.** A malicious `.ics` or markdown file can inject misleading evidence. Evidence provenance makes this auditable after the fact; it does not prevent it.

## License

MIT. See [`LICENSE`](LICENSE).