# PMP: Personal Memory Protocol (Reference Node)

PMP is an open protocol and reference implementation for a **user-owned personal AI memory layer**: a local-first, append-only, cryptographically signed operation log into which a person's own devices ingest evidence (calendars, notes, photo metadata, …), derive claims with provenance, and from which the user can inspect, correct, export, synchronize, delegate, and revoke.

This repository currently contains the deliverables of milestones 1–3:

| Milestone | Deliverable | Where |
|---|---|---|
| 1 | Architecture, threat model, non-goals | `docs/` (published in milestone 1) |
| 2 | Wire format and signed operation log schema | `docs/` + `src/pmp/operations.py`, `src/pmp/canonical.py` |
| 3 | **Reference local node with import adapters** (this milestone) | `src/pmp/`, `samples/`, `tests/`, `docs/setup.md`, `docs/adapter-authoring.md`, `docs/log-inspection.md` |

## What the reference node does today

* **Append-only signed operation log** stored locally (SQLite + content-addressed operations). Every operation is canonically serialized, hashed, and signed with the node's Ed25519 key; the log forms a hash chain per author and is verified on read.
* **Key management**: per-node Ed25519 identity keys generated on `init`, stored under the node directory with restrictive file permissions, never transmitted.
* **Import adapters** that turn raw sources into provenance-tagged `evidence` operations:
    * **Calendar**: RFC 5545 ICS files (events, recurrence text, attendees, locations).
    * **Notes**: Markdown / plain-text files (front matter, headings, tags, dates).
    * **Photos**: mock EXIF metadata (JSON fixtures standing in for camera EXIF: timestamps, GPS, device model), so the pipeline is exercised end-to-end without binary image parsing.
* **Idempotent ingestion**: re-importing the same source content does not create duplicate evidence (content-derived external IDs + dedup on import).
* **CLI** for initializing a node, importing sources, listing/inspecting/verifying the log, and exporting operations as canonical JSON lines.
* **Synthetic sample data** for a fictional user ("Avery Reyes") under `samples/`, plus a scripted demo (`scripts/demo.sh`).

Out of scope for this milestone (coming in later milestones): the derivation engine, refutation/correction cascade, sync between nodes, and capability-based delegation. See the published non-goals document for permanent non-goals.

## Quick start

```bash
# Python 3.10+ required
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# Initialize a node (creates keys + empty log under ./avery-node)
pmp init --node-dir ./avery-node

# Import the sample datasets
pmp import calendar samples/calendar/avery-personal.ics --node-dir ./avery-node
pmp import calendar samples/calendar/avery-work.ics --node-dir ./avery-node
pmp import notes samples/notes --node-dir ./avery-node
pmp import photos samples/photos/avery-photos.json --node-dir ./avery-node

# Inspect and verify
pmp log --node-dir ./avery-node
pmp verify --node-dir ./avery-node
```

Or run the whole thing in one shot:

```bash
bash scripts/demo.sh
```

Full instructions, including troubleshooting and a tour of every CLI command, are in [`docs/setup.md`](docs/setup.md).

## Documentation

* [`docs/setup.md`](docs/setup.md): installation, node layout on disk, CLI tour, running the tests, troubleshooting.
* [`docs/adapter-authoring.md`](docs/adapter-authoring.md): how to write a new import adapter, the adapter contract, provenance requirements, idempotency rules, and testing guidance.
* [`docs/log-inspection.md`](docs/log-inspection.md): anatomy of an operation, how signing and hash chaining work, how to inspect and verify the log from the CLI and from Python, and how to audit provenance back to raw source bytes.
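
As a rough illustration of the per-author hash chain and verify-on-read behavior described above, here is a minimal sketch. It is not the API of `src/pmp/oplog.py`: the field names (`author`, `prev`, `payload`, `id`), the helper names, and the use of sorted-key compact JSON with SHA-256 as the canonical form are all assumptions made for illustration, and Ed25519 signing is omitted to keep the example dependency-free.

```python
import hashlib
import json


def canonical_bytes(obj):
    # Assumption: sorted keys + compact separators stand in for the real
    # canonical serialization defined by the wire-format spec.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def op_hash(body):
    return hashlib.sha256(canonical_bytes(body)).hexdigest()


def append_op(log, author, payload):
    """Append an operation linked to the author's previous operation."""
    # Hypothetical layout: "prev" is the id of this author's latest operation.
    prev = next((op["id"] for op in reversed(log) if op["author"] == author), None)
    body = {"author": author, "prev": prev, "payload": payload}
    op = {**body, "id": op_hash(body)}  # content-addressed id
    log.append(op)
    return op


def verify_chain(log):
    """Recompute every id and check each author's chain links in order."""
    last = {}  # author -> id of that author's latest verified operation
    for op in log:
        body = {k: v for k, v in op.items() if k != "id"}
        if op["id"] != op_hash(body):
            return False  # content no longer matches its address
        if op["prev"] != last.get(op["author"]):
            return False  # chain link is missing or out of order
        last[op["author"]] = op["id"]
    return True


log = []
append_op(log, "node-a", {"kind": "evidence", "n": 1})
append_op(log, "node-a", {"kind": "evidence", "n": 2})
ok_before = verify_chain(log)

log[0]["payload"]["n"] = 99  # simulate tampering with stored content
ok_after = verify_chain(log)
```

In this sketch `ok_before` is `True` and `ok_after` is `False`, because the tampered operation no longer hashes to its stored `id`. A real node would additionally check the Ed25519 signature on each operation.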
## Repository layout

```
src/pmp/
  canonical.py         # canonical JSON serialization + hashing (wire format, M2)
  keys.py              # Ed25519 key generation, storage, signing, verification
  operations.py        # operation model: build, sign, validate, (de)serialize
  oplog.py             # append-only SQLite-backed log with chain verification
  node.py              # node = keys + log + adapter orchestration + dedup
  adapters/            # base contract + ics_calendar, notes, photos adapters
  cli.py, __main__.py  # command-line interface
samples/               # synthetic datasets for the fictional user Avery Reyes
scripts/demo.sh        # end-to-end demo: init, import everything, inspect, verify
tests/                 # unit + integration tests (pytest)
docs/                  # developer documentation for this milestone
```

## Design principles (carried through every milestone)

1. **Local-first.** A node is a directory on a device the user controls. No external services are required to run the reference node.
2. **Everything is a signed operation.** Evidence, and later claims, corrections, and grants, are append-only signed operations; nothing is ever edited or deleted in place.
3. **Provenance is mandatory.** Every evidence operation records the adapter, adapter version, source identity, source content hash, and import time, so any derived claim can be traced to raw bytes.
4. **Auditable and portable.** Operations are canonical JSON; export is a JSONL stream another implementation can verify and ingest with nothing but the published wire-format spec.

## Tests

```bash
pip install -e ".[dev]"
pytest
```

The test suite covers canonical serialization vectors, key handling, operation signing/verification, log append/verify/tamper-detection, every adapter against the sample datasets, idempotent re-import at the node level, and CLI behavior.

## License & contributing

Open source under the license declared in `pyproject.toml`.
Contributions that keep the reference node small, dependency-light, and faithful to the published wire format are welcome — start with `docs/adapter-authoring.md` if you want to add a source type.
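
For anyone considering a new source type, the sketch below shows one way the idempotency and provenance requirements could fit together. It is an assumption-laden illustration, not the adapter contract from `docs/adapter-authoring.md`: the function names, the ID scheme (SHA-256 over adapter name, source identity, and raw bytes), and the record field names are all hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def external_id(adapter, source_id, content):
    # Hypothetical ID scheme: hash the adapter name, source identity, and raw
    # bytes. Length prefixes keep the concatenation unambiguous.
    h = hashlib.sha256()
    for part in (adapter.encode("utf-8"), source_id.encode("utf-8"), content):
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


def import_source(seen, adapter, version, source_id, content):
    """Return an evidence record, or None if this exact content was already imported."""
    ext_id = external_id(adapter, source_id, content)
    if ext_id in seen:
        return None  # dedup: same adapter, source, and bytes seen before
    seen.add(ext_id)
    return {
        "external_id": ext_id,
        # The five provenance fields listed under "Provenance is mandatory";
        # the key names here are illustrative only.
        "provenance": {
            "adapter": adapter,
            "adapter_version": version,
            "source_id": source_id,
            "source_sha256": hashlib.sha256(content).hexdigest(),
            "imported_at": datetime.now(timezone.utc).isoformat(),
        },
    }


seen = set()
first = import_source(seen, "notes", "0.1", "notes/trip.md", b"# Trip\n")
again = import_source(seen, "notes", "0.1", "notes/trip.md", b"# Trip\n")
changed = import_source(seen, "notes", "0.1", "notes/trip.md", b"# Trip v2\n")
```

Here the second import of identical bytes yields `None`, while the edited file produces a new record. Because the ID is derived from content and excludes the import time, re-running an import is safe.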