# OMP Wire Format and Signed Operation Log — Overview

**Spec:** OMP (Open Memory Protocol) — Milestone 2
**Version:** `omp/0.2`
**Status:** Draft for implementation
**Audience:** Implementers building an interoperable OMP node from this document alone.

---

## 1. Purpose

This specification defines the **wire format** and **signed, append-only operation log** that
every OMP node speaks. It is deliberately implementation-agnostic: it defines bytes, hashes,
signatures, validation rules, and merge semantics — not storage engines, transports, or UIs.

Everything a personal-memory node does is expressed as one of **seven operation types**,
each a signed record in an append-only, multi-writer log owned by a single human identity:

| Type | Purpose |
|---|---|
| `evidence-ingest` | Register a piece of raw evidence (calendar entry, note, photo metadata, …) by content hash. |
| `claim-assert` | Assert a derived claim about the subject, with provenance and confidence. |
| `correction` | Supersede an existing claim, optionally pointing at a replacement claim. |
| `refutation` | Mark a claim or piece of evidence as false/invalid, with cascading effects. |
| `permission-grant` | Grant a capability (read / author / delegate / infer) over a scoped slice of the graph. |
| `revocation` | Mechanically revoke a previously issued grant. |
| `inference-call` | Record that an inference (model call) was made over specified inputs. |
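
As a non-normative illustration, the seven type names in the table can be held in a closed enumeration so that unknown names are rejected early. This is a minimal sketch: the class name `OperationType` and the helper `parse_operation_type` are hypothetical, and whether these strings are the exact wire values of a type field is defined in `02-operations.md`, not here.

```python
from enum import Enum


class OperationType(str, Enum):
    """The seven OMP operation types, named as in the table above."""

    EVIDENCE_INGEST = "evidence-ingest"
    CLAIM_ASSERT = "claim-assert"
    CORRECTION = "correction"
    REFUTATION = "refutation"
    PERMISSION_GRANT = "permission-grant"
    REVOCATION = "revocation"
    INFERENCE_CALL = "inference-call"


def parse_operation_type(name: str) -> OperationType:
    """Map a type name to its enum member; unknown names raise ValueError."""
    return OperationType(name)
```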

The log is the **single source of truth**. Claims, permissions, device authorizations, and
the derivation graph are all *derived state* computed deterministically from the log by the
rules in this specification. Two conforming implementations that ingest the same set of
operations MUST compute identical derived state.
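
The requirement above amounts to derived state being a pure function of the operation *set*, independent of arrival order. The toy sketch below illustrates that property only. The `Op` record, the status names `active` and `refuted`, and sorting by identifier are all assumptions made for illustration; the actual total order and state machines are defined in `03-log-and-merge.md`.

```python
from typing import Dict, Iterable, NamedTuple


class Op(NamedTuple):
    op_id: str   # hypothetical: stands in for the operation's hash identifier
    kind: str    # one of the seven operation type names
    target: str  # hypothetical: the claim this operation is about


def derive_claim_status(ops: Iterable[Op]) -> Dict[str, str]:
    """Fold a set of operations into claim statuses, ignoring arrival order."""
    # Set semantics: the same operation received twice counts once.
    unique = {op.op_id: op for op in ops}
    status: Dict[str, str] = {}
    # Replay in a canonical order (assumption: sort by id as a stand-in
    # for the total order the specification defines elsewhere).
    for op_id in sorted(unique):
        op = unique[op_id]
        if op.kind == "claim-assert":
            status.setdefault(op.target, "active")
        elif op.kind == "refutation":
            status[op.target] = "refuted"
    return status


ops = [
    Op("a1", "claim-assert", "claim-1"),
    Op("b2", "refutation", "claim-1"),
    Op("c3", "claim-assert", "claim-2"),
]
# Two nodes that received the same operations in different orders agree.
assert derive_claim_status(ops) == derive_claim_status(reversed(ops))
```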

## 2. Document map

| File | Contents |
|---|---|
| `00-overview.md` | This document: scope, conventions, conformance classes. |
| `01-encoding-and-identifiers.md` | Canonical serialization, identifiers, hashing, signatures, size limits. |
| `02-operations.md` | The operation envelope and all seven operation bodies, field by field. |
| `03-log-and-merge.md` | Append-only log structure, hash chaining, the operation DAG, validation pipeline, total order, conflict/merge semantics, claim and grant state machines. |
| `04-keys-and-capabilities.md` | Key and identity model, device authorization, capability semantics, revocation semantics. |
| `05-versioning-errors-extensibility.md` | Version negotiation, the error-code registry, extension rules. |
| `06-worked-examples.md` | Worked byte-level examples, end to end. |
| `07-interop-checklist.md` | Numbered MUST/SHOULD checklist mapping to conformance vectors. |

Companion artifacts in this repository:

- `schemas/` — JSON Schema (draft 2020-12) definitions for every operation.
- `conformance/` — test-vector suite (valid, invalid, and defer cases) plus a reference
  verifier and a deterministic vector generator.

## 3. Conventions

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**,
**SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** are to be interpreted as
described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.

- All byte sequences are described in network (big-endian) byte order: the most
  significant byte comes first.
- `hex(x)` means lowercase hexadecimal encoding of byte string `x`, two characters per byte.
- `sha256(x)` means the SHA-256 digest (FIPS 180-4) of byte string `x`, 32 bytes.
- `utf8(s)` means the UTF-8 encoding of string `s`, without BOM.
- `||` denotes byte-string concatenation.
- JSON terminology follows RFC 8259.
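
For readers who prefer code, the notation above maps onto the Python standard library roughly as follows. This is an illustrative sketch, not part of the specification; the function names `hex_lower`, `concat`, and `u16_network_order` are hypothetical, and the 16-bit width is chosen only to demonstrate byte order.

```python
import hashlib


def hex_lower(x: bytes) -> str:
    """hex(x): lowercase hexadecimal, two characters per byte."""
    return x.hex()


def sha256(x: bytes) -> bytes:
    """sha256(x): the 32-byte SHA-256 digest of x."""
    return hashlib.sha256(x).digest()


def utf8(s: str) -> bytes:
    """utf8(s): UTF-8 encoding of s; Python's codec does not emit a BOM."""
    return s.encode("utf-8")


def concat(*parts: bytes) -> bytes:
    """a || b || ...: byte-string concatenation."""
    return b"".join(parts)


def u16_network_order(n: int) -> bytes:
    """Encode a 16-bit unsigned integer most significant byte first."""
    return n.to_bytes(2, "big")
```

With these helpers, an expression such as `hex(sha256(utf8(a) || utf8(b)))` reads as `hex_lower(sha256(concat(utf8(a), utf8(b))))`.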

## 4. Conformance classes

This spec defines three conformance classes. The conformance suite labels each vector with
the classes it applies to.

1. **Producer** — software that creates and signs operations. A producer MUST emit
   operations that every conforming verifier accepts.
2. **Verifier** — software that validates individual operations and operation streams:
   canonical-form check, schema check, signature check, chain check, authorization check.
3. **Replicator** — a verifier that additionally merges multi-writer logs and computes
   derived state (claim statuses, grant statuses, derivation-graph taint) per
   `03-log-and-merge.md`.

A full OMP node implements all three classes. A minimal interoperable consumer can
implement only the Verifier class.
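
One way a test harness might select the vectors relevant to an implementation is sketched below. It is an assumption-laden illustration: the names `ConformanceClass`, `effective_classes`, and `vector_applies` are hypothetical, and the way the conformance suite actually encodes its class labels is not defined in this document. The only rule taken from the text is that a Replicator is also a Verifier.

```python
from enum import Flag, auto


class ConformanceClass(Flag):
    PRODUCER = auto()
    VERIFIER = auto()
    REPLICATOR = auto()


def effective_classes(implemented: ConformanceClass) -> ConformanceClass:
    """A Replicator is a Verifier with extra duties, so it implies VERIFIER."""
    if ConformanceClass.REPLICATOR in implemented:
        return implemented | ConformanceClass.VERIFIER
    return implemented


def vector_applies(vector_labels: ConformanceClass,
                   implemented: ConformanceClass) -> bool:
    """True if the vector is labeled with any class the implementation covers."""
    return bool(vector_labels & effective_classes(implemented))


# A full node covers all three classes.
FULL_NODE = (ConformanceClass.PRODUCER
             | ConformanceClass.VERIFIER
             | ConformanceClass.REPLICATOR)
```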

## 5. Design rules carried over from Milestone 1 (architecture & threat model)

These constraints from the accepted architecture are normative here:

1. **Append-only.** Operations are never edited or deleted. Corrections, refutations,
   and revocations are new operations. (Evidence *content* lives outside the log and is
   referenced by hash, so content can be destroyed without breaking the log — see
   `02-operations.md` §2.)
2. **Provenance is mandatory.** Every claim carries machine-readable derivation references;
   every inference over user data is itself logged.
3. **Deterministic merge.** Multi-node sync MUST converge without coordination; all
   conflict resolution is a pure function of the operation set.
4. **Capability-based sharing.** Access is granted by explicit, revocable, scoped grants;
   possession of log data never implies authority over it.
5. **No floats on the wire.** Canonical serialization admits only integers (confidence is
   expressed in parts-per-million), eliminating an entire class of cross-platform
   canonicalization bugs.
6. **Hash-addressed everything.** Operations are identified by the hash of their signed
   bytes' preimage; evidence is identified by the hash of its raw bytes.
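
Rules 5 and 6 can be illustrated with a short sketch. Everything named here is hypothetical: `confidence_ppm`, `contains_float`, and `evidence_digest_hex` are illustrative helpers, and rounding down when converting a ratio to parts-per-million is an assumption, since this overview does not state a rounding rule. The digest shown is only the raw SHA-256; the full algorithm-prefixed identifier format is defined in `01-encoding-and-identifiers.md`.

```python
import hashlib
from typing import Any

PPM = 1_000_000  # parts-per-million scale for integer confidence values


def confidence_ppm(numerator: int, denominator: int) -> int:
    """Express the ratio numerator/denominator as an integer in ppm.

    Pure integer arithmetic, so no float ever exists. Rounds down
    (assumption: the rounding rule is not given in this overview).
    """
    if denominator <= 0 or not 0 <= numerator <= denominator:
        raise ValueError("confidence must be a ratio between 0 and 1")
    return (numerator * PPM) // denominator


def contains_float(value: Any) -> bool:
    """Recursively check a JSON-like value for floats before serializing."""
    if isinstance(value, float):
        return True
    if isinstance(value, dict):
        return any(contains_float(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(contains_float(v) for v in value)
    return False


def evidence_digest_hex(raw: bytes) -> str:
    """Lowercase hex SHA-256 of the raw evidence bytes (prefix not shown)."""
    return hashlib.sha256(raw).hexdigest()
```

For example, a confidence of 17/20 becomes `confidence_ppm(17, 20)`, which is `850000`.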

## 6. Non-goals of this milestone

The following are explicitly out of scope for this milestone (covered by later
milestones, or not at all):

- Transport protocols and sync session handshakes (Milestone 6 defines the sync demo;
  this document defines only what makes two logs *mergeable*).
- Storage formats, indexes, blob-store layout.
- The derivation engine's rules for *producing* claims (Milestone 4). This document only
  defines how a claim and its provenance are *represented*.
- Projection/redaction wire format for sharing claim slices without evidence (the
  capability *semantics* are defined here; the projected-bundle format ships with the
  sync milestone).
- Encryption-at-rest and transport encryption (the Milestone 1 threat model requires
  both; neither changes the byte format defined here).
- Post-quantum signatures. The identifier scheme is algorithm-prefixed precisely so a
  future suite can be added without a breaking change.