# Built-in Deriver Reference

This document specifies the four reference derivers shipped in
`mnema/derive/derivers/`: **routines**, **places**, **relationships**, and
**preferences**. For each: what it consumes, what claims it emits, how the
identity key is constructed (normative for interop — see
`docs/derivation-model.md` §9.2), the heuristic, the confidence factors with
their priors, and known failure modes.

Shared conventions (from `derivers/base.py`):

* Every deriver is a pure function of its declared input record types; it
  receives only **valid** records (invalidated/refuted/retracted inputs are
  filtered by the engine before invocation).
* Every emitted candidate carries: `claim_type`, `identity_key`, `payload`,
  `inputs` (with roles), `rationale` (one to three sentences, populated from
  the actual observed quantities), and a `ConfidenceEstimate` (prior +
  factors) that the engine combines via `mnema.derive.confidence`.
* All string components of identity keys are NFC-normalized, lowercased,
  with internal whitespace collapsed to single spaces, and joined with `|`.
  Numeric components use the shortest round-trip decimal representation.
* All timestamps are handled in the user's home timezone (a node-level
  setting; the sample corpus uses `Europe/Berlin`). Day-of-week and
  time-of-day bucketing happen *after* timezone conversion.

Time-of-day buckets (shared): `night` 00:00–05:59, `morning` 06:00–11:59,
`afternoon` 12:00–17:59, `evening` 18:00–23:59 (start-time of the record
decides the bucket).

---

## 1. Routines (`routines.py`)

**Consumes:** calendar events (evidence). Pass-1 deriver (evidence only).

**Emits:** `routine.weekly`

A weekly routine is a cluster of calendar events sharing a normalized title
stem and time-of-day bucket, recurring on a stable subset of weekdays.

### Heuristic

1. Group events by *title stem*: lowercase, strip punctuation, drop trailing
   numerals/dates, collapse whitespace (so "Gym 💪", "gym", "Gym (w/ Sam)"
   stem differently only where the parenthetical adds words — "gym" vs
   "gym w sam"; partial-stem merging is deliberately *not* attempted).
2. Within a group, bucket by time-of-day; within a bucket, collect the set
   of weekdays with ≥ `MIN_PER_DAY` (default 2) occurrences.
3. Candidate fires when total occurrences ≥ `MIN_OCCURRENCES` (default 4)
   across ≥ `MIN_SPAN_WEEKS` (default 3) distinct ISO weeks.
4. **Regularity** = `1 − min(1, CV)` where CV is the coefficient of
   variation of inter-occurrence gaps per weekday.
5. **Recency** = `e^{−Δ/28d}` where Δ is days since the latest occurrence.

### Identity key

```
routine.weekly|<title_stem>|<sorted 3-letter weekdays, comma-joined>|<bucket>
e.g.  routine.weekly|gym|thu,tue|morning
```

Weekday granularity in the key means refuting "Tue/Thu gym" does not block
a later "Sat gym" claim — matching how users reason about routines.

### Payload

`{title_stem, weekdays, bucket, typical_start_local (median, "HH:MM"),
typical_duration_min (median), occurrences, first_seen, last_seen}`

### Confidence

| | |
|---|---|
| **Prior** | 0.30 |
| `occurrence_count` | saturating, `w_max=1.6, k=0.25`, over occurrences beyond the minimum |
| `regularity` | linear in regularity score, `w_max=1.2` |
| `recency` | linear in recency score, `w_max=0.5`; a routine unobserved for 8+ weeks decays toward its prior |
| `span_weeks` | saturating, `w_max=0.6, k=0.4`, over weeks beyond the minimum |

### Failure modes

One-off event series with routine-like titles ("Dentist") can clear the
thresholds at low confidence; that is intended — they surface as
*speculative* and are cheap to refute. Title-stem grouping will split
routines whose titles drift ("Gym" → "Strength training"); merging across
stems is a non-goal here.

---

## 2. Places (`places.py`)

**Consumes:** photo metadata (evidence) and calendar events with location
fields (evidence). Pass-1 deriver.

**Emits:** `place.frequent`, `place.home_candidate`, `place.work_candidate`

### Heuristic

1. Extract geo points: photo EXIF lat/lon; calendar locations resolve via
   the sample corpus's offline location table in the loader (no network
   geocoding — non-goal).
2. Grid-cluster at ~150 m resolution (3 decimal places of lat/lon as the
   cell key), then merge adjacent occupied cells into a cluster whose
   centroid and label are recomputed.
3. `place.frequent` fires for clusters with ≥ `MIN_VISITS` (default 5)
   distinct *days* of presence. The label is the modal calendar location
   string for the cluster if any, else `unlabeled`.
4. `place.home_candidate`: the cluster with the highest share of
   night/early-morning and weekend presence, if that share ≥ 0.5.
5. `place.work_candidate`: highest share of weekday 09:00–17:00 presence
   among non-home clusters, if that share ≥ 0.4.

Home/work are deliberately suffixed `_candidate`: the deriver proposes, the
user confirms via the milestone #5 surface (a confirmation is a correction
op pinning the payload at confidence 1.0).

### Identity key

```
place.frequent|<lat_centroid_3dp>,<lon_centroid_3dp>
place.home_candidate|self
place.work_candidate|self
```

Frequent places key on the rounded centroid (stable under small drift,
deterministic). Home/work candidates are singletons per subject: a new best
candidate *supersedes* the old claim rather than coexisting, and refuting
"home candidate" suppresses the singleton until withdrawn.

### Payload

`{label, centroid: {lat, lon}, visit_days, sources: {photos: n, calendar: n},
day_part_histogram, first_seen, last_seen}`

### Confidence

| | |
|---|---|
| **Prior** | `place.frequent` 0.35; `home_candidate` 0.25; `work_candidate` 0.25 |
| `visit_days` | saturating, `w_max=1.5, k=0.2` |
| `multi_source` | +0.7 log-odds when both photo and calendar evidence support the cluster (cross-source corroboration) |
| `day_part_fit` | (home/work only) linear in the presence-share score, `w_max=1.4` |
| `recency` | as routines, half-life 28 days |

### Failure modes

Grid clustering splits places straddling a cell boundary (mitigated by
adjacent-cell merge, not eliminated). Vacation rentals can transiently win
`home_candidate`; the recency-vs-history balance keeps such flips at
*probable* rather than *strong*, and the singleton supersede mechanism means
the correction history stays legible.

---

## 3. Relationships (`relationships.py`)

**Consumes:** calendar events (attendees), notes (person mentions via the
loader's contact-matching), photo metadata (people tags). Pass-1 deriver.

**Emits:** `relationship.frequent_contact`, `relationship.inferred_role`

### Heuristic

1. Resolve person references to a canonical person key: the loader matches
   attendee emails, note `@mentions` and exact contact names, and photo
   people-tags against the contacts table; unresolved references get a
   normalized-name key prefixed `~` (e.g. `~sam riley`) and never merge with
   resolved contacts.
2. `frequent_contact` fires for persons with ≥ `MIN_INTERACTIONS` (default
   4) interaction records across ≥ 2 distinct ISO weeks. Interaction = a
   shared calendar event, a note mention, or a shared photo.
3. `inferred_role` proposes a coarse role from context-word scoring over
   interaction records: `colleague` (weekday-business-hours meetings,
   work-domain emails, words like "standup", "1:1", "review"), `friend`
   (evening/weekend events, words like "dinner", "birthday", shared
   non-work photos), `family` (words like "mom", "dad", "sister", shared
   surname with the user where contacts provide it), `provider` ("dr.",
   "dentist", "appointment", clinic-style locations). Highest score wins if
   it exceeds the runner-up by margin ≥ 0.2 of total; otherwise no role
   claim is emitted (ambiguity is not guessed at).

### Identity key

```
relationship.frequent_contact|<person_key>
relationship.inferred_role|<person_key>     # role lives in payload, not the key
```

Role is payload, not key, so a role *change* (colleague → friend) is a
superseding claim with visible history, while a role *refutation* ("this
person is not my colleague — stop guessing") blocks all future role guesses
for that person until withdrawn. This matches the granularity at which
users object.

### Payload

`frequent_contact`: `{person_key, display_name, interactions, channels:
{calendar, notes, photos}, first_seen, last_seen}` —
`inferred_role`: adds `{role, role_scores, top_signals: [up to 5 strings]}`.

### Confidence

| | |
|---|---|
| **Prior** | `frequent_contact` 0.40; `inferred_role` 0.25 |
| `interaction_count` | saturating, `w_max=1.5, k=0.3` |
| `channel_diversity` | +0.45 log-odds per additional channel beyond the first (max +0.9) |
| `role_margin` | (role only) linear in winner-minus-runner-up margin, `w_max=1.3` |
| `recency` | half-life 42 days (relationships decay slower than routines) |

### Failure modes

Name collisions in unresolved (`~`) references are not merged — two distinct
"Sam"s in notes with no contact entry become one `~sam` key; this is a known
precision/recall trade documented in the rationale string whenever any `~`
key is emitted. Role inference is the most heuristic component in the
system and is capped: its factor weights cannot push it past the *likely*
band without `channel_diversity`.

---

## 4. Preferences (`preferences.py`)

**Consumes:** notes (evidence), and — as a **pass-2** deriver — `routine.weekly`
and `place.frequent` claims. This is the layered-derivation exemplar: its
claims have other claims in their provenance chains, which is exactly what
the cascade tests exercise (refute a routine → its derived preference is
invalidated).

**Emits:** `preference.stated`, `preference.inferred`

### Heuristic

1. `preference.stated`: pattern-matching over note sentences using a small
   grammar of first-person preference constructions ("I love/like/prefer/
   hate/can't stand/always order X", "no X for me", "favorite X is Y"). The
   matched object is normalized to a topic key; polarity is `positive` or
   `negative`. Negation handling covers the grammar's own forms only —
   free-text sarcasm is a documented non-goal.
2. `preference.inferred`: behavioral inference from upstream claims —
   a `routine.weekly` with an activity-class stem (gym, run, yoga, swim,
   climbing, …, per a small built-in lexicon) at ≥ *probable* confidence
   yields `preference.inferred|activity:<class>` positive; a
   `place.frequent` whose label matches a venue lexicon entry (cafe,
   climbing gym, library, …) with ≥ 8 visit days yields a venue-category
   preference.

### Identity key

```
preference.stated|<topic_key>|<positive|negative>
preference.inferred|<topic_key>|positive
```

Polarity is in the key for stated preferences: "I love cilantro" and "I hate
cilantro" are *different claims* that can coexist as contradictory evidence;
the explanation surface shows both, and resolving the contradiction is the
user's call (a correction), not the deriver's.

### Payload

`stated`: `{topic, polarity, quotes: [verbatim matched sentences, ≤3],
note_count}` — `inferred`: `{topic, basis: "routine"|"place",
upstream_summary}`.

### Confidence

| | |
|---|---|
| **Prior** | `stated` 0.55 (the user said it themselves); `inferred` 0.20 |
| `restatement_count` | (stated) saturating over distinct notes, `w_max=1.2, k=0.5` |
| `statement_strength` | (stated) lexical strength of the matched verb: love/hate +0.8, like/dislike +0.3, prefer +0.4 |
| `upstream_confidence` | (inferred) `logit(p_upstream) × 0.6` — inferred preferences explicitly inherit *attenuated* upstream confidence, so a downgraded routine mechanically downgrades the preference on re-derivation |
| `recency` | half-life 90 days (preferences are sticky) |

### Failure modes

The stated-preference grammar is intentionally narrow (high precision, low
recall); it will miss most preferences expressed in free prose and that is
acceptable for a reference implementation. Inferred preferences are
floor-capped at *speculative*/*probable* by their low prior and attenuation —
the system should never claim *strong* knowledge of a preference the user
never stated.

---

## Adding a deriver

Third-party derivers subclass `derivers/base.py`'s `Deriver`, declare input
record/claim types and a pass number, use a reverse-DNS `claim_type` prefix,
document their identity-key construction, and ship golden-confidence
fixtures. The engine treats them identically to built-ins: same filtering,
same dedupe, same cascade semantics, same explanation rendering. The
normative requirements are in `docs/derivation-model.md` §9.