# Milestone 3 Acceptance Audit — Query Engine This document is a traceability audit for Milestone 3 (*Query Engine: vector search, BM25 full-text, filters, hybrid, planner*). It maps every funded deliverable in the milestone scope to the concrete files and test suites that implement it inside `crates/shoal-query`, so reviewers can verify coverage without reverse-engineering the crate layout. All code referenced below has already been delivered in this milestone; this document adds the acceptance mapping only. ## 1. Scope-to-deliverable traceability | # | Funded deliverable | Implementation | Tests | |---|--------------------|----------------|-------| | 1 | Exact kNN baseline with cosine / dot / Euclidean | `src/vector/math.rs` (distance kernels, normalization), `src/vector/exact.rs` (brute-force top-k scan over candidate doc sets) | inline unit tests in `vector/math.rs` and `vector/exact.rs`; exact path is also the ground-truth oracle in `tests/vector_recall.rs` | | 2 | IVF/centroid-style ANN designed for object-storage-backed retrieval | `src/vector/kmeans.rs` (Lloyd's k-means++ training), `src/vector/ivf.rs` (centroid table + per-centroid posting lists; centroids are a small manifest-resident block, each posting list is an independently addressable, independently fetchable unit so segments can be cached per-list), `src/codec.rs` (versioned, checksummed binary encoding for index blocks) | `tests/vector_recall.rs` | | 3 | Recall-vs-nprobe tuning | `nprobe` is a per-query runtime parameter on the IVF search path (`src/vector/ivf.rs`); `tests/vector_recall.rs` sweeps `nprobe` against the exact-kNN oracle and asserts recall@k is monotonically usable and crosses the target threshold | `tests/vector_recall.rs` | | 4 | Tokenizer | `src/text/tokenizer.rs` (Unicode-aware lowercasing tokenizer with configurable stopword handling) | inline unit tests; exercised end-to-end by `tests/bm25_correctness.rs` | | 5 | Persistent inverted index | `src/text/inverted.rs` (in-memory build + postings with term/doc frequencies and field-level stats) serialized through `src/codec.rs` into segment-storable byte blocks | `tests/bm25_correctness.rs` | | 6 | BM25 ranking with field boosting | `src/text/bm25.rs` (BM25 with configurable `k1`/`b`, per-field boosts combined at score time) | `tests/bm25_correctness.rs` (hand-computed BM25 scores, IDF edge cases, boost ordering) | | 7 | Optional sparse-vector scoring path | `src/sparse.rs` (sparse dot-product scoring over (index, weight) pairs) | inline unit tests in `src/sparse.rs` | | 8 | Metadata filter engine: Eq/NotEq/Gt/Gte/Lt/Lte/In/ContainsAny + And/Or/Not | `src/filter/mod.rs` (filter AST + evaluation), `src/filter/docset.rs` (doc-set algebra: union/intersect/negate over sorted ID sets) | `tests/filter_algebra.rs` (operator semantics, boolean algebra laws, De Morgan checks) | | 9 | Attribute/filter indexes for fast filtered search | `src/filter/index.rs` (per-attribute value→docset indexes used to resolve predicates without scanning) | `tests/filter_algebra.rs` (index-resolved results equal full-scan results) | | 10 | Hybrid search: weighted score fusion + reciprocal rank fusion | `src/fusion.rs` (score normalization, weighted fusion, RRF with configurable `k`), `src/plan/fuse.rs` (fusion stage wiring in execution) | inline unit tests in `src/fusion.rs`; end-to-end in `tests/planner_executor.rs` | | 11 | Multi-query requests | `src/plan/request.rs` (request model supports multiple sub-queries per call), `src/plan/executor.rs` (per-sub-query execution + fusion) | `tests/planner_executor.rs` | | 12 | Selected-attribute projection and top_k | `src/topk.rs` (bounded max-heap top-k collector), `src/wire.rs` + `src/plan/executor.rs` (attribute projection of result rows) | inline unit tests in `src/topk.rs`; projection assertions in `tests/planner_executor.rs` | | 13 | Query planner choosing exact / ANN / full-text / filter-index / hybrid paths | `src/plan/planner.rs` (selectivity- and availability-driven plan selection: exact scan for small or highly filtered candidate sets, IVF for large unfiltered/lightly filtered vector queries, inverted-index path for text, hybrid plan when both rankers are requested), `src/plan/mod.rs` | `tests/planner_executor.rs` (asserts chosen plan per request shape and that executed results are correct) | | 14 | Unit tests: vector math, BM25 correctness, filter algebra, recall@k vs exact kNN | — | `tests/vector_recall.rs`, `tests/bm25_correctness.rs`, `tests/filter_algebra.rs`, `tests/planner_executor.rs`, plus inline `#[cfg(test)]` modules across `vector/`, `text/`, `filter/`, `fusion.rs`, `sparse.rs`, `topk.rs` | Shared infrastructure delivered this milestone: `src/error.rs` (query-engine error taxonomy), `src/types.rs` (document/attribute/score types shared across paths), `src/lib.rs` (public surface). ## 2. Object-storage orientation (design intent for acceptance) The IVF layout deliberately splits into two artifact classes so that the serving tier can stay stateless: - **Centroid block** — small (`nlist × dim × f32`), intended to live in (or alongside) the namespace manifest and be held in memory per warm namespace. - **Posting-list blocks** — one independently encoded, checksummed block per centroid via `src/codec.rs`. A query touching `nprobe` lists fetches at most `nprobe` blocks; each block is a natural unit for the disk/memory cache hierarchy delivered in the caching milestone. The inverted index and attribute indexes use the same codec framing so they are storable and re-fetchable as immutable segment files. ## 3. Explicitly out of scope for this milestone (per project plan) - Wiring `src/wire.rs` request/response types into the HTTP server (`shoal-api`) — API-server milestone. - MinIO-backed restart/reconstruction integration tests for index files — indexing/recovery milestone (the codec layer delivered here is what those tests will exercise). - Disk/memory cache implementation, warm/pin endpoints, cold-vs-warm benchmarks — caching milestone. ## 4. How to verify ```sh cargo test -p shoal-query # full unit + integration suite cargo test -p shoal-query --test vector_recall # recall@k: IVF vs exact oracle, nprobe sweep cargo test -p shoal-query --test bm25_correctness cargo test -p shoal-query --test filter_algebra cargo test -p shoal-query --test planner_executor ``` Reviewer checklist before sign-off on a clean machine: 1. Confirm the workspace root `Cargo.toml` includes `crates/shoal-query` in `[workspace] members` (a `crates/*` glob already covers it). 2. `cargo test -p shoal-query` passes on current stable Rust. 3. `tests/vector_recall.rs` reports recall@10 meeting its asserted threshold at the swept `nprobe` values.