# 02 — Object Pooling Strategy ## 1. Scope and philosophy Pools serve lifetime class **L2**: objects with dynamic, message-driven lifetimes that don't fit the epoch model — working orders, in-flight requests, timer entries, retry records. The design principle is **pools are pre-sized, never grow, never shrink, never allocate after seal**, and exhaustion is a *first-class designed behavior* (§6), not an afterthought. We do **not** use `Microsoft.Extensions.ObjectPool` or `ArrayPool.Shared` on the hot path: both may allocate on miss (`ArrayPool.Shared.Rent` allocates a new array when a bucket is empty; `DefaultObjectPool` allocates via its policy), neither gives generation-checked handles, and neither provides exhaustion telemetry to our contract. `ArrayPool` remains acceptable on cold paths. ## 2. Pool taxonomy | Pool kind | Element | Discipline | Used for | |-----------|---------|-----------|----------| | `StructPool` | `unmanaged` struct in a slab | Single-threaded (one owner thread) | Per-thread state records (e.g., strategy's order intents) | | `SharedStructPool` | `unmanaged` struct in a slab | Multi-thread acquire/release, lock-free freelist | Objects whose acquire and release happen on different pinned threads (e.g., `OrderState`: acquired by Strategy, released by TX after fill/cancel-ack) | | `BufferPool` | Fixed-size byte blocks in one slab | Single- or multi-threaded variants | Scratch encode buffers, snapshot staging | All three are thin variations on one mechanism: a pre-allocated slab + a freelist of indices + generation counters. There is deliberately no general-purpose `ClassPool`: pooling reference types reintroduces GC graph edges and accidental retention; if a design seems to need one, the type gets converted to an unmanaged struct or moved off the hot path. ## 3. `StructPool` specification (single-threaded) ```csharp namespace Fp.HotPath.Pooling; public sealed class StructPool where T : unmanaged { // Construction is pre-seal only (allocates the slab). Capacity is rounded up to a power of two. public StructPool(byte poolId, int capacity, string name); public int Capacity { get; } public int InUse { get; } // exact; single-threaded public int HighWaterMark { get; } // max InUse since start; exported to telemetry /// Acquire a free slot. O(1): pops the freelist head. Returns false when exhausted. /// The slot's memory is NOT zeroed on acquire (perf); callers must fully initialize /// (message/codec layer guarantees this; see doc 03 §6). ZeroOnRelease is a config option /// for pools holding sensitive data. public bool TryAcquire(out Handle handle); /// Resolve to a borrow. Checked per build flavor (doc 01 §5.3). public ref T Get(Handle handle); /// Return the slot. Bumps generation. Faults on double-release in checked builds. public void Release(Handle handle); /// Pre-seal only: touch every page, optionally run an init function per slot. public void Prefault(Action? initSlot = null); } ``` ### 3.1 Internal layout (normative) - Slab: `T[] _slab = GC.AllocateArray(capacity, pinned: true);` - Generations: `uint[] _gen` (same length, also pinned at startup — pinning irrelevant here but keeps everything POH-resident and out of Gen0 statistics). - Freelist: intrusive — a `int[] _nextFree` array plus `_freeHead`; index `-1` = end. We use a side array (not embedding next-pointers in `T`) so element layout stays exactly the domain layout and a released slot's memory can be poisoned in checked builds (`0xDD` fill) to surface use-after-release immediately. - `Get` in Release-Fast: `return ref _slab[(int)h.Index & _mask];` — one AND, one bounds-elided array access (the mask makes the JIT drop the bounds check when `_mask` is `capacity-1` and the array length is the matching power of two; we verify codegen in the perf test suite). ### 3.2 Why slot memory is not zeroed on acquire Zeroing is O(sizeof(T)) on the hot path. Instead: (a) message structs are fully written by their constructor-equivalent `Init` methods, enforced by analyzer FP0010 ("all fields assigned in Init"); (b) checked builds poison on release, so any read-before-init reads `0xDDDD…` and trips field validators. Pools holding partially-initialized-by-design records opt into `ZeroOnRelease`. ## 4. `SharedStructPool` specification (multi-threaded) Same surface as `StructPool` with these differences: - **Freelist is a Treiber stack on indices** with the generation embedded in the CAS word to defeat ABA: head word = `[32b generation-of-head | 32b index]`, advanced with `Interlocked.CompareExchange(ref long)`. Acquire/release are wait-free in the absence of contention and lock-free under it; worst case is a few CAS retries — bounded in practice because at most a handful of pinned threads share a pool. - `InUse` is approximate (relaxed counter via `Interlocked.Increment/Decrement`); exact only at barriers. - **Cache-line discipline:** the head word, the counters, and the slab live on separate cache lines (`[StructLayout]` padding on an internal `Pads` struct) to prevent false sharing between the acquiring and releasing threads. - Per-thread **release buffers** are explicitly *not* used (a classic optimization): they add a reclamation delay that complicates exhaustion accounting. Revisit only if CAS contention shows up in the perf rig. ## 5. Sizing methodology (normative process, doc 09 cross-ref) Every pool's capacity is a config value derived as follows and recorded in the deployment's `memory-budget.toml`: 1. **Model the population.** For each pool, identify the count driver. Example, `OrderState`: `max working orders per symbol × symbols traded × safety factor for in-flight transitions`. 2. **Bound it from exchange/risk limits**, not from observed behavior: message-rate throttles, max-open-orders limits, and the risk layer's own caps give hard ceilings. If no hard ceiling exists, the risk layer MUST impose one — an unbounded population is a design error. 3. **Multiply by 2** (headroom factor; covers transient double-occupancy during handoffs, e.g., order replaced while cancel in flight). 4. **Validate in soak:** `HighWaterMark / Capacity` MUST stay ≤ 0.5 in the worst recorded soak; the telemetry exporter publishes this ratio and the CI soak gate (doc 08 §7) fails above it. Memory cost is reported at startup by `MemoryMap.Report()` and checked against the budget file — a pool that silently grows in a config change gets caught at deploy review. ## 6. Exhaustion policy (first-class design) `TryAcquire == false` is a *designed state* with a per-pool declared policy: | Policy | Behavior | Appropriate for | |--------|----------|-----------------| | `RejectNew` | Caller declines the triggering action (e.g., strategy skips the new order; counter incremented) | Order/intent pools — the safe default: refusing to send a *new* order is always safe | | `ShedOldest` | Caller frees a victim chosen by domain logic (e.g., cancel oldest passive order, reclaim its state) | Quote-heavy strategies that can always reduce exposure to make room | | `Breach` | `HotPathContract.Breach(PoolExhausted)` → kill switch | Pools whose exhaustion implies the sizing model is broken (e.g., ack-tracking pool: exhaustion means the exchange owes us more acks than we believe possible) | Policies are declared at construction (`PoolExhaustionPolicy` enum + optional victim callback set pre-seal) so reviewers see the decision next to the sizing constant. Every exhaustion event increments `pool.exhaustions` (telemetry) regardless of policy; **any nonzero value pages the desk** — policy handling is a parachute, not a steady state. ## 7. Leak detection A leak in this design = a handle acquired and never released; the pool drains over hours. Detection layers: 1. **High-water-mark drift alarm.** Telemetry exports `InUse` and `HighWaterMark` per pool every second. A monotonic `InUse` floor rising over a window with flat business activity (open orders flat, but `OrderState.InUse` rising) triggers a warning. The alert rule ships with the telemetry package (doc 11). 2. **Epoch audit.** At each epoch barrier (all threads parked), pools can enumerate in-use slots (`AuditInUse(Action)`, barrier-only API). Each domain registers a reconciler: e.g., every in-use `OrderState` must be reachable from the working-orders index. Orphans are logged with the slot's `LastAcquireSite` — 3. **Acquire-site tagging (checked builds only).** In `FP_CHECKED`, `TryAcquire` records a 16-bit caller site id (source-generated table of call sites) into a side array, making epoch audit reports actionable without allocation. 4. **Soak gate.** The 24 h CI soak (doc 08 §7) requires every pool's `InUse` at end-of-soak barrier to equal its reconciled expected count. There is deliberately **no GC-based safety net** (no finalizer that returns leaked slots): M-8 bans finalizers, and a silent self-heal would mask the bug the audit exists to catch. ## 8. `BufferPool` specification Fixed-block byte pool for scratch buffers (encode staging, compression scratch, FIX string assembly during migration phase): ```csharp public sealed class BufferPool { public BufferPool(byte poolId, int blockSize, int blockCount, string name, BufferPoolConcurrency concurrency); // SingleThread | Shared public bool TryAcquire(out BufferLease lease); // lease: ref struct, see below public void Release(in BufferHandle handle); } /// ref struct so a lease cannot be stored; expose handle explicitly for the /// (rare, M-6-governed) case of passing buffer ownership through a ring. public ref struct BufferLease { public Span Span { get; } // exactly blockSize bytes public BufferHandle Handle { get; } // plain struct: (poolId, gen, index) public void Dispose(); // releases if not detached public BufferHandle Detach(); // caller takes over release duty (ownership transfer) } ``` Block size is uniform per pool (no buddy/bucket scheme): heterogeneous needs get separate pools (`BufferPool encode64k`, `BufferPool scratch4k`), keeping fragmentation impossible and accounting exact. ## 9. What does NOT get pooled - **Strings**: banned on the hot path entirely (doc 03 §7 defines `Symbol`, `ClOrdId` as inline fixed-size structs; doc 04 §6 covers text formatting into spans). - **Arrays as messages**: messages are structs in rings, not pooled arrays. - **Tasks/delegates**: no async, no delegate creation after seal (doc 07's threading model has no thread pool on the hot path; callbacks are registered pre-seal into fixed dispatch tables). - **Order book nodes**: these are L1 (arena) — they live a whole session; pooling adds churn for no benefit (doc 06 §4 gives the book layout). ## 10. Failure-mode summary (full analysis in doc 09) | Failure | Detection | Containment | |---------|-----------|-------------| | Exhaustion | `TryAcquire=false` + telemetry counter | Declared policy (§6) | | Leak | HWM drift, epoch audit, soak gate | Epoch reset reclaims nothing for pools — must fix; breach if pool nears exhaustion | | Use-after-release | Generation check (Debug/Checked); poison fill | Fault → breach in prod-checked; undefined in Release-Fast (hence canary policy, doc 01 §5.3) | | Double release | Generation mismatch on second call | Same as above | | Cross-pool handle confusion | PoolId check (Debug/Checked) | Same as above | | False sharing on shared pool head | Perf rig regression (CAS latency histogram) | Padding audit; `dotnet-counters` cache-miss proxy metrics |