# 12 — Runtime & GC Configuration Reference > Companion to [08 — No-Allocation Contract](08-no-allocation-contract.md) and > [07 — Threading & Core Pinning](07-threading-and-core-pinning.md). > This document specifies the exact runtime configuration the architecture > assumes, why each knob is set the way it is, and which combinations are > explicitly **not** supported. The allocation-free architecture removes the GC from the hot path by construction, but the runtime still runs a GC (for warmup, cold paths, and control-plane code) and a JIT (which can recompile in the background). Both must be configured so that their residual activity cannot perturb pinned hot-path threads. --- ## 12.1 GC mode selection | Configuration | Recommended value | Rationale | |---|---|---| | `System.GC.Server` | `false` (Workstation GC) | With a zero-allocation steady state, GC throughput is irrelevant; what matters is that GC machinery never competes for the isolated cores. Server GC creates one GC thread + one dedicated heap per logical core and (by default) hard-affinitizes those threads across the machine — including the cores we isolated for hot-path threads. Workstation GC uses the triggering thread plus minimal helper threads and keeps a far smaller footprint. | | `System.GC.Concurrent` | `false` | Background (concurrent) GC keeps a standing background thread that wakes unpredictably. Since gen2 collections only happen during warmup/cold operations under the contract, a blocking gen2 at a *chosen* time (warmup boundary, scheduled maintenance window) is strictly preferable to an unscheduled background thread. | | `System.GC.RetainVM` | `true` | Prevents the GC from returning segments to the OS and re-faulting them later; keeps the committed/touched page set stable after warmup, which matters once pages are pre-touched (see arenas, doc 06). | | `System.GC.HeapHardLimit` | Set explicitly (e.g. `0x40000000` = 1 GiB) | Converts "slow leak eventually destabilises the host" into "leak hits a hard wall and trips alarms" — a deliberately fail-fast posture consistent with FMEA item F-7 in doc 09. Size it at 4–8× the measured post-warmup heap. | | `System.GC.HeapAffinitizeMask` / `System.GC.NoAffinitize` | If Server GC is ever required: set `HeapAffinitizeMask` (or `HeapAffinitizeRanges`) to **exclude every isolated hot-path core**; alternatively `NoAffinitize=true`. | Defence in depth: even though we recommend Workstation GC, any deployment that opts into Server GC for cold-path-heavy services **must** keep GC threads off the pinned cores. This is a hard requirement, reviewed in deployment checklists. | **Unsupported combination:** Server GC with default affinity on a machine with isolated cores. The deployment validation script must reject it. ### Reference `runtimeconfig.template.json` ```json { "configProperties": { "System.GC.Server": false, "System.GC.Concurrent": false, "System.GC.RetainVM": true, "System.GC.HeapHardLimit": 1073741824, "System.Runtime.TieredCompilation": false, "System.Runtime.TieredPGO": false } } ``` Environment-variable equivalents (useful for ops overrides; env vars take precedence over `runtimeconfig.json`): | Env var | Meaning | |---|---| | `DOTNET_gcServer=0` | Workstation GC | | `DOTNET_gcConcurrent=0` | Disable background GC | | `DOTNET_GCRetainVM=1` | Retain segments | | `DOTNET_GCHeapHardLimit=0x40000000` | Hard heap limit (hex) | | `DOTNET_TieredCompilation=0` | Disable tiering (see §12.3) | | `DOTNET_TieredPGO=0` | Disable dynamic PGO instrumentation | --- ## 12.2 Latency modes and No-GC regions ### `GCSettings.LatencyMode` The warmup sequence (doc 08, §"Warmup protocol") ends with: ```csharp GC.Collect(2, GCCollectionMode.Forced, blocking: true, compacting: true); GC.WaitForPendingFinalizers(); GC.Collect(2, GCCollectionMode.Forced, blocking: true, compacting: true); GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency; ContractEventSource.Log.WarmupComplete(); ``` `SustainedLowLatency` suppresses *unsolicited* blocking gen2 collections for the remainder of the session. Under the contract there should be nothing to collect anyway; the latency mode is a belt-and-braces measure against cold-path allocations triggering a compacting gen2 at a bad time. ### Why we do **not** rely on `TryStartNoGCRegion` `GC.TryStartNoGCRegion(budget)` is sometimes proposed as the "fix" for GC in HFT. We evaluated and rejected it as a primary mechanism: 1. **Finite budget.** The region aborts (with a GC) the moment cumulative allocation exceeds the budget — exactly the failure you cannot tolerate, delivered at an unpredictable time. It converts a measurable, gateable property ("we do not allocate") into a runtime gamble ("we hope we don't allocate more than X before the session ends"). 2. **Fragile lifecycle.** `EndNoGCRegion` throws if a GC already happened; exceptions inside the region can leave the process in an awkward state. 3. **It masks bugs.** Allocations inside a no-GC region don't show up as GC pauses, so an allocation regression survives until the budget bursts — typically in production. Our gate (doc 08 + `tools/AllocationGate`) makes the regression fail CI instead. **Permitted use:** wrapping *short, bounded* critical sections during partial migrations (migration guide, phase 2), where the legacy code still allocates and you need a temporary pause shield around e.g. an auction window. Always pair with allocation telemetry and treat as scaffolding to be removed. --- ## 12.3 JIT configuration: tiering, PGO, and warmup interaction Tiered compilation is a **post-warmup allocation source**: the runtime counts invocations at Tier 0, then queues background recompilation to Tier 1, and the background JIT allocates (method tables, IR, code heaps). It can fire minutes into a session if a code path's call count crosses the threshold late. Policy: | Knob | Value | Rationale | |---|---|---| | `TieredCompilation` | `false` for the hot-path process | Methods compile directly at full optimisation; no background recompilation, no late JIT allocations, deterministic code quality from the first call. Startup is slower — irrelevant, because warmup already takes seconds by design. | | `TieredPGO` | `false` | Dynamic PGO inserts instrumentation at Tier 0 and allocates during reoptimisation. With tiering off it is moot, but set explicitly so an SDK default change cannot re-enable it. | | `ReadyToRun` | Optional, `true` for large deployments | R2R precompiled code reduces warmup *time*; the runtime may still re-JIT R2R code for better quality, which is another reason tiering stays off. | | OSR (`DOTNET_TC_OnStackReplacement`) | Follows tiering | With tiering disabled, on-stack replacement is inactive; long-running loops are compiled optimised from the start. | **NativeAOT** is attractive long-term (no JIT at all, smaller runtime), and the architecture is compatible with it (no reflection-emit, no runtime codegen in the hot path). It is *not* required for this milestone; we treat it as a future hardening option because (a) some diagnostic tooling differs under AOT and (b) third-party exchange SDKs in scope still assume CoreCLR. The design's warmup protocol explicitly touches every hot-path method precisely so the JIT-vs-AOT decision does not change the steady-state contract. **Interaction with the allocation gate:** in CI, gated processes run with tiering disabled (the sample project sets `false`). If a production service must run with tiering on, the gate's warmup window must exceed the tiering quiescence period, and any residual JIT allocation types may be baselined only with a linked justification (contract doc, §"Baseline governance"). --- ## 12.4 Pinned Object Heap and large-object placement * Buffers shared with native I/O (NIC userspace stacks, `Socket` receive buffers used with `SocketAsyncEventArgs`) are allocated **once at startup** on the **Pinned Object Heap** via `GC.AllocateArray(length, pinned: true)` (or `GC.AllocateUninitializedArray` where zeroing is wasteful). POH allocation removes the classic pinned-buffer fragmentation problem and avoids long-lived `GCHandle` pins on the normal heap. * Arrays ≥ 85,000 bytes land on the LOH regardless; pre-allocate them during warmup only. The LOH is never compacted under our configuration (`GCSettings.LargeObjectHeapCompactionMode` is left at `Default`), which is safe because LOH contents are fixed after warmup. * Anything whose lifetime is the whole session and whose address must be stable for native interop goes to the **unmanaged arenas** (doc 06) instead; POH is reserved for buffers that must remain *managed* arrays for API compatibility. --- ## 12.5 OS-level configuration (Linux production profile) These complement the core-pinning design in doc 07 and are required for the latency numbers in doc 13 to be reproducible: | Setting | Value | Why | |---|---|---| | `isolcpus=` + `nohz_full=` + `rcu_nocbs=` kernel args | The hot-path core set | Removes scheduler ticks and RCU callbacks from isolated cores. | | CPU governor | `performance`; C-states limited to C1 (`processor.max_cstate=1`, `intel_idle.max_cstate=1`) | Deep C-state exit latency (tens of µs) dwarfs our budget. | | Turbo / frequency | Fixed frequency preferred for reproducibility; turbo acceptable in production if thermal headroom is monitored | Frequency transitions add jitter and confound benchmarking. | | SMT | Disabled on hot-path cores (offline siblings) | Sibling contention adds unpredictable tens-of-ns stalls. | | Transparent Huge Pages | `madvise` (not `always`) | `always` can stall on compaction; arenas request huge pages explicitly via `madvise(MADV_HUGEPAGE)` where beneficial. | | Memory locking | `mlockall(MCL_CURRENT \| MCL_FUTURE)` at startup via P/Invoke (after arenas commit), with `memlock` ulimit raised | Eliminates major faults on hot memory. Pre-touching (doc 06) handles minor faults; mlock handles reclaim. | | Swap | Disabled on trading hosts | A swapped page is a session-ending latency event. | | IRQ affinity | All NIC IRQs steered away from isolated cores (or kernel-bypass stack in use) | Interrupt handlers on a pinned core are a direct jitter injection. | | Clock source | `tsc` (verify `constant_tsc nonstop_tsc` CPU flags) | `Stopwatch` resolves to RDTSC-backed clock; HPET fallback is dramatically slower. | A deployment **validation script** must assert all of the above (plus §12.1's GC settings, read back via `GCSettings`/`AppContext`) at process start and refuse to enter warmup on mismatch. Failing fast at startup is cheap; finding out at 14:30 in the session is not. --- ## 12.6 Configuration matrix summary | Profile | GC | Tiering | Use | |---|---|---|---| | **Production hot path** | Workstation, non-concurrent, RetainVM, hard limit | Off (or R2R + off) | Trading engine process | | **CI gate** | Same as production | Off | `AllocationGate` runs — must match production to be meaningful | | **Cold-path services** (risk reporting, UI gateways) | Server GC permitted, affinitized away from isolated cores | Default | Not subject to the contract | | **Developer inner loop** | Defaults | Default | Fast iteration; contract checked in CI, not on dev boxes | The CI profile **must** equal the production profile for GC/JIT settings — gating a configuration you don't ship verifies nothing. The validation script and the CI job share one canonical settings file to keep them from drifting.