# ADR 0016: KaTeX for Math Rendering (with MathML Output for Accessibility)

- **Status:** Accepted
- **Date:** 2025-01-15
- **Deciders:** Core architecture team
- **Related:** docs/architecture/03-content-format.md, docs/architecture/09 (accessibility/low-bandwidth), ADR 0006 (MDX/JSON)

## Context

Mathematical notation appears in nearly every problem on the platform — inline (`$\frac{a}{b}$`) and display math, within problem statements, hints, solutions, discussion comments, and symbolic-answer feedback. Requirements:

- **Performance on low-end devices and connections** (docs/architecture/09): rendering hundreds of formulas on a course page must not cause visible jank on a budget Android phone.
- **Server-side renderability**: Next.js renders content pages on the server/edge; math must render to static markup without a browser runtime, both for performance and so low-bandwidth/no-JS readers still see formulas.
- **Accessibility**: screen readers must get a meaningful representation of formulas (WCAG; our docs/architecture/09 commitments).
- **Determinism for versioned content**: the same source must render identically across versions and over time — content review approves *rendered* appearance.
- **Security**: math source is community input; the renderer must not be an XSS vector.

The two serious candidates are **KaTeX** and **MathJax (v3/v4)**. Also considered: **Temml** (LaTeX→MathML only) and raw MathML authoring.

Honest capability comparison:

| | KaTeX | MathJax 3 |
|---|---|---|
| Render speed | Very fast, synchronous | Slower; async pipeline |
| LaTeX coverage | Large common subset | Near-complete, extensible (e.g. full `\mathchoice`, more packages) |
| SSR story | First-class (`renderToString`) | Possible (liteDOM) but heavier |
| Accessibility | MathML output mode | Richer (speech-rule engine, exploration) |
| Bundle/CSS+fonts | ~70KB JS gz (client, if needed) + fonts | Larger runtime |

## Decision

We adopt **KaTeX (0.16.x)** as the platform's math renderer, used in **server-side rendering mode with dual HTML+MathML output**, under these rules:

1. **Render at the server, ship markup.** Content pipelines (Next.js server components for pages; the backend's MDX-validation pass for previews) call `katex.renderToString(src, { output: "htmlAndMathml", throwOnError: false, strict: "warn" })`. Clients receive pre-rendered markup + the KaTeX stylesheet; the KaTeX JS bundle is **not** shipped for content viewing. Client-side KaTeX loads only inside the editor (live preview) and discussion composer.
2. **Accessibility via embedded MathML.** `htmlAndMathml` output emits a MathML tree alongside the visual HTML and marks the visual HTML `aria-hidden`, so screen readers (VoiceOver; NVDA with a MathML reader add-on such as MathCAT, in a MathML-capable browser) read the MathML while sighted users see KaTeX's visual layout. This is our WCAG answer for math content. The original LaTeX source is additionally preserved in a `data-latex` attribute so users can copy source, and a "view LaTeX source" affordance exists on every formula.
3. **Supported-macro policy.** Authoring docs publish the supported KaTeX function list as the platform's LaTeX dialect. The content validation pass (run on save and on submit-for-review) renders every formula with `throwOnError: true` and `strict: "error"`. Unknown commands are caught by `throwOnError` (KaTeX raises a `ParseError`), while `strict` governs non-LaTeX-compatible input rather than unknown commands. Documents containing unsupported macros are rejected, so authors learn at edit time, not when a learner sees a red error box. A curated platform macro set (`\platformmacros`: common shorthands like `\R`, `\Z`, `\abs{}`) is defined centrally and versioned, since macros affect rendered output of historical versions.
4. **Security:** `trust: false` always, so commands that require trust (such as `\href` and `\includegraphics`) are refused in community LaTeX. `maxSize` and `maxExpand` are set to explicit finite caps (KaTeX's default `maxSize` is unbounded) to bound pathological inputs (`\rule{99999em}...`, macro-expansion bombs). This matters because formulas render server-side and could otherwise be a DoS vector.
5. **Fonts:** KaTeX's woff2 fonts are self-hosted, preloaded only on math-containing pages, and served with `font-display: swap`. Total math overhead for a typical page: CSS (~25KB minified, smaller once gzipped) + the 2–4 font subsets actually used.
6. **Escape hatch acknowledged:** content needing LaTeX features beyond KaTeX's coverage is, for MVP, out of dialect — authors restructure or use SVG figures (as MediaAssets). If community demand demonstrates a real corpus of KaTeX-impossible content, the recorded fallback is server-side MathJax rendering *for flagged documents only*, never a wholesale switch.
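
Rules 1, 2 and 4 can be sketched together as one render helper. This is a minimal illustration, not the platform's implementation: the option names mirror KaTeX's documented options, but the `maxSize` and `maxExpand` values, the `math` class name, and the helper names are assumptions. The renderer is injected so the sketch has no dependencies; its type is intended to be signature-compatible with `katex.renderToString`.

```typescript
// Option names follow KaTeX's documented options. The numeric caps below are
// placeholders, not values decided by this ADR.
interface MathRenderOptions {
  displayMode: boolean;
  output: "html" | "mathml" | "htmlAndMathml";
  throwOnError: boolean;
  strict: "ignore" | "warn" | "error";
  trust: boolean;
  maxSize: number;   // cap on user-specified sizes, in em
  maxExpand: number; // cap on macro expansions per formula
}

// Intended to match katex.renderToString(src, options).
type RenderToString = (src: string, options: MathRenderOptions) => string;

// Options for content viewing (rule 1) with the security settings of rule 4.
const VIEW_OPTIONS: Omit<MathRenderOptions, "displayMode"> = {
  output: "htmlAndMathml",
  throwOnError: false,
  strict: "warn",
  trust: false,
  maxSize: 50,    // hypothetical cap
  maxExpand: 500, // hypothetical cap
};

// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Render one formula and wrap it so the LaTeX source survives in data-latex (rule 2).
function renderFormula(render: RenderToString, src: string, displayMode: boolean): string {
  const markup = render(src, { ...VIEW_OPTIONS, displayMode });
  const tag = displayMode ? "div" : "span";
  return `<${tag} class="math" data-latex="${escapeAttribute(src)}">${markup}</${tag}>`;
}
```

With the real library installed, the call would look like `renderFormula(katex.renderToString, src, false)`. Escaping the source before writing it into `data-latex` matters because the attribute carries raw community input.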

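The font rule (preload only on math-containing pages) could look roughly like the following. The base path and function names are hypothetical; the font file names follow KaTeX's distributed naming (for example `KaTeX_Main-Regular.woff2`). Font preloads need the `crossorigin` attribute even for same-origin files, otherwise browsers fetch the font twice.

```typescript
// Assumption: fonts are self-hosted under this path; the ADR does not specify one.
const FONT_BASE_PATH = "/static/katex/fonts";

// Build one <link rel="preload"> tag per font file.
function fontPreloadLinks(fontNames: string[]): string[] {
  return fontNames.map(
    (name) =>
      `<link rel="preload" href="${FONT_BASE_PATH}/${name}.woff2" as="font" type="font/woff2" crossorigin>`
  );
}

// Pages without math get no preloads at all.
function preloadsForPage(hasMath: boolean, fontNames: string[]): string[] {
  return hasMath ? fontPreloadLinks(fontNames) : [];
}
```

Which fonts a page actually uses would have to come from the render step (for instance by scanning the emitted class names); that detection is left out of this sketch.
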
## Alternatives Considered

- **MathJax 3:** rejected primarily on performance and SSR ergonomics. Its superior LaTeX coverage matters less under a published-dialect policy, and its accessibility advantages (speech-rule engine, expression exploration) are substantially matched by the MathML plus modern-screen-reader path at a fraction of the runtime cost. It remains the documented escape hatch.
- **Temml (LaTeX→MathML only):** elegant and tiny, but rendering quality then depends entirely on browser MathML implementations, which remain inconsistent (especially Chromium's MathML-Core gaps for complex layout). Visual fidelity of math is core product quality; not yet acceptable.
- **Raw MathML authoring:** hostile to authors; the community writes LaTeX. Rejected.
- **Client-side-only KaTeX:** simpler pipeline but ships JS to every reader, breaks no-JS rendering, and re-renders on every visit. Rejected per low-bandwidth strategy.

## Consequences

**Positive**
- Zero math JS for readers; fast, deterministic, SSR-rendered formulas; accessible via MathML; bounded attack surface.
- A published LaTeX dialect with edit-time validation prevents the worst authoring failure mode (silently broken math in published content).
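
The edit-time validation pass might be structured as below. This is a sketch under stated assumptions: formulas are found with a simple `$...$` / `$$...$$` regex (a real pipeline would more likely walk the MDX syntax tree, and this regex ignores escaped `\$`), and the strict renderer is injected as a function that throws on invalid input. All type and function names here are hypothetical.

```typescript
interface Formula {
  src: string;
  displayMode: boolean;
  offset: number; // character offset of the opening delimiter
}

interface FormulaError {
  formula: Formula;
  message: string;
}

// A renderer that throws on invalid input, as katex.renderToString does
// when called with throwOnError: true.
type StrictRender = (src: string, displayMode: boolean) => string;

// Simplified delimiter scan: $$...$$ is tried before $...$.
function extractFormulas(source: string): Formula[] {
  const pattern = /\$\$([\s\S]+?)\$\$|\$([^$\n]+?)\$/g;
  const formulas: Formula[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const displayMode = match[1] !== undefined;
    const body = match[1] ?? match[2] ?? "";
    formulas.push({ src: body.trim(), displayMode, offset: match.index });
  }
  return formulas;
}

// Render every formula strictly and collect all failures instead of stopping
// at the first, so an author sees every problem in one pass.
function validateDocument(source: string, render: StrictRender): FormulaError[] {
  const errors: FormulaError[] = [];
  for (const formula of extractFormulas(source)) {
    try {
      render(formula.src, formula.displayMode);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push({ formula, message });
    }
  }
  return errors;
}
```

With KaTeX, the injected renderer could be `(src, displayMode) => katex.renderToString(src, { displayMode, throwOnError: true, strict: "error", trust: false })`. A document would then be accepted only when `validateDocument` returns an empty list.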

**Negative / Accepted risks**
- KaTeX's LaTeX subset will occasionally frustrate advanced authors; mitigated by the macro set, figures-as-assets, and the documented MathJax fallback path.
- MathML support, while now broadly shipped in browsers (Chromium MathML-Core since 109, long-standing in Firefox/Safari), is read with uneven verbosity across assistive technologies (AT); we commit to testing with NVDA and VoiceOver in the accessibility audit milestone.
- The platform macro set becomes a compatibility surface: macros are additive-only and versioned; removing one would break historical rendered content and is forbidden.
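
The additive-only constraint lends itself to a mechanical check when a new macro-set version is proposed. The sketch below is illustrative only: the macro definitions are invented, the names are hypothetical, and treating a *changed* definition as a violation is an assumption that goes beyond the removal rule stated above (on the grounds that a changed expansion also alters historical rendering). The record shape matches what KaTeX's `macros` option accepts, a map from macro name to expansion string.

```typescript
// Macro name (with leading backslash) -> expansion, as in KaTeX's `macros` option.
type MacroSet = Record<string, string>;

interface MacroViolation {
  name: string;
  kind: "removed" | "changed";
}

// Compare a proposed macro set against the previous version.
// An empty result means the change is purely additive.
function checkAdditiveOnly(previous: MacroSet, next: MacroSet): MacroViolation[] {
  const violations: MacroViolation[] = [];
  for (const name of Object.keys(previous)) {
    if (!Object.prototype.hasOwnProperty.call(next, name)) {
      violations.push({ name, kind: "removed" });
    } else if (next[name] !== previous[name]) {
      violations.push({ name, kind: "changed" });
    }
  }
  return violations;
}

// Hypothetical versions of the platform macro set.
const macrosV1: MacroSet = { "\\R": "\\mathbb{R}", "\\Z": "\\mathbb{Z}" };
const macrosV2: MacroSet = { ...macrosV1, "\\abs": "\\left|#1\\right|" };
```

Here `checkAdditiveOnly(macrosV1, macrosV2)` returns no violations, while swapping the arguments reports `\abs` as removed. Running such a check in CI would turn the "forbidden" rule into an enforced one.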

**Follow-ups**
- Define the v1 macro set and the formula-validation pass in the content-pipeline milestone.
- Accessibility audit: math reading experience with NVDA + VoiceOver.