# ADR 0016: KaTeX for Math Rendering (with MathML Output for Accessibility)

- **Status:** Accepted
- **Date:** 2025-01-15
- **Deciders:** Core architecture team
- **Related:** docs/architecture/03-content-format.md, docs/architecture/09 (accessibility/low-bandwidth), ADR 0006 (MDX/JSON)

## Context

Mathematical notation appears in nearly every problem on the platform: inline (`$\frac{a}{b}$`) and display math, within problem statements, hints, solutions, discussion comments, and symbolic-answer feedback. Requirements:

- **Performance on low-end devices and connections** (docs/architecture/09): rendering hundreds of formulas on a course page must not jank a budget Android phone.
- **Server-side renderability**: Next.js renders content pages on the server/edge; math must render to static markup without a browser runtime, both for performance and so low-bandwidth/no-JS readers still see formulas.
- **Accessibility**: screen readers must get a meaningful representation of formulas (WCAG; our docs/architecture/09 commitments).
- **Determinism for versioned content**: the same source must render identically across versions and over time, because content review approves the *rendered* appearance.
- **Security**: math source is community input; the renderer must not be an XSS vector.

The two serious candidates are **KaTeX** and **MathJax (v3/v4)**. Also considered: **Temml** (LaTeX→MathML only) and raw MathML authoring.

Honest capability comparison:

| Criterion | KaTeX | MathJax 3 |
|---|---|---|
| Render speed | Very fast, synchronous | Slower; async pipeline |
| LaTeX coverage | Large common subset | Near-complete, extensible (e.g. full `\mathchoice`, more packages) |
| SSR story | First-class (`renderToString`) | Possible (liteDOM) but heavier |
| Accessibility | MathML output mode | Richer (speech-rule engine, exploration) |
| Bundle/CSS+fonts | ~70KB JS gz (client, if needed) + fonts | Larger runtime |

## Decision

We adopt **KaTeX (0.16.x)** as the platform's math renderer, used in **server-side rendering mode with dual HTML+MathML output**, under these rules:

1. **Render at the server, ship markup.** Content pipelines (Next.js server components for pages; the backend's MDX-validation pass for previews) call `katex.renderToString(src, { output: "htmlAndMathml", throwOnError: false, strict: "warn" })`. Clients receive pre-rendered markup + the KaTeX stylesheet; the KaTeX JS bundle is **not** shipped for content viewing. Client-side KaTeX loads only inside the editor (live preview) and discussion composer.
2. **Accessibility via embedded MathML.** `htmlAndMathml` output embeds a MathML tree alongside the visual HTML; screen readers (VoiceOver, NVDA in browsers with MathML support) read the MathML while sighted users see KaTeX's visual layout. This is our WCAG answer for math content. The original LaTeX source is additionally preserved in a `data-latex` attribute so users can copy source, and a "view LaTeX source" affordance exists on every formula.
3. **Supported-macro policy.** Authoring docs publish the supported KaTeX function list as the platform's LaTeX dialect. The content validation pass (run on save and on submit-for-review) renders every formula in throwing mode (`throwOnError: true` with `strict: "error"`), so that *unknown commands* fail validation and documents containing unsupported macros are rejected; authors learn at edit time, not when a learner sees a red error box. A curated platform macro set (`\platformmacros`: common shorthands like `\R`, `\Z`, `\abs{}`) is defined centrally and versioned, since macros affect rendered output of historical versions.
4. **Security:** `trust: false` always (no `\href`/`\includegraphics` URL execution from community LaTeX); `maxSize` and `maxExpand` set to sane caps to bound pathological inputs (`\rule{99999em}...`, macro-expansion bombs). This matters because formulas render server-side and could otherwise be a DoS vector.
5. **Fonts:** KaTeX's woff2 fonts self-hosted, preloaded only on math-containing pages, `font-display: swap`. Total math overhead for a typical page: CSS (~25KB gz) + the 2–4 font subsets actually used.
6. **Escape hatch acknowledged:** content needing LaTeX features beyond KaTeX's coverage is, for MVP, out of dialect: authors restructure or use SVG figures (as MediaAssets). If community demand demonstrates a real corpus of KaTeX-impossible content, the recorded fallback is server-side MathJax rendering *for flagged documents only*, never a wholesale switch.

## Alternatives Considered

- **MathJax 3:** chosen against primarily on performance and SSR ergonomics. Its superior LaTeX coverage matters less under a published-dialect policy, and its accessibility advantages (speech-rule engine) are substantially matched by the MathML+modern-screen-reader path at a fraction of the runtime cost. Remains the documented escape hatch.
- **Temml (LaTeX→MathML only):** elegant and tiny, but rendering quality then depends entirely on browser MathML implementations, which remain inconsistent (especially Chromium's MathML-Core gaps for complex layout). Visual fidelity of math is core product quality; not yet acceptable.
- **Raw MathML authoring:** hostile to authors; the community writes LaTeX. Rejected.
- **Client-side-only KaTeX:** simpler pipeline but ships JS to every reader, breaks no-JS rendering, and re-renders on every visit. Rejected per low-bandwidth strategy.

## Consequences

**Positive**

- Zero math JS for readers; fast, deterministic, SSR-rendered formulas; accessible via MathML; bounded attack surface.
- A published LaTeX dialect with edit-time validation prevents the worst authoring failure mode (silently broken math in published content).

**Negative / Accepted risks**

- KaTeX's LaTeX subset will occasionally frustrate advanced authors; mitigated by the macro set, figures-as-assets, and the documented MathJax fallback path.
- MathML screen-reader support, while now broadly shipped (Chromium MathML-Core since 109, long-standing in Firefox/Safari), has uneven verbosity across AT; we commit to testing with NVDA and VoiceOver in the accessibility audit milestone.
- The platform macro set becomes a compatibility surface: macros are additive-only and versioned; removing one would break historical rendered content and is forbidden.

**Follow-ups**

- Define the v1 macro set and the formula-validation pass in the content-pipeline milestone.
- Accessibility audit: math reading experience with NVDA + VoiceOver.
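
As a starting point for the first follow-up, the validation pass and the additive-only macro rule might look roughly like the sketch below. It is illustrative only, not the agreed implementation. The option names (`throwOnError`, `strict`, `trust`, `maxSize`, `maxExpand`, `macros`) are KaTeX's own; everything else is hypothetical, including the function and type names, the macro definitions, and the cap values `50` and `200`. The renderer is passed in as a parameter so the sketch runs without KaTeX installed; in the pipeline it would presumably be `katex.renderToString`, which throws a `ParseError` on unknown commands when `throwOnError` is `true`.

```typescript
// Hypothetical names throughout; only the option keys mirror KaTeX's options.
type MacroSet = Readonly<Record<string, string>>;

interface ValidationOptions {
  throwOnError: boolean;
  strict: "error";
  trust: boolean;
  maxSize: number;
  maxExpand: number;
  macros: Record<string, string>;
}

// Stand-in for katex.renderToString, injected so the pass is testable in isolation.
type RenderFn = (src: string, options: ValidationOptions) => string;

interface FormulaError {
  index: number;
  source: string;
  message: string;
}

// Placeholder v1 macro set (assumed definitions; the real set is a follow-up).
const MACROS_V1: MacroSet = {
  "\\R": "\\mathbb{R}",
  "\\Z": "\\mathbb{Z}",
  "\\abs": "\\left|#1\\right|",
};

// Renders every formula in throwing mode and collects the failures,
// so a document can be rejected at save/submit time.
function validateFormulas(
  formulas: readonly string[],
  render: RenderFn,
  macros: MacroSet,
): FormulaError[] {
  const errors: FormulaError[] = [];
  formulas.forEach((source, index) => {
    try {
      render(source, {
        throwOnError: true,
        strict: "error",
        trust: false,
        maxSize: 50, // assumed cap, in ems
        maxExpand: 200, // assumed cap on macro expansions
        macros: { ...macros }, // fresh copy per formula: KaTeX may mutate this object (\gdef)
      });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push({ index, source, message });
    }
  });
  return errors;
}

// Additive-only rule: every macro of the previous version must still exist
// with an identical definition. Returns a list of violations (empty if none).
function macroSetViolations(previous: MacroSet, next: MacroSet): string[] {
  const violations: string[] = [];
  for (const name of Object.keys(previous)) {
    if (!(name in next)) {
      violations.push(`removed: ${name}`);
    } else if (next[name] !== previous[name]) {
      violations.push(`changed: ${name}`);
    }
  }
  return violations;
}
```

A non-empty result from `validateFormulas` would block save or submit-for-review, and a non-empty result from `macroSetViolations` would fail the release check for a new macro-set version. Copying `macros` per formula keeps one formula's definitions from leaking into the next, which supports the determinism requirement.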