# ADR 0015: Internationalization: UI Message Catalogs + Per-Language Content Documents

- **Status:** Accepted
- **Date:** 2025-01-15
- **Deciders:** Core architecture team
- **Related:** docs/architecture/09-accessibility-i18n-bandwidth.md, ADR 0006 (MDX/JSON content), ADR 0007 (versioning), ADR 0002 (Next.js)

## Context

A community-owned learning platform that only works in English contradicts its own mission. But "i18n" hides two very different problems:

1. **UI internationalization:** buttons, navigation, validation errors, emails. Finite string set, owned by the codebase, translated by the community, shipped with releases.
2. **Content internationalization:** problems, hints, solutions, courses. Unbounded, community-authored, versioned, reviewed, and *not* a string-table problem: a good translation of a math problem may legitimately adapt names, currencies, units, and examples, and must go through the same review pipeline as original content.

Conflating these (e.g. storing problem text in gettext catalogs, or treating translations as untracked metadata on the original) produces unmaintainable systems. We need an architecture that keeps them separate and gives each the right tooling.

Locale handling also touches: URL design (SEO + shareability), RTL scripts, number/date formatting, fonts for non-Latin scripts under low-bandwidth constraints, and search (ADR 0011).

## Decision

### A. UI strings

1. **Frontend:** ICU MessageFormat catalogs managed with **next-intl** in the Next.js app. All user-facing strings go through the message API; hardcoded JSX strings are lint-banned (`eslint` rule in CI). Messages live in `messages/.json`; English (`en`) is the source locale.
2. **Backend:** Django's built-in i18n (gettext) for API error messages, emails, and any server-rendered text, with `Accept-Language` negotiation. Machine-readable error `code`s (ADR 0014) mean clients never need to parse translated messages.
3. **Translation workflow:** catalogs are translated via **Weblate** (open-source, self-hostable, Git-integrated) against the repository; translations arrive as ordinary pull requests. A locale ships when it crosses an 80% completion threshold; below that it's available behind a "help translate" banner with English fallback per-message.
4. **Formatting:** all dates, numbers, and relative times rendered through `Intl.*` APIs (frontend) / Django's localization (backend). No hand-rolled formatting anywhere.

### B. Content

5. **Translations are first-class documents, not metadata.** A translated problem is a new Problem entity with its own version history, review lifecycle, and attribution, linked to the original via `translation_of: ` and carrying `language: ` in its document metadata (already specified in `problem.schema.json`). This means:
   - Translations go through the standard draft → review → publish workflow, so quality control applies equally.
   - The translator earns attribution and reputation (consistent with the forking/attribution model, ADR 0008: a translation *is* a specialized fork).
   - Originals and translations evolve independently; a **staleness indicator** (original has published versions newer than the translation's `translated_from_version` pointer) is surfaced to translators and on the translated page ("the English original has been updated since this translation").
6. **Language sets and discovery:** the API exposes sibling translations (`GET /problems/{id}/translations`), the frontend shows a language switcher on content pages, and search (ADR 0011) facets on `language` with the user's preferred languages boosting results.

### C. Locale plumbing

7. **URLs:** locale-prefixed paths (`/es/courses/...`) via next-intl routing, with `en` as the default unprefixed locale; `hreflang` alternates emitted for SEO. Locale resolution order: explicit URL → authenticated user preference → `Accept-Language` → `en`.
8. **RTL:** the design system uses CSS logical properties (`margin-inline-start`, etc.) exclusively, enforced by stylelint, so `dir="rtl"` works without parallel stylesheets. Tailwind is configured with the RTL-aware logical-property utilities.
9. **Fonts & scripts:** system font stack by default (zero font bytes; see docs/architecture/09); KaTeX handles math identically across locales (ADR 0016). No locale ships webfonts unless its script genuinely requires them, and then via `font-display: swap` subsetted files.

## Alternatives Considered

- **Content in translation-string tables (gettext/Crowdin per-paragraph):** destroys document structure, fights the versioning model, prevents legitimate adaptation, and gives translations no review path. Rejected decisively.
- **Translations as fields on the original problem (`title_es`, JSONB lang map):** schema explosion, single shared version history (one review queue blocks all languages), no independent attribution. Rejected.
- **Machine translation as primary:** unacceptable for pedagogical content where precision matters (a mistranslated inequality is a wrong problem). MT may later assist translators as a *draft seed*, clearly flagged, but never auto-publishes.
- **`react-i18next` instead of next-intl:** viable; next-intl chosen for first-class App Router/server-component support and locale-routing integration, reducing custom glue.
- **Cookie/header-only locale (no URL prefix):** breaks shareability ("this link shows Spanish for me, English for you") and SEO. Rejected.

## Consequences

**Positive**

- Each problem class gets appropriate tooling: Weblate+gettext/ICU for finite UI strings; the full authoring/review/versioning pipeline for content.
- Translators are first-class contributors with attribution and reputation, aligned with the community-driven mission.
- Staleness tracking makes translation drift visible instead of silent.


**Negative / Accepted risks**

- Translated content can lag originals; mitigated by staleness indicators and a translator-facing "needs update" queue, but never fully solved; that is inherent.
- Two i18n systems (next-intl + Django gettext) to maintain; mitigated by the API-error-`code` convention which keeps the backend's translated surface small.
- Locale-prefixed URLs require redirects when users switch preferred language; standard, handled by next-intl middleware.

**Follow-ups**

- Backend milestone: `translation_of` linkage, staleness computation, `/translations` endpoint.
- Infra milestone: self-hosted Weblate (or hosted Weblate's gratis open-source tier) wiring.
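
As a rough illustration of the staleness computation named in the backend milestone, the rule from decision 5 (the original has published versions newer than the translation's `translated_from_version` pointer) might be sketched as below. This is a minimal sketch under stated assumptions, not the project's implementation: `ProblemVersion`, `latest_published`, and `is_stale` are hypothetical names, and it assumes versions carry a monotonically increasing integer number, which ADR 0007 may define differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ProblemVersion:
    """Hypothetical stand-in for one version of the original problem."""
    number: int       # assumed: monotonically increasing per problem
    published: bool   # drafts and in-review versions are not published


def latest_published(versions: Sequence[ProblemVersion]) -> Optional[int]:
    """Return the highest published version number, or None if none exist."""
    return max((v.number for v in versions if v.published), default=None)


def is_stale(
    original_versions: Sequence[ProblemVersion],
    translated_from_version: int,
) -> bool:
    """A translation is stale when the original has a published version
    newer than the one the translation was made from."""
    latest = latest_published(original_versions)
    return latest is not None and latest > translated_from_version


# Usage: v3 is still a draft, so a translation made from v2 is current,
# while one made from v1 should show the staleness indicator.
history = [
    ProblemVersion(1, published=True),
    ProblemVersion(2, published=True),
    ProblemVersion(3, published=False),
]
print(is_stale(history, translated_from_version=1))
print(is_stale(history, translated_from_version=2))
```

Only published versions count here, so an unpublished draft of the original does not flag its translations; whether the real pipeline should behave this way is a design choice left to the backend milestone.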