# Architecture Overview: Home Assistant in 2025

This section describes the system an effective contributor must understand. It is organized from the kernel outward, because the central architectural fact of Home Assistant is a **stable, strictly-governed core surrounded by a vast, variable-quality integration layer**, and that asymmetry is the foundation of this report's recommendation.

## 1. The shape of the project

Home Assistant is not one repository; it is a constellation governed since 2024 by the **Open Home Foundation** (a Swiss non-profit holding the IP), with commercial development funded primarily by **Nabu Casa** (Home Assistant Cloud subscriptions):

| Repository / project | Role | Language |
|---|---|---|
| `home-assistant/core` | The hub: event loop, state machine, entity model, automation engine, **and all ~2,800 bundled integrations** | Python (3.13-era, fully async) |
| `home-assistant/frontend` | The web UI (and the UI embedded in mobile apps) | TypeScript / Lit web components |
| `home-assistant/supervisor` | Manages the containerized deployment: add-ons, OS updates, backups | Python |
| `home-assistant/operating-system` | Home Assistant OS, a Buildroot-based appliance OS | Buildroot/shell |
| `home-assistant/architecture` | ADRs and architecture discussions | Markdown |
| Mobile apps (`iOS`, `android`) | Companion apps: notifications, sensors, location | Swift / Kotlin |
| Protocol servers | `python-matter-server`, `zwave-js-server` (Node), `zigpy` stack, ESPHome, Wyoming (voice) | Mixed |
| `home-assistant.io` / `developers.home-assistant` | User and developer documentation | Markdown |

Deployment comes in four supported flavors: **Home Assistant OS** (the appliance, dominant among reporting installs), **Supervised**, **Container**, and **Core** (bare venv). Add-ons (separately containerized apps managed by the Supervisor) exist only on OS/Supervised.
This matters to contributors mainly in one way: **integrations cannot assume the host environment**. That means no shelling out, no system packages, and pure-Python (or pre-built wheel) dependencies only, a constraint enforced by `hassfest` manifest validation and the wheels build.

## 2. The core kernel

Everything in `home-assistant/core` outside `homeassistant/components/` is a compact, high-quality kernel (~50k lines) that has been stable in shape for years:

### 2.1 Event bus, state machine, service registry

The `HomeAssistant` object (`hass`) owns a single asyncio event loop and three primitives:

- **Event bus** (`hass.bus`): typed events (`state_changed`, `call_service`, `homeassistant_started`, …). Everything observable in the system is an event.
- **State machine** (`hass.states`): the current state of every entity, as `entity_id → State(state: str, attributes: dict, last_changed, last_updated, context)`. State writes fire `state_changed` events; the state machine is the *consequence* of entity updates, never the source of truth for devices.
- **Service registry** (`hass.services`): named, schema-validated actions (`light.turn_on`, …), renamed "actions" in the user-facing UI in 2024 but still services in code. Service calls carry a `Context` (user, parent action) enabling the trace/attribution system.

Execution discipline is strict and enforced in review: the event loop must never block. Synchronous library calls must be pushed through the executor (`hass.async_add_executor_job`); a blocking-call detector in core actively logs/raises on known blocking operations (file I/O, `time.sleep`, blocking HTTP) inside the loop. This single rule is the root cause of the Platinum tier's hardest requirement (§5).

### 2.2 The entity model and registries

- **Entities** are Python objects subclassing per-domain bases (`SensorEntity`, `LightEntity`, `ClimateEntity`, … ~50 domains).
The base classes define the *contract* (supported features bitmasks, device classes, state classes, units) and core handles state-machine writes, so an integration mostly fills in properties.
- **Entity registry / device registry / area registry**, plus **floors, labels, and categories** (added 2024.4): persistent metadata that survives restarts and lets users rename/organize without integrations caring. Devices group entities and carry identifiers (so multiple integrations can attach to one physical device), manufacturer/model/sw version, and `via_device` topology.
- **Unique IDs** are the load-bearing concept: an entity with a stable `unique_id` gets registry persistence, user customization, and survives re-setup. Lack of unique IDs is a classic legacy-integration defect and a Bronze-tier rule.

### 2.3 Config entries and flows

Modern integrations are configured through **config entries**, which are persisted setup records created by **config flows** (UI wizards defined in the integration's `config_flow.py`). The flow framework also provides:

- **Discovery-initiated flows** (zeroconf/mDNS, SSDP, DHCP-sniffing, Bluetooth advertisements, USB enumeration, MQTT discovery, `hassio` add-on discovery), declared in the manifest so core can wake the right integration when hardware appears.
- **Reauth flows** (cloud token expired → actionable repair issue → guided re-login instead of a silently dead integration) and **reconfigure flows** (change host/IP without delete-and-re-add, generalized in 2024).
- **Options flows** for post-setup settings, and **subentries** (2025-era) for per-device/per-resource children of one entry.

YAML configuration survives for the automation/script/template layer and a shrinking set of infrastructure integrations; **new device integrations must be config-entry based** (ADR-0010). The decade-long YAML→UI migration is essentially complete policy-wise, but only *partially* complete code-wise, which is another quality-variance source in the long tail.

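
The interplay between unique IDs (§2.2) and config flows can be illustrated with a small model. The sketch below is a stdlib-only toy and not Home Assistant's actual API: `ConfigEntry`, `EntryStore`, `ToyConfigFlow`, the `serial` field, and the result dictionaries are all hypothetical names chosen for illustration. It shows a user-initiated step that first returns a form and then creates an entry, and a discovery-initiated step that aborts (while refreshing the stored host) when the same device is already configured.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConfigEntry:
    """Hypothetical stand-in for a persisted setup record."""

    domain: str
    unique_id: str
    data: dict


@dataclass
class EntryStore:
    """Hypothetical stand-in for the storage core would provide."""

    entries: dict[tuple[str, str], ConfigEntry] = field(default_factory=dict)

    def get(self, domain: str, unique_id: str) -> ConfigEntry | None:
        return self.entries.get((domain, unique_id))

    def add(self, entry: ConfigEntry) -> None:
        self.entries[(entry.domain, entry.unique_id)] = entry


class ToyConfigFlow:
    """Toy flow: each step returns a dict describing what the UI should do next."""

    def __init__(self, store: EntryStore, domain: str) -> None:
        self.store = store
        self.domain = domain

    def step_user(self, user_input: dict | None = None) -> dict:
        # First call has no input yet, so ask the UI to render a form.
        if user_input is None:
            return {"type": "form", "step_id": "user", "fields": ["host", "serial"]}
        return self._create_or_abort(user_input["serial"], user_input["host"])

    def step_discovery(self, discovered: dict) -> dict:
        # Discovery supplies the same data a user would have typed in.
        return self._create_or_abort(discovered["serial"], discovered["host"])

    def _create_or_abort(self, unique_id: str, host: str) -> dict:
        existing = self.store.get(self.domain, unique_id)
        if existing is not None:
            # Same device seen again: keep one entry, refresh its address.
            existing.data["host"] = host
            return {"type": "abort", "reason": "already_configured"}
        self.store.add(ConfigEntry(self.domain, unique_id, {"host": host}))
        return {"type": "create_entry", "unique_id": unique_id}


if __name__ == "__main__":
    store = EntryStore()
    flow = ToyConfigFlow(store, "toy_light")
    print(flow.step_user())
    print(flow.step_user({"host": "10.0.0.5", "serial": "ABC123"}))
    print(flow.step_discovery({"host": "10.0.0.9", "serial": "ABC123"}))
```

Keying the store on `(domain, unique_id)` is the design point: without a stable identifier, the second discovery would create a duplicate entry instead of updating the existing one.
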
### 2.4 Update logic: `DataUpdateCoordinator`

The blessed pattern for polling integrations: one coordinator fetches per device/account on an interval (or via push callbacks), entities subscribe, and core gets centralized error handling (`UpdateFailed` → entities marked unavailable → automatic recovery on next success), debounced refreshes, and request parallelism control. Half of the "integration dies until restart" bug class in report 04 traces to integrations that predate or sidestep this pattern and hand-roll update loops with broken error recovery.

### 2.5 Automation engine

Triggers (state, time, event, device, template, …) → conditions → actions, executed by a script engine with run modes (single/restart/queued/parallel), `wait`/`repeat`/`choose` control flow, **traces** (step-by-step execution recording for debugging), and **blueprints** (parameterized shareable automations). Jinja2 templating is pervasive. This subsystem is core-team-owned, actively developed (the roadmap's "automation usability" track), and **not** a good outsider target.

### 2.6 Recorder, statistics, energy

The **recorder** persists events/states to SQLAlchemy-backed storage (SQLite default; MariaDB/PostgreSQL supported) with a heavily optimized schema (state attributes deduplicated/compressed; major schema overhauls landed 2022–2023 cutting DB size several-fold). **Long-term statistics** (5-minute/hourly aggregates kept forever) power history charts and the **energy dashboard**. The contributor-relevant consequence: sensors must declare correct `device_class`, `state_class`, and units or statistics silently misbehave. This is one of the most common *user-visible* defects in the long tail (report 04 theme "entity correctness"), and exactly what quality-scale review catches.

### 2.7 Auth, API surface, frontend

Native auth (users, refresh/access tokens, MFA), a WebSocket API (primary frontend transport), REST API, and server-sent events.
The frontend is a separate TypeScript/Lit codebase consumed by core as a built Python package (`home-assistant-frontend`); dashboards are user-configurable ("Lovelace"), with a new sections/drag-drop layout system rolled out across 2024–2025. Frontend contribution is a distinct skill set and review pipeline, which is relevant to the feature-gap path's cost in report 06.

## 3. The integration layer: where the variance is

`homeassistant/components/` contains ~2,800 directories, each one integration ("domain"), each with a `manifest.json` declaring: domain, name, dependencies (other integrations), `requirements` (PyPI packages, **exact-pinned**), discovery hooks, `iot_class` (`local_push`, `local_polling`, `cloud_push`, `cloud_polling`, `assumed_state`, `calculated`), `codeowners`, `integration_type`, and, since October 2024, `quality_scale`.

Key structural properties of this layer:

1. **Monorepo with per-integration ownership.** All bundled integrations live in core and ride core's CI, but each has (optionally) listed **code owners**, community maintainers who are auto-requested on PRs/issues. A large fraction of integrations have no code owner or an inactive one. Core team members review everything that merges, but they don't *drive* per-integration work.
2. **The third-party-library rule (ADR-0011-era policy):** integrations may not implement protocol logic inline; device/API communication must live in a published PyPI library. Consequence: an integration's ceiling is often set by a library *outside* the contributor's direct control, the single biggest effort risk for Platinum uplift (report 06 §4).
3. **Exact-pinned requirements** (uniqueness enforced repo-wide): upstream API drift requires a library release *plus* a core bump PR. "Pinned lib is broken against the vendor's current API" is a top-five issue theme in report 04.
4. **Custom components** (`custom_components/`, distributed via HACS) form a parallel ecosystem outside core quality control.
Migration of popular custom components into core is a recurring, partially-realized ambition; out of scope here but noted in report 05.

## 4. Quality machinery already in place

This is the scaffolding any contribution program inherits for free, and it is excellent:

- **`hassfest`**: static manifest/translations/services validation, runs in CI and as a pre-commit-style check; also validates `quality_scale.yaml` rule files against the declared tier.
- **Test harness**: `pytest-homeassistant-custom-component`-style fixtures live in core itself (`tests/common.py`, `MockConfigEntry`, time-travel helpers, snapshot testing via `syrupy`); per-integration coverage is tracked and **full coverage of `config_flow.py` is mandatory for new integrations**; the `.coveragerc`/coverage roster marks legacy integrations exempted from coverage gates, a literal machine-readable list of the debt.
- **Strict typing roster** (`.strict-typing`): integrations opt in to full mypy strict mode file-by-file; Platinum requires it.
- **Repairs platform**: integrations raise actionable, user-visible issues ("your token expired, click to fix") instead of log spam.
- **Diagnostics platform**: one click exports redacted integration state for bug reports. This is a Gold-tier rule, and its absence measurably degrades issue-triage quality in the data of report 04.
- **Release train**: monthly minor releases (`2025.x`), beta week, patch releases; deprecation policy of (typically) six months for breaking changes, announced in release notes.

## 5. Architectural consequences for this report

Five conclusions that the rest of the report builds on:

1. **The kernel is not the opportunity.** It is actively stewarded by the paid team, architecturally conservative (ADR-gated), and high-friction for outsiders. Marginal outsider hours there compete with the people best positioned in the world to do that work.
2. **The integration layer is the opportunity.** It is huge, popularity-concentrated (report 04: install base is extremely top-heavy), quality-variable, and the project has *just finished building* a rules-and-tooling framework (the Quality Scale) whose explicit purpose is to let contributors raise integration quality in a reviewable, checklisted way.
3. **Effort is predictable at Bronze→Silver, unpredictable at →Platinum.** Silver-tier rules (reauth flow, unavailability handling, coverage, parallel updates) are local to the integration. Platinum rules (fully async dependency, strict typing through the dependency's type surface) depend on third-party libraries, sometimes unmaintained ones, making "polish the top 100 to Platinum" an unbounded-cost program (scored in report 06).
4. **Bug themes map onto tier rules.** The dominant open-issue themes are the precise failure modes Silver/Gold rules exist to prevent, so uplift work *subsumes* the highest-value bug fixing rather than competing with it.
5. **Feature-gap work pays a structural toll.** Anything user-facing and cross-cutting needs an architecture discussion, likely frontend work in a separate repo/skill-set, and scarce core-team review bandwidth, all of which are friction multipliers an integration-layer program avoids.

These are assessed quantitatively in reports 03–06.