# Report 03 — Integration Quality-Scale Distribution

> Part of Milestone 1: Ecosystem Analysis & Impact Assessment.
> Data snapshot: 2025-Q2 (see Report 01 for collection window and provenance).
> Reproducible via `tools/fetch_quality_scale.py` and `tools/build_report_tables.py`.
> Underlying data: `data/quality_scale_distribution.csv`, `data/top_integrations.csv`.

---

## 3.1 The quality scale, before and after the 2024 revamp

Home Assistant has carried an integration quality scale for years, but until late 2024 it was a loosely-enforced label (`quality_scale` in `manifest.json`) with prose-level tier definitions. The **Integration Quality Scale (IQS) revamp** (announced alongside the project roadmap work in 2024, documented at `developers.home-assistant.io/docs/core/integration-quality-scale/`) changed it into a **rules-based system**:

- Each tier — **Bronze, Silver, Gold, Platinum** — is defined by an explicit checklist of rules (e.g. `config-flow`, `entity-unique-id`, `reauthentication-flow`, `diagnostics`, `strict-typing`).
- Each integration that opts in carries a `quality_scale.yaml` in its source directory recording per-rule status: `done`, `todo`, or `exempt` (with justification).
- `hassfest` validates the claimed tier against the rule statuses in CI.
- Integrations that have not yet been (re-)assessed under the new rules carry the **`legacy`** designation — they are not necessarily low quality, but they carry no verified guarantees.
- Two non-tier designations exist: **`internal`** (core plumbing such as `sun`, `mobile_app`, `bluetooth`) and **`virtual`** (brand aliases that point at another integration, e.g. dozens of Tuya-brand stubs).

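
The relationship between per-rule statuses and a claimed tier can be sketched as follows. This is a minimal illustration, not `hassfest`'s actual implementation: the `TIER_RULES` mapping is a deliberately tiny subset of rule names mentioned in this report, the function names are hypothetical, and the sketch assumes that tiers are cumulative and that both `done` and `exempt` count as satisfying a rule.

```python
# Illustrative subset only: the real checklists contain many more rules per tier.
TIER_RULES = {
    "bronze": ["config-flow", "entity-unique-id"],
    "silver": ["reauthentication-flow"],
    "gold": ["diagnostics"],
    "platinum": ["strict-typing"],
}
TIER_ORDER = ["bronze", "silver", "gold", "platinum"]

# Assumption: a rule is satisfied when it is done or has a justified exemption.
SATISFIED = {"done", "exempt"}


def highest_supported_tier(rule_status):
    """Return the highest tier whose rules (and all lower tiers' rules) are satisfied.

    `rule_status` maps rule name -> "done" | "todo" | "exempt".
    Rules missing from the mapping are treated as "todo".
    """
    reached = None
    for tier in TIER_ORDER:
        if all(rule_status.get(rule, "todo") in SATISFIED for rule in TIER_RULES[tier]):
            reached = tier
        else:
            break  # tiers are assumed cumulative, so stop at the first gap
    return reached


def claim_is_valid(claimed_tier, rule_status):
    """Check that a claimed tier does not exceed what the rule statuses support."""
    reached = highest_supported_tier(rule_status)
    if reached is None:
        return False
    return TIER_ORDER.index(claimed_tier) <= TIER_ORDER.index(reached)


example_status = {
    "config-flow": "done",
    "entity-unique-id": "done",
    "reauthentication-flow": "exempt",
    "diagnostics": "todo",
}
print(highest_supported_tier(example_status))  # silver
print(claim_is_valid("gold", example_status))  # False
```

In this example the integration could claim Silver but not Gold, because `diagnostics` is still `todo`.
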

The practical meaning of each tier, condensed from the rule lists:

| Tier | What a user actually gets | Representative rules |
|---|---|---|
| **Bronze** | UI setup works and is tested; entities are stable across restarts | `config-flow`, full config-flow test coverage, `entity-unique-id`, `runtime-data`, `has-entity-name`, docs for setup |
| **Silver** | Survives bad days: auth expiry, devices going offline, clean reloads | `reauthentication-flow`, `entity-unavailable`, `log-when-unavailable`, `config-entry-unloading`, `parallel-updates`, `action-exceptions`, named code owner |
| **Gold** | Feels first-class: devices, translations, diagnostics, self-healing | `devices`, `diagnostics`, `entity-translations`, `icon-translations`, `entity-device-class`, `repair-issues`, `dynamic-devices`, `discovery`, `reconfiguration-flow` |
| **Platinum** | Technically exemplary | `strict-typing`, `async-dependency`, `inject-websession` (shared HTTP session) |

**Why this matters for contribution planning:** the rules give an external contributor an *objective, reviewable work plan per integration*. Tier uplift PRs are unusually mergeable because the acceptance criteria are written down by the core team itself, and `hassfest` mechanically verifies most of them.

---

## 3.2 Distribution across all core integrations

From the snapshot of `homeassistant/components/*/manifest.json` plus `quality_scale.yaml` files (script: `tools/fetch_quality_scale.py`):

| Designation | Count | Share of all 2,879 |
|---|---:|---:|
| Virtual (brand alias) | 287 | 10.0% |
| Internal | 58 | 2.0% |
| **Legacy (unassessed)** | **2,101** | **73.0%** |
| Bronze | 214 | 7.4% |
| Silver | 119 | 4.1% |
| Gold | 73 | 2.5% |
| Platinum | 27 | 0.9% |
| **Total** | **2,879** | 100% |

Headline: **only 433 of 2,534 scoreable integrations (17.1%) have any verified tier at all**, and only 100 (3.9%) are Gold or Platinum.
Eighteen months into the IQS revamp, the long tail has barely been touched — which is expected, since tier assessment is opt-in and driven almost entirely by individual code owners.

### Velocity check

Comparing manifests across the snapshot window's bounding releases, roughly **14–18 integrations per month** gain or raise a tier, heavily skewed toward new integrations (which must enter at Bronze since 2025) rather than uplift of existing popular ones. At that organic rate, the existing top-150 would take **years** to reach Silver coverage without directed effort.

---

## 3.3 Distribution within the top 150 by install base

This is the slice users actually live in. Joining the analytics install-base ranking (Report 01, `data/top_integrations.csv`) with tier data:

| Designation | Count in top 150 | Share |
|---|---:|---:|
| Internal | 3 | 2.0% |
| **Legacy** | **62** | **41.3%** |
| Bronze | 31 | 20.7% |
| Silver | 27 | 18.0% |
| Gold | 19 | 12.7% |
| Platinum | 8 | 5.3% |

And the very top of the list (25 most-installed, excluding the 2 internal):

| Tier | Count of top 25 | Examples |
|---|---:|---|
| Platinum | 2 | `esphome`, `wled` |
| Gold | 4 | `shelly`, `hue`, `zwave_js`, `tplink` |
| Silver | 4 | `mqtt`, `zha`, `unifi`, `fritz` |
| Bronze | 7 | `tuya`, `sonos`, `samsungtv`, `homekit_controller`, … |
| Legacy | 6 | `cast`, `upnp`, `dlna_dmr`, `google_translate`, `homekit`, `ipp` |

Two findings worth underlining:

1. **The popular tier gap is real but tractable.** 93 of the top 150 (Legacy + Bronze) lack the Silver guarantees — reauth flows, unavailability handling, clean unload — that most directly map to "my integration broke and I don't know why" forum threads. That is a *bounded* population.
2. **Legacy ≠ obscure.** Six of the 25 most-installed integrations are unassessed, including `cast` (≈176k reporting installs) and `homekit` (≈94k). Several of these are old, structurally sound code that mostly needs *assessment plus targeted gap-filling*, not rewrites.

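
The headline figures in 3.2 and 3.3 follow directly from the table counts. The sketch below re-derives them as a sanity check; the counts are copied from the two tables, while the dictionary layout and helper names are illustrative and not taken from `tools/build_report_tables.py`.

```python
# Counts copied from the tables in sections 3.2 and 3.3.
ALL_CORE = {
    "virtual": 287, "internal": 58, "legacy": 2101,
    "bronze": 214, "silver": 119, "gold": 73, "platinum": 27,
}
TOP_150 = {
    "internal": 3, "legacy": 62,
    "bronze": 31, "silver": 27, "gold": 19, "platinum": 8,
}

VERIFIED_TIERS = ("bronze", "silver", "gold", "platinum")
NON_SCOREABLE = ("virtual", "internal")  # designations outside the tier ladder


def pct(part, whole):
    """Percentage rounded to one decimal place, as in the report tables."""
    return round(100 * part / whole, 1)


def scoreable(counts):
    """Integrations that could carry a tier: everything except virtual/internal."""
    return sum(n for name, n in counts.items() if name not in NON_SCOREABLE)


def verified(counts):
    """Integrations holding any verified tier (Bronze or higher)."""
    return sum(counts.get(tier, 0) for tier in VERIFIED_TIERS)


total = sum(ALL_CORE.values())                        # 2879
gold_plus = ALL_CORE["gold"] + ALL_CORE["platinum"]   # 100
below_silver = TOP_150["legacy"] + TOP_150["bronze"]  # 93

print(total, scoreable(ALL_CORE), verified(ALL_CORE))  # 2879 2534 433
print(pct(verified(ALL_CORE), scoreable(ALL_CORE)))    # 17.1
print(pct(gold_plus, scoreable(ALL_CORE)))             # 3.9
print(below_silver, pct(TOP_150["legacy"], sum(TOP_150.values())))  # 93 41.3
```

Note that the 3.2 headline uses the scoreable population (2,534) as its denominator, whereas the top-150 shares use all 150 entries, including the three internal ones.
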

---

## 3.4 Rule-level gap analysis (what actually blocks promotion)

To estimate where the effort lies, we audited a stratified sample of **40 integrations** from the top-150's Legacy/Bronze/Silver population against the rule checklists (manual review of source + `quality_scale.yaml` where present), then extrapolated to the 120 top-150 integrations below Gold. Estimated counts of integrations blocked by each rule family:

| Rule (family) | Tier it blocks | Est. # affected (of 120) | Typical effort to fix |
|---|---|---:|---|
| `repair-issues` not used | Gold | ~112 | 0.5–2 days |
| `reconfiguration-flow` missing | Gold | ~98 | 0.5–1 day |
| `strict-typing` failing | Platinum | ~94 | 1–10 days (size-dependent) |
| `entity-translations` / `icon-translations` incomplete | Gold | ~81 | 0.5–2 days |
| `diagnostics` missing | Gold | ~74 | 0.5–1 day |
| `parallel-updates` undeclared | Silver | ~69 | < 0.5 day |
| `reauthentication-flow` missing (cloud/auth integrations) | Silver | ~41 | 1–3 days |
| `entity-unavailable` / `log-when-unavailable` incorrect | Silver | ~57 | 0.5–2 days |
| `config-entry-unloading` broken or untested | Silver | ~33 | 0.5–2 days |
| Config-flow test coverage below 100% | Bronze | ~46 | 1–3 days |
| No config flow at all (YAML-only setup) | Bronze | ~9 | 3–15 days |

Reading: the **Silver-blocking rules are cheap** (mostly < 2 days each, often mechanical), while **Gold is dominated by translation/diagnostics/repairs plumbing** that follows well-established patterns, and **Platinum is gated almost entirely by `strict-typing`**, whose cost scales with integration size (`tuya` and `zha` would be multi-week; `pi_hole` would be a day).

---

## 3.5 Effort model for tier uplift

Combining the rule audit with observed PR history for recent uplift work (e.g.
the Gold pushes on `enphase_envoy`, `reolink`, `lamarzocco`), median effort per integration per step, including tests and review cycles:

| Transition | Median effort | P90 effort | Notes |
|---|---:|---:|---|
| Legacy → Bronze (has config flow) | 2 days | 5 days | Mostly test coverage + `quality_scale.yaml` assessment |
| Legacy → Bronze (YAML-only) | 8 days | 15+ days | Config-flow authoring; needs code-owner buy-in |
| Bronze → Silver | 2 days | 4 days | Reauth flow is the only commonly expensive item |
| Silver → Gold | 5 days | 9 days | Translations, diagnostics, repairs, reconfigure |
| Gold → Platinum | 4 days | 12+ days | `strict-typing`; sometimes requires dependency-library typing work upstream |

**Implication for path scoring (Report 06):** "polish the top 100 to Platinum" is dominated by Platinum's typing tail and by a handful of giant integrations; "lift the top ~100 non-internal integrations to **Silver**, and the best 30–40 of those to **Gold**" delivers most of the user-visible reliability benefit at roughly **one quarter of the effort**, and avoids the highest-friction reviews.

---

## 3.6 Findings

- **F3.1** — 73% of all core integrations and 41% of the top 150 are unassessed (`legacy`); verified quality is the exception, not the rule.
- **F3.2** — The gap between "what users run" and "what is verified" is concentrated in a bounded set: **93 top-150 integrations below Silver**.
- **F3.3** — Silver-tier rules are the highest reliability-per-hour work in the entire codebase: small, mechanical, objectively checkable, and aligned with the most common user-reported failure modes (see Report 04 §4.4).
- **F3.4** — Platinum-for-its-own-sake is poor value on large integrations; `strict-typing` on `tuya`/`zha`-class codebases costs weeks and fixes few user-facing problems.
- **F3.5** — Because every rule has a documented definition and `hassfest` enforcement, uplift PRs have an unusually high merge probability for an outside contributor — the acceptance criteria are pre-agreed.
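
As a rough illustration of how the effort model in 3.5 can be applied to a single integration, the sketch below chains the per-step median figures from that table. It assumes the steps are simply additive, which is a simplification (medians of individual steps do not strictly sum to the median of a multi-step path), and the function name is hypothetical.

```python
# Median effort in days per transition, copied from the table in section 3.5.
MEDIAN_DAYS = {
    ("legacy", "bronze"): 2,   # integration already has a config flow
    ("bronze", "silver"): 2,
    ("silver", "gold"): 5,
    ("gold", "platinum"): 4,
}
YAML_ONLY_LEGACY_TO_BRONZE = 8  # Legacy -> Bronze when a config flow must be written
LADDER = ["legacy", "bronze", "silver", "gold", "platinum"]


def median_uplift_days(current, target, yaml_only=False):
    """Sum per-step median effort from `current` to `target` (assumes additivity)."""
    start, end = LADDER.index(current), LADDER.index(target)
    if end < start:
        raise ValueError("target tier is below the current tier")
    total = 0
    # Walk consecutive rungs of the ladder, e.g. legacy->bronze, bronze->silver.
    for lower, upper in zip(LADDER[start:end], LADDER[start + 1:end + 1]):
        if lower == "legacy" and yaml_only:
            total += YAML_ONLY_LEGACY_TO_BRONZE
        else:
            total += MEDIAN_DAYS[(lower, upper)]
    return total


print(median_uplift_days("legacy", "silver"))                  # 4
print(median_uplift_days("legacy", "silver", yaml_only=True))  # 10
print(median_uplift_days("silver", "platinum"))                # 9
```

Under these assumptions a Legacy integration with an existing config flow reaches Silver in about 4 median days, while taking the same integration on to Platinum adds a further 9, which is consistent with the report's point that the upper tiers account for most of the effort.
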