# Report 04: Open-Issue Triage Across the Top ~150 Integrations

> Data snapshot: 2025-Q2 (see Report 01). Reproducible via `tools/fetch_issues.py`
> (GitHub Search API over `home-assistant/core`, `integration:` labels).
> Underlying data: `data/issue_triage.csv` (one row per top-150 domain; the tag
> columns are **non-exclusive**: an issue can be both a regression and stale).

---

## 4.1 Corpus

At the snapshot, `home-assistant/core` carried **≈2,914 open issues**. Mapping issues to integrations via the bot-applied `integration:` labels:

| Slice | Open issues | Share |
|---|---:|---:|
| All open issues in `home-assistant/core` | 2,914 | 100% |
| Mapped to the top-150 install-base domains | 1,853 | 63.6% |
| Mapped to other integrations (the long tail) | 781 | 26.8% |
| Core subsystems / unlabeled / multi-label | 280 | 9.6% |

The top-150 slice is where triage effort pays: it covers under 6% of integration domains but **nearly two thirds of all open issues**, and each fix lands on tens of thousands of installations.
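The shares in the slice table follow directly from the counts. A minimal sketch that recomputes them from the figures above; the dictionary keys are illustrative labels, not column names from `data/issue_triage.csv`:

```python
# Slice counts copied from the table above. The keys are illustrative
# labels only, not actual columns of data/issue_triage.csv.
slices = {
    "top_150": 1853,
    "long_tail": 781,
    "core_or_unlabeled": 280,
}

# The three slices partition the corpus, so they should sum to the
# 2,914 open issues reported at the snapshot.
total = sum(slices.values())

# Percentage share of each slice, rounded to one decimal as in the table.
shares = {name: round(100 * count / total, 1) for name, count in slices.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
```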
Concentration within the slice is itself extreme. The **top 25 domains by issue count carry 932 issues (50.3% of the top-150 total)**:

| # | Domain | Open | # | Domain | Open |
|---:|---|---:|---:|---|---:|
| 1 | tuya | 96 | 14 | xiaomi_miio | 27 |
| 2 | zha | 89 | 15 | mobile_app | 26 |
| 3 | mqtt | 74 | 16 | onvif | 26 |
| 4 | zwave_js | 63 | 17 | netatmo | 24 |
| 5 | homekit_controller | 57 | 18 | plex | 23 |
| 6 | bluetooth | 47 | 19 | androidtv | 22 |
| 7 | shelly | 41 | 20 | webostv | 21 |
| 8 | samsungtv | 38 | 21 | ring | 20 |
| 9 | tplink | 36 | 22 | hue | 19 |
| 10 | sonos | 35 | 23 | spotify | 19 |
| 11 | unifi | 33 | 24 | switchbot | 19 |
| 12 | cast | 31 | 25 | esphome | 18 |
| 13 | fritz | 28 | | | |

---

## 4.2 Triage taxonomy

Each sampled issue was assigned one *primary* category (sampling protocol in Report 01: full reads of the top-40 domains' issues; 30%-sample for ranks 41–150, extrapolated):

| Category | Definition | Share of top-150 issues |
|---|---|---:|
| `setup_auth` | Config entry fails to set up; auth/reauth loops; token expiry | 19% |
| `state_accuracy` | Entity exists but reports wrong/missing/stale values | 17% |
| `regression` | Worked in release N−1, broke in N (core or bumped dependency) | 14% |
| `connectivity` | Disconnect/reconnect storms, "unavailable" flapping | 12% |
| `firmware_drift` | Device firmware update changed behavior the integration assumes | 9% |
| `feature_request` | Enhancement filed as an issue | 9% |
| `cloud_api` | Vendor cloud API changed, throttled, or partially shut down | 8% |
| `needs_upstream` | Root cause in the dependency library, not HA code | 4% |
| `performance` | CPU/memory/recorder load attributable to the integration | 4% |
| `docs_ux` | User confusion resolvable by docs or better error messages | 4% |

Roughly **79% of the corpus is genuine defect work** (everything except `feature_request`, `docs_ux`, and about half of `needs_upstream`, which is fixable via a library PR anyway).

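To relate the taxonomy shares to absolute numbers, the percentages can be scaled by the top-150 total from §4.1. The sketch below is only an approximation: the shares are rounded to whole percents (so each count can be off by roughly ±9 issues) and, for ranks 41–150, extrapolated from a sample. The variable names are illustrative.

```python
TOP_150_ISSUES = 1853  # size of the top-150 slice, from §4.1

# Primary-category shares in percent, copied from the taxonomy table.
category_share_pct = {
    "setup_auth": 19,
    "state_accuracy": 17,
    "regression": 14,
    "connectivity": 12,
    "firmware_drift": 9,
    "feature_request": 9,
    "cloud_api": 8,
    "needs_upstream": 4,
    "performance": 4,
    "docs_ux": 4,
}

# Primary categories are exclusive, so the shares should cover the
# whole slice exactly once.
assert sum(category_share_pct.values()) == 100

# Approximate issue count per category (rounded to whole issues).
approx_counts = {
    category: round(TOP_150_ISSUES * pct / 100)
    for category, pct in category_share_pct.items()
}

for category, count in approx_counts.items():
    print(f"{category:16s} ~{count}")
```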

---

## 4.3 Structural metrics

| Metric | Value | Notes |
|---|---:|---|
| Stale (no activity > 180 days) | ≈660 (36%) | Many auto-closed later by stale-bot without resolution |
| Flagged regressions open at snapshot | ≈270 (15%) | Median age 94 days; regressions are *not* being fast-tracked |
| Issues with a reproduction another contributor confirmed | ≈41% | The single best predictor of eventual fix |
| Median time-to-first-maintainer-response (top-150) | 11 days | Bimodal: < 48h for ~50 well-staffed domains, weeks elsewhere |
| Domains in top-150 with zero active code owner | 5 explicit (+ ~17 with inactive owners) | `google_translate`, `yamaha`, `amcrest`, `envisalink`, `keba` are ownerless |

The bimodal response pattern matters: domains like `shelly`, `zwave_js`, `enphase_envoy`, `reolink` have engaged owners and short queues *relative to install base*; the pain concentrates in **owned-on-paper-but-dormant** domains (`xiaomi_miio`, `androidtv`, `broadlink`, `harmony`, `wemo`, `homematic`, much of the legacy media-player cluster).

---

## 4.4 The five dominant clusters

**C1: Cloud-API churn (≈230 issues).** `tuya`, `netatmo`, `spotify`, `ring`, `smartthings`, `honeywell`, `ecobee`, `ezviz`, `growatt_server`, automotive domains. Root causes are vendor-side; fixes need API-version tracking, better reauth, and graceful degradation. Recurrence is high: fixing an instance doesn't fix the class, but Silver-rule reauth/unavailability work converts hard failures ("integration dead, restart HA") into self-healing ones.

**C2: Protocol-stack flakiness (≈300 issues).** `zha`, `bluetooth` (incl. ESPHome BLE proxies), `homekit_controller`, `zwave_js`. Deep, hardware-shaped, expert-bound. Several already have dedicated upstream maintainers (zigpy, aioesphomeapi, zwave-js). **Low marginal value for a generalist contributor**; misdirected PRs here consume scarce expert review time.


**C3: Firmware/behavior drift on local devices (≈170 issues).** `samsungtv`, `webostv`, `onvif`, `tplink`, `shelly` (Gen drift), `reolink`, `solax`, `sma`. Mid-difficulty, very fixable *when the reporter can run diagnostics*, which is exactly what the Gold `diagnostics` rule provides. Direct synergy with uplift.

**C4: State-accuracy and unit/device-class bugs (≈310 issues).** Spread thin across nearly all 150 domains: wrong device classes, missing `suggested_unit`, stale coordinator data after reconnect, broken `available` logic. **This is the single largest fixable mass**: localized, testable without hardware in ~60% of cases, and frequently a < 50-line fix.

**C5: Setup/auth failures in legacy config flows (≈350 issues).** Highest correlation with tier: Legacy/Bronze domains average **2.6× more `setup_auth` issues per 10k installs** than Silver+ domains. The Silver checklist (`reauthentication-flow`, `config-entry-unloading`, proper setup-retry) is effectively the cure for this cluster.

---

## 4.5 Fixability analysis ("low-hanging" inventory)

An issue was tagged **low-hanging** if (a) it is reproducible without proprietary hardware *or* a confirmed reproduction exists, and (b) the plausible fix is localized (< ~100 LOC, single domain, no architecture change).

| Population | Count | Share |
|---|---:|---:|
| Low-hanging, in top-150 | ≈390 | 21% of top-150 issues |
| …of which in Legacy/Bronze domains | ≈265 | 68% of the low-hanging set |
| …median estimated fix effort | 0.5–1.5 days each | incl. tests + review |

A focused contributor clearing 3–4 of these per week would retire **~150–200 real defects per year**, with each fix landing on a median ~20k installations. But note §4.6: without structural improvement, the inflow refills the pool.

---

## 4.6 Inflow vs. outflow

Over the snapshot's trailing 90 days, top-150 domains averaged **≈310 newly opened vs. ≈285 closed issues per month** (closures inflated by stale-bot).
Pure bug-fixing is therefore *treadmill work*: valuable, but it doesn't bend the curve. The clusters that **generate** inflow (C1, C5, parts of C3) map almost one-to-one onto Silver/Gold quality-scale rules. This is the central empirical argument, carried into Report 06, that **uplift and bug-fixing are not competing paths but a sequence**: uplift reduces inflow, and the uplift process itself surfaces and fixes the low-hanging defects in passing.

---

## 4.7 Findings

- **F4.1.** Issue mass is concentrated: 25 domains hold half the top-150 load.
- **F4.2.** ~21% of the top-150 corpus (≈390 issues) is realistically fixable by an outside contributor at < 1.5 days each; two-thirds of those sit in Legacy/Bronze domains.
- **F4.3.** Legacy/Bronze domains generate 2.6× the setup/auth failure rate of Silver+ domains per install, the strongest quantitative link between tier and user pain found in this study.
- **F4.4.** The protocol-stack cluster (zha/bluetooth/zwave/homekit) should be explicitly **out of scope** for a generalist effort.
- **F4.5.** Regressions stay open a median 94 days; a contributor who triages and bisects regressions within the beta window would relieve a visible maintainer pain point at modest cost (candidate sub-path for Report 06).
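The treadmill argument of §4.6 can be made concrete with a constant-flow projection. The sketch below is a simplification, not part of the study's tooling: it assumes the monthly rates from §4.6 stay flat, takes the midpoint of the 3–4 fixes per week from §4.5 as the contributor's pace, and uses a purely hypothetical 15% inflow reduction to stand in for uplift (the report does not estimate that number). The function and constant names are illustrative.

```python
def project_backlog(start, opened_per_month, closed_per_month, months):
    """Return the open-issue count at the end of each month under constant flow."""
    backlog = start
    trajectory = []
    for _ in range(months):
        # The backlog cannot go below zero.
        backlog = max(0, backlog + opened_per_month - closed_per_month)
        trajectory.append(backlog)
    return trajectory


START = 1853   # top-150 open issues at the snapshot (§4.1)
OPENED = 310   # newly opened per month (§4.6)
CLOSED = 285   # closed per month, stale-bot closures included (§4.6)

# Assumption: one contributor at 3.5 fixes/week, the midpoint of §4.5.
CONTRIBUTOR_FIXES = 3.5 * 52 / 12

# Assumption: uplift trims inflow by 15%. Hypothetical, for illustration only.
INFLOW_REDUCTION = 0.15

baseline = project_backlog(START, OPENED, CLOSED, 12)
with_contributor = project_backlog(START, OPENED, CLOSED + CONTRIBUTOR_FIXES, 12)
with_uplift = project_backlog(START, OPENED * (1 - INFLOW_REDUCTION), CLOSED, 12)

print(f"baseline after 12 months:         {baseline[-1]:.0f}")
print(f"with one contributor:             {with_contributor[-1]:.0f}")
print(f"with hypothetical inflow cut:     {with_uplift[-1]:.0f}")
```

Under these assumptions the backlog still grows when a single contributor is added (from 1,853 to about 1,971 after a year, versus 2,153 at baseline), because the extra ~15 closures per month do not offset the net inflow of ~25. Only the scenario that lowers inflow shrinks the pool, which is the shape of the argument above; the exact figures depend entirely on the assumed rates.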