ADR 0021: Blind Command Processing Lock and Target Preservation¶
Status¶
Accepted (2026-02-04)
Context¶
Problem¶
Homematic blind/cover devices exhibit a firmware bug: sending new positioning commands while the device is physically in motion causes undefined behavior. Commands may be silently ignored, the blind may stop at an incorrect position, or tilt adjustments may fail to apply.
This creates a critical race condition in Home Assistant, where users commonly issue separate level and tilt commands in rapid succession or change tilt while the blind is still moving to a target level.
Scenario¶
T=0s: User sends set_position(level=50%)
Device starts moving (takes ~5 seconds)
T=2s: User sends set_tilt(tilt=30°) while blind is still moving
Without protection: Command may be ignored or corrupt the level movement
Expected Behavior¶
The blind must reach both targets:
- Level: 50% (from the first command)
- Tilt: 30% (from the second command)
Movement should not produce incorrect final positions, even when commands overlap.
Device Variants¶
The problem applies specifically to blind actors (devices with both level and tilt control). Simple covers (level only), window drives, and garage doors are not affected because they have only a single axis of movement.
| Class | Level | Tilt | Lock Required |
|---|---|---|---|
CustomDpCover | Yes | No | No |
CustomDpWindowDrive | Yes | No | No |
CustomDpBlind | Yes | Yes | Yes |
CustomDpIpBlind | Yes | Yes | Yes (inherited) |
CustomDpGarage | Yes | No | No |
Decision¶
Implement a per-instance asyncio.Lock (_command_processing_lock) in CustomDpBlind that serializes command processing and preserves pending targets when new commands arrive during movement. The mechanism uses a stop-then-resend strategy to work around the device firmware bug.
Core Mechanism¶
The _set_level() method in CustomDpBlind implements a three-phase protocol:
Phase 1 -- Lock Acquisition
Acquire the per-instance lock with a 5-second timeout (_COMMAND_LOCK_TIMEOUT). If the timeout expires, proceed without the lock and log a warning. This prevents deadlocks while accepting a small risk of race conditions in edge cases.
Phase 2 -- Target Resolution
For each axis (level and tilt), determine the target value using a three-tier fallback:
- Explicit value: If the caller provides a value, use it
- Pending target: If a previous command is still unconfirmed (
_target_level/_target_tilt_level), reuse that target and markcurrently_moving = True - Current position: If the device is at standstill, use the confirmed position (
_group_level/_group_tilt_level)
Phase 3 -- Stop and Resend
If currently_moving is detected, stop the device first via _stop(), then send a combined command with both the preserved level and the new tilt (or vice versa). This works around the firmware bug by ensuring the device is stationary before receiving new coordinates.
_set_level(level=None, tilt_level=0.3)
|
├─ level=None → check _target_level
│ └─ _target_level=0.5 (pending!) → _level=0.5, currently_moving=True
│
├─ tilt_level=0.3 (explicit) → _tilt_level=0.3
│
├─ currently_moving=True → _stop()
│
└─ _send_level(level=0.5, tilt_level=0.3) ← combined command
Target Detection¶
The _target_level and _target_tilt_level properties detect pending (unconfirmed) commands using a two-tier approach:
- Optimistic value (preferred): Check if the data point has an optimistic value set (from the optimistic updates system in ADR 0020)
- CommandTracker fallback: Query
unconfirmed_last_value_sendfrom the command tracker when optimistic updates are disabled
A target is cleared when the CCU confirms the value via an event, at which point the optimistic value is resolved and the command tracker entry expires.
Command Transmission¶
Homematic blind devices support two transmission modes, selected automatically:
- Combined parameter (
LEVEL_COMBINED): A single RPC call encoding both level and tilt as a hex-encoded combined value (e.g.,0xa2,0x26). Used by BidCos-RF devices that expose aCOMBINED_PARAMETERdata point. - Separate parameters: Two sequential RPC calls for
LEVEL_SLATS(tilt) followed byLEVEL. Used when no combined parameter is available.
The combined parameter path bypasses the collector to ensure atomic delivery.
Lock Scope¶
The lock protects three operations:
| Operation | Method | Lock Held |
|---|---|---|
| Set level/tilt | _set_level() | Yes |
| Stop | stop() | Yes |
| Internal stop | _stop() | Caller must hold lock |
All public entry points (set_position, open, close, open_tilt, close_tilt, stop) route through either _set_level() or stop(), both of which acquire the lock.
Use Cases¶
Use Case 1: Tilt Change During Level Movement¶
T=0.0s: set_position(position=50)
→ Lock acquired, _send_level(level=0.5, tilt=current)
→ _target_level = 0.5 (unconfirmed)
→ Device starts moving
→ Lock released
T=2.0s: set_position(tilt_position=30) [device still moving]
→ Lock acquired
→ level=None → _target_level=0.5 exists → reuse, currently_moving=True
→ tilt_level=0.3 (explicit)
→ _stop() called (firmware bug workaround)
→ _send_level(level=0.5, tilt_level=0.3) ← both targets
→ Lock released
T=7.0s: Device reaches level=50%, tilt=30%
→ CCU confirms via events
Result: Both targets reached correctly.
Use Case 2: Parallel Level and Tilt Calls¶
T=0.0s: asyncio.gather(
set_position(position=81),
set_position(tilt_position=19),
)
→ Call 1 acquires lock first
→ _send_level(level=0.81, tilt=current)
→ _target_level = 0.81 (unconfirmed)
→ Lock released
→ Call 2 acquires lock
→ level=None → _target_level=0.81 → reuse, currently_moving=True
→ tilt_level=0.19 (explicit)
→ _stop()
→ _send_level(level=0.81, tilt_level=0.19) ← combined
→ Lock released
Result: Combined command sent with both targets. Tested with 10 iterations to detect race conditions.
Use Case 3: Lock Timeout¶
T=0.0s: set_position(position=50)
→ Lock acquired, network delay causes slow RPC
T=0.1s: set_position(tilt_position=30)
→ Waiting for lock...
T=5.1s: Timeout after 5s
→ Warning logged
→ Proceeds WITHOUT lock
→ Commands may race, but CCU-side queuing mitigates
Result: Degraded but functional. The CCU queues commands server-side, limiting the impact.
Implementation¶
File: aiohomematic/model/custom/cover.py
Key Components:
| Component | Location | Purpose |
|---|---|---|
_command_processing_lock | CustomDpCover.__slots__ (line 106) | Per-instance asyncio.Lock |
_COMMAND_LOCK_TIMEOUT | Module constant (line 31) | 5.0 second timeout |
_set_level() | CustomDpBlind (line 468) | Lock-protected target resolution and send |
_stop() | CustomDpBlind (line 520) | Internal stop, must be called with lock held |
stop() | CustomDpBlind (line 412) | Public stop with lock acquisition |
_target_level | CustomDpBlind (line 292) | Pending level detection (optimistic + tracker) |
_target_tilt_level | CustomDpBlind (line 311) | Pending tilt detection (optimistic + tracker) |
_send_level() | CustomDpBlind (line 449) | Combined or separate parameter transmission |
_group_level | CustomDpCover (line 122) | Confirmed level fallback |
_group_tilt_level | CustomDpBlind (line 281) | Confirmed tilt fallback |
Inheritance:
CustomDataPoint
└─ CustomDpCover (level only, lock slot declared, no lock logic)
├─ CustomDpWindowDrive (level only, no lock logic)
└─ CustomDpBlind (level + tilt, lock initialized and used)
└─ CustomDpIpBlind (inherits lock, adds COMBINED_PARAMETER and L=N,L2=N format)
The lock slot is declared in CustomDpCover but only initialized and used in CustomDpBlind._post_init(). This allows the blind subclass to own the lock lifecycle while keeping the slot in the common base class.
Test Coverage:
| Test | File | What It Verifies |
|---|---|---|
test_ceblind_separate_level_and_tilt_change | test_model_cover.py:445 | Parallel set_position calls, 10 iterations |
Consequences¶
Positive¶
- Blind devices reliably reach both level and tilt targets regardless of command timing
- The firmware bug is completely transparent to consumers (Home Assistant)
- The lock timeout prevents deadlocks in degraded network conditions
- Target preservation eliminates the need for callers to track and re-send previous targets
Negative¶
- The 5-second lock timeout introduces a maximum latency for queued commands
- On lock timeout, a brief race condition window exists (mitigated by CCU-side command queuing)
- The stop-then-resend approach adds one extra RPC call when commands overlap during movement
Interaction with Command Throttling (ADR 0020)¶
The command processing lock operates above the throttle layer. Sequence when both are active:
set_position()
→ _set_level() acquires _command_processing_lock
→ _stop() sends STOP via client (throttle may delay)
→ _send_level() sends LEVEL_COMBINED via client (throttle may delay)
→ _command_processing_lock released
Cover commands use HIGH priority by default. The lock timeout (5s) must be larger than the expected throttle delay to avoid spurious timeouts.
Interaction with Optimistic Updates (ADR 0020)¶
The _target_level and _target_tilt_level properties integrate with both the optimistic update system and the legacy command tracker:
- When optimistic updates are enabled:
is_optimisticis checked first, providing immediate target awareness - When disabled:
unconfirmed_last_value_sendfrom CommandTracker serves the same purpose
This hybrid approach ensures the lock mechanism works correctly regardless of the optimistic updates feature flag.
Alternatives Considered¶
No Lock, Rely on CCU Command Queuing¶
Let the CCU handle concurrent commands natively without client-side serialization.
Rejected: The CCU does queue commands, but the firmware bug in blind actors causes incorrect behavior when commands arrive during physical movement. Client-side stop-then-resend is necessary.
Lock Without Target Preservation¶
Serialize commands but always use the current confirmed position as fallback for unspecified axes.
Rejected: This loses the pending target. If a user sends level=50% and then tilt=30% during movement, the second command would use the current (mid-movement) level rather than the intended target of 50%.
Per-Axis Locks¶
Use separate locks for level and tilt to allow independent concurrent changes.
Rejected: The device firmware bug affects the device as a whole, not individual axes. Both axes must be stopped and resent together. A single lock correctly models this constraint.
Longer or No Timeout¶
Increase the lock timeout or remove it entirely.
Rejected: A 5-second timeout balances deadlock prevention against command latency. With network issues, an indefinite lock could block all cover commands permanently. The current timeout allows degraded operation with a logged warning.
References¶
aiohomematic/model/custom/cover.py-- Cover and blind implementation- ADR 0020: Command Throttling with Priority Queue and Optimistic Updates
tests/test_model_cover.py:445-- Race condition test (test_ceblind_separate_level_and_tilt_change)
Created: 2026-02-04 Author: Architecture Review