fixes for next round of testing

Jacob Dubin
2026-04-15 14:33:43 -05:00
parent 3f0c17e424
commit 874e5a1637
8 changed files with 348 additions and 58 deletions

@@ -66,6 +66,7 @@ The current .NET pass covers only a narrow, explicitly synthetic subset of obser
- token/session tracking across websocket turns
- explicit per-turn state tracking for transID, rules, context, buffered audio, and finalize attempts
- buffered audio accounting and turn-pending state
- auto-finalize triggering for raw audio once `LISTEN`, `CONTEXT`, and minimum buffered-audio thresholds are present
- `LISTEN` message handling with synthetic `LISTEN` result payload shaping
- `CONTEXT` capture for turn/session state
- `CLIENT_NLU` turn completion using remembered listen/session metadata
@@ -81,6 +82,12 @@ This does not yet mean parity for:
- multi-step skill lifecycles beyond the current synthetic playback response
- broader interaction, animation, or ESML command families
Current raw-audio fallback behavior remains explicitly synthetic:
- when a buffered-audio turn can be resolved through the synthetic transcript-hint seam, .NET now auto-finalizes the turn and emits `LISTEN` + `EOS` + `SKILL_ACTION`
- when the turn crosses the finalize threshold without a usable transcript, .NET now emits a fallback `LISTEN` + `EOS` + generic `SKILL_ACTION` rather than leaving the robot hanging on an unfinished turn
- that fallback is a compatibility measure inspired by the Node oracle, not a claim of real ASR understanding
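
The finalize and fallback rules above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the .NET implementation: `TurnState`, `FinalizeResult`, `decide_finalize`, the `transcript_hint` field, and the `MIN_BUFFERED_BYTES` value are all hypothetical names and placeholders chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Placeholder threshold (assumption); the real minimum buffered-audio
# value is not stated in this document.
MIN_BUFFERED_BYTES = 3200


@dataclass
class TurnState:
    """Hypothetical per-turn state, loosely mirroring the tracked fields."""
    trans_id: str
    saw_listen: bool = False
    saw_context: bool = False
    buffered_bytes: int = 0
    finalize_attempts: int = 0
    # Synthetic transcript hint (assumption); not real ASR output.
    transcript_hint: Optional[str] = None


@dataclass
class FinalizeResult:
    messages: List[str]
    generic_skill_action: bool  # True when the fallback path was taken


def decide_finalize(turn: TurnState) -> Optional[FinalizeResult]:
    """Return what to emit for this turn, or None if it stays pending."""
    ready = (
        turn.saw_listen
        and turn.saw_context
        and turn.buffered_bytes >= MIN_BUFFERED_BYTES
    )
    if not ready:
        return None  # keep buffering; the turn is still pending

    turn.finalize_attempts += 1
    messages = ["LISTEN", "EOS", "SKILL_ACTION"]
    if turn.transcript_hint:
        # Resolved through the synthetic transcript-hint seam.
        return FinalizeResult(messages, generic_skill_action=False)
    # No usable transcript: emit the generic fallback instead of hanging.
    return FinalizeResult(messages, generic_skill_action=True)
```

In this sketch both paths emit the same message sequence and differ only in whether the `SKILL_ACTION` is generic, which keeps the fallback visibly distinct from a resolved turn.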
### Internal ASR Direction
The current .NET websocket layer now separates: