Add GLSM listener telemetry and stale-listen recovery
@@ -88,6 +88,11 @@ Current websocket scope:
 - active local prompt preservation so `shared/yes_no`, clock, gallery, and settings prompts can still consume transcript-bearing short replies even when the stock skill reports a local context
 - binary audio ignored for an existing transID until a fresh `LISTEN` has been seen, preventing context-only or post-speech tails from reopening an endless buffered turn
 - blank-audio hotphrase turns clear pending listen state and install a short late-audio ignore window
+- first GLSM-aligned listener telemetry and recovery slice is now in source:
+  - derived phase labels (`HJ_LISTENING`, `LISTENING`, `WAIT_LISTEN_FINISHED`, `DISPATCH_DIALOG`, `PROCESS_LISTENER_QUEUE`)
+  - `glsm_phase_transition` turn diagnostics
+  - websocket turn events with `glsmPhase` snapshots
+  - stale pending-listen recovery for long-open no-context/no-audio listens before processing a new hotphrase listen
 - unknown inbound websocket types dropped silently instead of echoing stock-OS-unknown OpenJibo events
 - file telemetry and fixture export for HTTP, websocket, and turn captures
 
@@ -145,6 +150,7 @@ Use these sources as evidence, not as code to copy blindly:
 - User-provided original source snapshot: `..\jibo` when extracted locally
 - Original Pegasus cloud source inside that snapshot: `pegasus`
 - Original SDK and skill source inside that snapshot: `sdk`
+- Legacy listener flow reference diagram: `..\jibo\sdk\packages\skills-service-manager\resources\state-diagrams\glsm.png`
 - JiboOS reference tree: `..\JiboOS`
 - JiboOS skill snapshot: `..\JiboOS\opt\jibo\Jibo\Skills\@be`
 

@@ -301,6 +301,20 @@ Current release theme:
 - Follow-up:
   - live smoke should confirm `cloud version` speaks `1.0.18`, carries `match.skipSurprises = true`, does not stop itself on the word `Jibo`, and settles without a generic `I heard...` reply or a local surprise handoff
 
+### GLSM Listener Flow Capture And Recovery
+
+- Status: `implemented`
+- Tags: `protocol`, `docs`
+- Result:
+  - the legacy listener state machine source (`sdk ... glsm.png`) is now captured in current planning docs
+  - runtime now emits GLSM-aligned phase snapshots (`HJ_LISTENING`, `LISTENING`, `WAIT_LISTEN_FINISHED`, `DISPATCH_DIALOG`, `PROCESS_LISTENER_QUEUE`)
+  - turn diagnostics now include `glsm_phase_transition` for phase changes
+  - websocket telemetry now records `glsmPhase` on binary/context/turn events
+  - stale pending-listen recovery is now in source so a long-open no-context/no-audio listen can be cleared when the next hotphrase listen arrives
+- Follow-up:
+  - live-capture proof is still required against the recurring blue-ring/stuck-listening sequence
+  - deeper GLSM parity (`Interrupt Listeners`, launch/global parse branches) should be tackled after this first capture slice is validated on-device
+
 ### End-Of-Skill Surprise Suppression
 
 - Status: `implemented`

@@ -119,7 +119,7 @@ Reference:
 
 ## Next Queued Task (`2026-05-06`)
 
-Queued next `1.0.19` implementation task:
+Queued next `1.0.19` implementation task (now started):
 
 - dialog parsing expansion and ambiguity guardrails
 
@@ -129,6 +129,12 @@ Execution focus:
 - reduce trigger-only captures that drop the rest of the utterance
 - preserve command-vs-question personality split and local skill payload compatibility
 - add focused tests for new phrase families and ambiguity boundaries
+- keep listener-state observability aligned with the legacy GLSM flow while phrase guardrails are added
 
+First completed guardrail slice under this queue:
+
+- GLSM listener flow capture + telemetry mapping
+- stale pending-listen recovery path for long-open no-context/no-audio listens
+
 ## Next Slices
 

@@ -16,6 +16,7 @@ As-of date: `2026-05-06`
 
 - Legacy system architecture: `C:\Projects\jibo\pegasus\resources\system_diagram.png`
 - Legacy generic skill scaffold: `C:\Projects\jibo\pegasus\packages\template-skill\docs\TemplateSkill.png`
+- Legacy listener state machine: `C:\Projects\jibo\sdk\packages\skills-service-manager\resources\state-diagrams\glsm.png`
 
 ## Template Skill Verdict
 
@@ -45,6 +46,30 @@ Conclusion: do not treat template-skill flow as a port target. Treat it as a sha
 | `Proactivity Catalog` | in-code candidate lists/weights | explicit catalog service with tuned weights and operator controls |
 | `Audio Logs` | file telemetry sinks in infrastructure telemetry | hosted indexed capture/retention for multi-operator analysis |
 
+## GLSM Listener Flow Alignment (`2026-05-06`)
+
+Captured source:
+
+- `C:\Projects\jibo\sdk\packages\skills-service-manager\resources\state-diagrams\glsm.png`
+
+First OpenJibo support slice (implemented):
+
+- explicit derived listener phases are now emitted in cloud diagnostics:
+  - `HJ_LISTENING`
+  - `LISTENING`
+  - `WAIT_LISTEN_FINISHED`
+  - `DISPATCH_DIALOG`
+  - `PROCESS_LISTENER_QUEUE`
+- turn telemetry now records `glsm_phase_transition` with previous/next state and trigger
+- websocket telemetry now includes `glsmPhase` on binary, context, and turn-processed events
+- stale pending-listen recovery is now implemented:
+  - when a pending `LISTEN` stays open long enough with no context/audio, a new hotphrase listen can recover the stuck state before continuing
+
+Current parity boundary:
+
+- this slice focuses on listener lifecycle observability plus stuck-listen recovery
+- deeper explicit parity states from GLSM (`Interrupt Listeners`, `Handle Launch Parse`, `Handle Global Parse`, `Dispatch Dialog` sub-branches) are next candidates once this capture-driven slice is validated live
+
 ## Where We Were
 
 Legacy cloud design was service-oriented around:

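Review note: a minimal sketch of how the `glsm_phase_transition` diagnostic described above could be derived, assuming the previously observed phase is cached in a per-session metadata dictionary under the `glsmPhase` key. `GlsmPhaseTracker`, `PhaseTransition`, and `TryRecordTransition` are hypothetical names used only for illustration; the committed `TrackGlsmPhaseAsync` may work differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record: previous phase, next phase, and the trigger that caused the change.
public sealed record PhaseTransition(string? Previous, string Next, string Trigger);

public static class GlsmPhaseTracker
{
    // Metadata key that caches the last observed phase (mirrors the `glsmPhase` field name).
    private const string PhaseKey = "glsmPhase";

    // Returns a transition when the phase changed; returns null when it is unchanged.
    public static PhaseTransition? TryRecordTransition(
        IDictionary<string, object?> metadata,
        string nextPhase,
        string trigger)
    {
        var previous = metadata.TryGetValue(PhaseKey, out var cached) ? cached as string : null;
        if (string.Equals(previous, nextPhase, StringComparison.Ordinal))
        {
            return null; // No change, so no diagnostic would be emitted.
        }

        metadata[PhaseKey] = nextPhase;
        return new PhaseTransition(previous, nextPhase, trigger);
    }
}
```

Under these assumptions, reporting `HJ_LISTENING` and then `LISTENING` yields one transition each, while repeating `LISTENING` yields `null`, which keeps the diagnostic stream limited to real phase changes.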
@@ -25,7 +25,8 @@ public sealed class JiboWebSocketService(
             var replies = await turnFinalizationService.HandleBinaryAudioAsync(session, envelope, cancellationToken);
             await telemetrySink.RecordTurnEventAsync(envelope, session, "binary_audio_received", new Dictionary<string, object?>
             {
-                ["bytes"] = envelope.Binary?.Length ?? 0
+                ["bytes"] = envelope.Binary?.Length ?? 0,
+                ["glsmPhase"] = WebSocketTurnFinalizationService.ResolveGlsmPhase(session)
             }, cancellationToken);
             return replies;
         }
@@ -33,6 +34,8 @@ public sealed class JiboWebSocketService(
         var parsedType = ReadMessageType(envelope.Text);
         session.LastMessageType = parsedType;
         var containsInlineTurnPayload = parsedType == "LISTEN" && ContainsInlineTurnPayload(envelope.Text);
+        var staleListenRecovered = false;
+        var staleListenAgeMs = 0;
         if (parsedType == "LISTEN" &&
             !containsInlineTurnPayload &&
             WebSocketTurnFinalizationService.ShouldIgnoreLateListenSetup(session, envelope.Text))
@@ -57,6 +60,19 @@ public sealed class JiboWebSocketService(
             return replies;
         }
 
+        if (parsedType == "LISTEN" &&
+            !containsInlineTurnPayload &&
+            WebSocketTurnFinalizationService.TryRecoverStalePendingListen(session, out staleListenAgeMs))
+        {
+            staleListenRecovered = true;
+            await telemetrySink.RecordTurnEventAsync(envelope, session, "glsm_stale_listen_recovered", new Dictionary<string, object?>
+            {
+                ["staleAgeMs"] = staleListenAgeMs,
+                ["transID"] = session.TurnState.TransId,
+                ["glsmPhase"] = WebSocketTurnFinalizationService.ResolveGlsmPhase(session)
+            }, cancellationToken);
+        }
+
         WebSocketTurnFinalizationService.ObserveIncomingMessage(session, envelope.Text);
 
         switch (parsedType)
@@ -66,7 +82,8 @@ public sealed class JiboWebSocketService(
                 var replies = await turnFinalizationService.HandleContextAsync(session, envelope, cancellationToken);
                 await telemetrySink.RecordTurnEventAsync(envelope, session, "context_received", new Dictionary<string, object?>
                 {
-                    ["transID"] = session.TurnState.TransId
+                    ["transID"] = session.TurnState.TransId,
+                    ["glsmPhase"] = WebSocketTurnFinalizationService.ResolveGlsmPhase(session)
                 }, cancellationToken);
                 return replies;
             }
@@ -80,7 +97,10 @@ public sealed class JiboWebSocketService(
                     ["messageType"] = parsedType,
                     ["replyCount"] = replies.Count,
                     ["transcript"] = session.LastTranscript,
-                    ["intent"] = session.LastIntent
+                    ["intent"] = session.LastIntent,
+                    ["glsmPhase"] = WebSocketTurnFinalizationService.ResolveGlsmPhase(session),
+                    ["staleListenRecovered"] = staleListenRecovered,
+                    ["staleListenAgeMs"] = staleListenAgeMs
                 }, cancellationToken);
                 return replies;
             }
@@ -92,7 +112,8 @@ public sealed class JiboWebSocketService(
                     ["messageType"] = parsedType,
                     ["replyCount"] = replies.Count,
                     ["transcript"] = session.LastTranscript,
-                    ["intent"] = session.LastIntent
+                    ["intent"] = session.LastIntent,
+                    ["glsmPhase"] = WebSocketTurnFinalizationService.ResolveGlsmPhase(session)
                 }, cancellationToken);
                 return replies;
             }

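Review note: a self-contained sketch of the stale pending-listen gate that the hunks above reach through `TryRecoverStalePendingListen`. It assumes the 9-second threshold from `StaleListenSetupRecoveryAge` in this commit and injects `now` so the check can be exercised without waiting. `ListenState` and `StaleListenGate` are hypothetical stand-ins for the real `CloudSession.TurnState` plumbing, and resetting `ListenOpenedUtc` here is an assumption about what `ClearListenTracking` does.

```csharp
using System;

// Hypothetical stand-in for the listen-related fields of the real turn state.
public sealed class ListenState
{
    public bool AwaitingTurnCompletion { get; set; }
    public bool SawListen { get; set; }
    public bool SawContext { get; set; }
    public int BufferedAudioBytes { get; set; }
    public DateTimeOffset? ListenOpenedUtc { get; set; }
}

public static class StaleListenGate
{
    // Threshold mirrors the commit's StaleListenSetupRecoveryAge constant.
    private static readonly TimeSpan RecoveryAge = TimeSpan.FromSeconds(9);

    // Clears a listen that has stayed open too long with no context and no audio.
    // `now` is injected for testability (an assumption; the commit reads DateTimeOffset.UtcNow).
    public static bool TryRecover(ListenState state, DateTimeOffset now, out int staleAgeMs)
    {
        staleAgeMs = 0;
        if (!state.AwaitingTurnCompletion ||
            !state.SawListen ||
            state.SawContext ||
            state.BufferedAudioBytes > 0 ||
            !state.ListenOpenedUtc.HasValue)
        {
            return false; // Not a pending, empty listen.
        }

        var age = now - state.ListenOpenedUtc.Value;
        if (age < RecoveryAge)
        {
            return false; // Still young enough to be a live listen.
        }

        staleAgeMs = (int)age.TotalMilliseconds;
        state.AwaitingTurnCompletion = false;
        state.SawListen = false;
        state.SawContext = false;
        state.ListenOpenedUtc = null; // Assumed to be part of clearing listen tracking.
        return true;
    }
}
```

The gate only fires for a listen that has seen neither context nor audio, so a slow but live turn is left alone; any buffered bytes or a received context short-circuits the check before the age comparison.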
@@ -14,9 +14,11 @@ public sealed partial class WebSocketTurnFinalizationService(
 {
     private const int AutoFinalizeMinBufferedAudioBytes = 15000;
     private const int AutoFinalizeMinBufferedAudioChunks = 5;
+    private const string GlsmPhaseMetadataKey = "glsmPhase";
     private static readonly TimeSpan AutoFinalizeMinTurnAge = TimeSpan.FromMilliseconds(1800);
     private static readonly TimeSpan AutoFinalizeMissingTranscriptFallbackAge = TimeSpan.FromMilliseconds(4200);
     private static readonly TimeSpan AutoFinalizeContinuationDeferralMaxAge = TimeSpan.FromMilliseconds(3600);
+    private static readonly TimeSpan StaleListenSetupRecoveryAge = TimeSpan.FromSeconds(9);
     private const int AutoFinalizeContinuationDeferralMaxAttempts = 2;
     private static readonly HashSet<string> PegasusAffinityContinuationStems = new(StringComparer.Ordinal)
     {
@@ -61,54 +63,61 @@ public sealed partial class WebSocketTurnFinalizationService(
         WebSocketMessageEnvelope envelope,
         CancellationToken cancellationToken = default)
     {
-        var turnState = session.TurnState;
-        var ignoreLateAudio = ShouldIgnoreLateAudio(session);
-        var ignoreAudioWithoutListen = ShouldIgnoreAudioWithoutListen(turnState);
-        if (ignoreLateAudio || ignoreAudioWithoutListen)
+        try
         {
-            await sink.RecordTurnDiagnosticAsync("binary_audio_ignored", BuildTurnDiagnosticSnapshot(session, envelope, new Dictionary<string, object?>
+            var turnState = session.TurnState;
+            var ignoreLateAudio = ShouldIgnoreLateAudio(session);
+            var ignoreAudioWithoutListen = ShouldIgnoreAudioWithoutListen(turnState);
+            if (ignoreLateAudio || ignoreAudioWithoutListen)
+            {
+                await sink.RecordTurnDiagnosticAsync("binary_audio_ignored", BuildTurnDiagnosticSnapshot(session, envelope, new Dictionary<string, object?>
+                {
+                    ["ignored"] = true,
+                    ["ignoreLateAudio"] = ignoreLateAudio,
+                    ["ignoreAudioWithoutListen"] = ignoreAudioWithoutListen,
+                    ["awaitingTurnCompletion"] = turnState.AwaitingTurnCompletion,
+                    ["bufferedAudioBytes"] = turnState.BufferedAudioBytes,
+                    ["bufferedAudioChunks"] = turnState.BufferedAudioChunkCount,
+                    ["sawListen"] = turnState.SawListen,
+                    ["sawContext"] = turnState.SawContext
+                }), cancellationToken);
+                return [];
+            }
+
+            session.LastMessageType = "BINARY_AUDIO";
+            turnState.FirstAudioReceivedUtc ??= DateTimeOffset.UtcNow;
+            turnState.BufferedAudioChunkCount += 1;
+            turnState.BufferedAudioBytes += envelope.Binary?.Length ?? 0;
+            if (envelope.Binary is { Length: > 0 })
+            {
+                turnState.BufferedAudioFrames.Add([.. envelope.Binary]);
+            }
+            turnState.LastAudioReceivedUtc = DateTimeOffset.UtcNow;
+            turnState.AwaitingTurnCompletion = true;
+            session.Metadata["lastAudioBytes"] = envelope.Binary?.Length ?? 0;
+            await sink.RecordTurnDiagnosticAsync("binary_audio_received", BuildTurnDiagnosticSnapshot(session, envelope, new Dictionary<string, object?>
             {
-                ["ignored"] = true,
-                ["ignoreLateAudio"] = ignoreLateAudio,
-                ["ignoreAudioWithoutListen"] = ignoreAudioWithoutListen,
-                ["awaitingTurnCompletion"] = turnState.AwaitingTurnCompletion,
                 ["bufferedAudioBytes"] = turnState.BufferedAudioBytes,
                 ["bufferedAudioChunks"] = turnState.BufferedAudioChunkCount,
+                ["awaitingTurnCompletion"] = turnState.AwaitingTurnCompletion,
                 ["sawListen"] = turnState.SawListen,
-                ["sawContext"] = turnState.SawContext
+                ["sawContext"] = turnState.SawContext,
+                ["listenRules"] = turnState.ListenRules,
+                ["listenAsrHints"] = turnState.ListenAsrHints,
+                ["yesNoRule"] = turnState.ListenRules.FirstOrDefault(IsConstrainedYesNoRule)
             }), cancellationToken);
+
+            if (ShouldAutoFinalize(session))
+            {
+                return await FinalizeTurnAsync(session, envelope, "AUTO_FINALIZE", allowFallbackOnMissingTranscript: true, cancellationToken);
+            }
+
             return [];
         }
-
-        session.LastMessageType = "BINARY_AUDIO";
-        turnState.FirstAudioReceivedUtc ??= DateTimeOffset.UtcNow;
-        turnState.BufferedAudioChunkCount += 1;
-        turnState.BufferedAudioBytes += envelope.Binary?.Length ?? 0;
-        if (envelope.Binary is { Length: > 0 })
+        finally
         {
-            turnState.BufferedAudioFrames.Add([.. envelope.Binary]);
+            await TrackGlsmPhaseAsync(session, envelope, "binary_audio", cancellationToken);
         }
-        turnState.LastAudioReceivedUtc = DateTimeOffset.UtcNow;
-        turnState.AwaitingTurnCompletion = true;
-        session.Metadata["lastAudioBytes"] = envelope.Binary?.Length ?? 0;
-        await sink.RecordTurnDiagnosticAsync("binary_audio_received", BuildTurnDiagnosticSnapshot(session, envelope, new Dictionary<string, object?>
-        {
-            ["bufferedAudioBytes"] = turnState.BufferedAudioBytes,
-            ["bufferedAudioChunks"] = turnState.BufferedAudioChunkCount,
-            ["awaitingTurnCompletion"] = turnState.AwaitingTurnCompletion,
-            ["sawListen"] = turnState.SawListen,
-            ["sawContext"] = turnState.SawContext,
-            ["listenRules"] = turnState.ListenRules,
-            ["listenAsrHints"] = turnState.ListenAsrHints,
-            ["yesNoRule"] = turnState.ListenRules.FirstOrDefault(IsConstrainedYesNoRule)
-        }), cancellationToken);
-
-        if (ShouldAutoFinalize(session))
-        {
-            return await FinalizeTurnAsync(session, envelope, "AUTO_FINALIZE", allowFallbackOnMissingTranscript: true, cancellationToken);
-        }
-
-        return [];
     }
 
     public async Task<IReadOnlyList<WebSocketReply>> HandleContextAsync(
@@ -116,34 +125,40 @@ public sealed partial class WebSocketTurnFinalizationService(
         WebSocketMessageEnvelope envelope,
         CancellationToken cancellationToken = default)
     {
-        var turnState = session.TurnState;
-        turnState.SawContext = true;
-        turnState.ContextPayload = ExtractDataPayload(envelope.Text);
-        session.Metadata["context"] = turnState.ContextPayload;
-
-        if (TryReadContextProperty(envelope.Text, "audioTranscriptHint", out var transcriptHint) &&
-            !string.IsNullOrWhiteSpace(transcriptHint))
+        try
         {
-            turnState.AudioTranscriptHint = transcriptHint;
-            session.Metadata["audioTranscriptHint"] = transcriptHint;
-        }
-
-        if (ShouldIgnorePassiveLocalSkillContext(session, envelope.Text))
-        {
-            turnState.AwaitingTurnCompletion = false;
-            turnState.IgnoreAdditionalAudioUntilUtc = DateTimeOffset.UtcNow.Add(WebSocketTurnState.DefaultLateAudioIgnoreWindow);
-            ResetBufferedAudio(session);
-            turnState.SawListen = false;
-            turnState.SawContext = false;
+            var turnState = session.TurnState;
+            turnState.SawContext = true;
+            turnState.ContextPayload = ExtractDataPayload(envelope.Text);
+            session.Metadata["context"] = turnState.ContextPayload;
+
+            if (TryReadContextProperty(envelope.Text, "audioTranscriptHint", out var transcriptHint) &&
+                !string.IsNullOrWhiteSpace(transcriptHint))
+            {
+                turnState.AudioTranscriptHint = transcriptHint;
+                session.Metadata["audioTranscriptHint"] = transcriptHint;
+            }
+
+            if (ShouldIgnorePassiveLocalSkillContext(session, envelope.Text))
+            {
+                turnState.AwaitingTurnCompletion = false;
+                turnState.IgnoreAdditionalAudioUntilUtc = DateTimeOffset.UtcNow.Add(WebSocketTurnState.DefaultLateAudioIgnoreWindow);
+                ResetBufferedAudio(session);
+                ClearListenTracking(turnState);
+                return [];
+            }
+
+            if (ShouldAutoFinalize(session))
+            {
+                return await FinalizeTurnAsync(session, envelope, "AUTO_FINALIZE", allowFallbackOnMissingTranscript: true, cancellationToken);
+            }
+
             return [];
         }
-
-        if (ShouldAutoFinalize(session))
+        finally
         {
-            return await FinalizeTurnAsync(session, envelope, "AUTO_FINALIZE", allowFallbackOnMissingTranscript: true, cancellationToken);
+            await TrackGlsmPhaseAsync(session, envelope, "context", cancellationToken);
         }
-
-        return [];
     }
 
     public async Task<IReadOnlyList<WebSocketReply>> HandleTurnAsync(
@@ -167,8 +182,8 @@ public sealed partial class WebSocketTurnFinalizationService(
             session.TurnState.IgnoreAdditionalAudioUntilUtc = DateTimeOffset.UtcNow.Add(WebSocketTurnState.DefaultLateAudioIgnoreWindow);
             session.FollowUpExpiresUtc = null;
             ResetBufferedAudio(session);
-            session.TurnState.SawListen = false;
-            session.TurnState.SawContext = false;
+            ClearListenTracking(session.TurnState);
+            UpdateGlsmPhaseMarker(session);
             return [.. ResponsePlanToSocketMessagesMapper.MapNoInputAndRedirectToSkill(
                 session.TurnState.TransId ?? session.LastTransId ?? string.Empty,
                 session.TurnState.ListenRules,
@@ -181,6 +196,8 @@ public sealed partial class WebSocketTurnFinalizationService(
         }
 
         session.TurnState.AwaitingTurnCompletion = true;
+        session.TurnState.ListenOpenedUtc ??= DateTimeOffset.UtcNow;
+        UpdateGlsmPhaseMarker(session);
         return [];
     }
 
@@ -275,6 +292,7 @@ public sealed partial class WebSocketTurnFinalizationService(
             string.Equals(type.GetString(), "LISTEN", StringComparison.OrdinalIgnoreCase))
         {
             turnState.SawListen = true;
+            turnState.ListenOpenedUtc ??= DateTimeOffset.UtcNow;
         }
 
         if (root.TryGetProperty("transID", out var transId) && transId.ValueKind == JsonValueKind.String)
@@ -351,6 +369,7 @@ public sealed partial class WebSocketTurnFinalizationService(
         turnState.TransId = transId;
         turnState.ContextPayload = null;
         turnState.AudioTranscriptHint = null;
+        turnState.ListenOpenedUtc = null;
         turnState.LastSttError = null;
         turnState.LastSttErrorUtc = null;
         turnState.FirstAudioReceivedUtc = null;
@@ -376,36 +395,37 @@ public sealed partial class WebSocketTurnFinalizationService(
         bool allowFallbackOnMissingTranscript,
         CancellationToken cancellationToken)
     {
-        var turn = ProtocolToTurnContextMapper.MapListenMessage(envelope, session, messageType);
-        var turnState = session.TurnState;
-        if (IsYesNoTurn(turn) || ReadPrimaryYesNoRule(turn) is not null)
+        try
         {
-            await sink.RecordTurnDiagnosticAsync("yes_no_turn_received", BuildTurnDiagnosticSnapshot(session, envelope, new Dictionary<string, object?>
+            var turn = ProtocolToTurnContextMapper.MapListenMessage(envelope, session, messageType);
+            var turnState = session.TurnState;
+            if (IsYesNoTurn(turn) || ReadPrimaryYesNoRule(turn) is not null)
             {
-                ["messageType"] = messageType,
-                ["listenRules"] = ReadRules(turn, "listenRules").ToArray(),
-                ["clientRules"] = ReadRules(turn, "clientRules").ToArray(),
-                ["listenAsrHints"] = ReadRules(turn, "listenAsrHints").ToArray(),
-                ["yesNoRule"] = ReadPrimaryYesNoRule(turn),
-                ["awaitingTurnCompletion"] = turnState.AwaitingTurnCompletion,
-                ["bufferedAudioBytes"] = turnState.BufferedAudioBytes,
-                ["bufferedAudioChunks"] = turnState.BufferedAudioChunkCount,
-                ["sawListen"] = turnState.SawListen,
-                ["sawContext"] = turnState.SawContext,
-                ["followUpOpen"] = session.FollowUpOpen,
-                ["followUpExpiresUtc"] = session.FollowUpExpiresUtc
-            }), cancellationToken);
-        }
-        if (ShouldIgnoreBlankAudioHotphraseTurn(turn))
-        {
-            session.TurnState.AwaitingTurnCompletion = false;
-            session.TurnState.IgnoreAdditionalAudioUntilUtc = DateTimeOffset.UtcNow.Add(WebSocketTurnState.DefaultLateAudioIgnoreWindow);
-            session.FollowUpExpiresUtc = null;
-            ResetBufferedAudio(session);
-            session.TurnState.SawListen = false;
-            session.TurnState.SawContext = false;
-            return [];
-        }
+                await sink.RecordTurnDiagnosticAsync("yes_no_turn_received", BuildTurnDiagnosticSnapshot(session, envelope, new Dictionary<string, object?>
+                {
+                    ["messageType"] = messageType,
+                    ["listenRules"] = ReadRules(turn, "listenRules").ToArray(),
+                    ["clientRules"] = ReadRules(turn, "clientRules").ToArray(),
+                    ["listenAsrHints"] = ReadRules(turn, "listenAsrHints").ToArray(),
+                    ["yesNoRule"] = ReadPrimaryYesNoRule(turn),
+                    ["awaitingTurnCompletion"] = turnState.AwaitingTurnCompletion,
+                    ["bufferedAudioBytes"] = turnState.BufferedAudioBytes,
+                    ["bufferedAudioChunks"] = turnState.BufferedAudioChunkCount,
+                    ["sawListen"] = turnState.SawListen,
+                    ["sawContext"] = turnState.SawContext,
+                    ["followUpOpen"] = session.FollowUpOpen,
+                    ["followUpExpiresUtc"] = session.FollowUpExpiresUtc
+                }), cancellationToken);
+            }
+            if (ShouldIgnoreBlankAudioHotphraseTurn(turn))
+            {
+                session.TurnState.AwaitingTurnCompletion = false;
+                session.TurnState.IgnoreAdditionalAudioUntilUtc = DateTimeOffset.UtcNow.Add(WebSocketTurnState.DefaultLateAudioIgnoreWindow);
+                session.FollowUpExpiresUtc = null;
+                ResetBufferedAudio(session);
+                ClearListenTracking(session.TurnState);
+                return [];
+            }
 
         var finalizedTurn = await ResolveTranscriptAsync(turn, session, cancellationToken);
         if (!IsTranscriptUsable(finalizedTurn))
@@ -445,8 +465,7 @@ public sealed partial class WebSocketTurnFinalizationService(
             turnState.IgnoreAdditionalAudioUntilUtc = DateTimeOffset.UtcNow.Add(WebSocketTurnState.DefaultLateAudioIgnoreWindow);
             session.FollowUpExpiresUtc = null;
             ResetBufferedAudio(session);
-            turnState.SawListen = false;
-            turnState.SawContext = false;
+            ClearListenTracking(turnState);
             return [.. ResponsePlanToSocketMessagesMapper.MapNoInputAndRedirectToSkill(
                 turnState.TransId ?? session.LastTransId ?? string.Empty,
                 turnState.ListenRules,
@@ -483,8 +502,7 @@ public sealed partial class WebSocketTurnFinalizationService(
             var localRule = ReadPrimaryNoInputRule(finalizedTurn);
             var noInputReplies = BuildLocalNoInputReplies(session, turnState, localRule);
             ResetBufferedAudio(session);
-            turnState.SawListen = false;
-            turnState.SawContext = false;
+            ClearListenTracking(turnState);
             return noInputReplies;
         }
 
@@ -545,8 +563,7 @@ public sealed partial class WebSocketTurnFinalizationService(
                     .Select(map => new WebSocketReply { Text = map.Text, DelayMs = map.DelayMs })
                     .ToArray();
                 ResetBufferedAudio(session);
-                turnState.SawListen = false;
-                turnState.SawContext = false;
+                ClearListenTracking(turnState);
                 return fallbackReplies;
             }
             case true when
@@ -678,10 +695,14 @@ public sealed partial class WebSocketTurnFinalizationService(
                 }), cancellationToken);
             }

             ResetBufferedAudio(session);
-            turnState.SawListen = false;
-            turnState.SawContext = false;
+            ClearListenTracking(turnState);
             return replies;
+        }
+        finally
+        {
+            await TrackGlsmPhaseAsync(session, envelope, $"finalize:{messageType}", cancellationToken);
+        }
     }

     private static bool ShouldAutoFinalize(CloudSession session)
@@ -708,6 +729,58 @@ public sealed partial class WebSocketTurnFinalizationService(
         return ShouldIgnoreLateAudio(session) && IsHotphraseLaunchListenSetup(text);
     }

+    public static bool TryRecoverStalePendingListen(CloudSession session, out int staleAgeMs)
+    {
+        staleAgeMs = 0;
+        var turnState = session.TurnState;
+        if (!turnState.AwaitingTurnCompletion ||
+            !turnState.SawListen ||
+            turnState.SawContext ||
+            turnState.BufferedAudioBytes > 0 ||
+            !turnState.ListenOpenedUtc.HasValue)
+        {
+            return false;
+        }
+
+        var age = DateTimeOffset.UtcNow - turnState.ListenOpenedUtc.Value;
+        if (age < StaleListenSetupRecoveryAge)
+        {
+            return false;
+        }
+
+        staleAgeMs = (int)age.TotalMilliseconds;
+        turnState.AwaitingTurnCompletion = false;
+        ResetBufferedAudio(session);
+        ClearListenTracking(turnState);
+        turnState.ListenHotphrase = false;
+        turnState.HotphraseEmptyTurnCount = 0;
+        UpdateGlsmPhaseMarker(session);
+        return true;
+    }
+
+    public static string ResolveGlsmPhase(CloudSession session)
+    {
+        var turnState = session.TurnState;
+        if (!turnState.AwaitingTurnCompletion)
+        {
+            return session.FollowUpOpen ? "DISPATCH_DIALOG" : "PROCESS_LISTENER_QUEUE";
+        }
+
+        if (turnState.SawListen && !turnState.SawContext && turnState.BufferedAudioBytes == 0)
+        {
+            return "HJ_LISTENING";
+        }
+
+        if (turnState.SawListen && turnState.SawContext && turnState.BufferedAudioBytes == 0)
+        {
+            return "LISTENING";
+        }
+
+        return turnState.BufferedAudioBytes > 0
+            ? "WAIT_LISTEN_FINISHED"
+            : "LISTENING";
+    }
+
     private static TimeSpan ResolveLateAudioIgnoreWindow(ResponsePlan plan)
     {
         return string.Equals(plan.IntentName, "cloud_version", StringComparison.OrdinalIgnoreCase)
@@ -1518,6 +1591,53 @@ public sealed partial class WebSocketTurnFinalizationService(
         return PegasusAffinityContinuationStems.Contains(normalized);
     }

+    private static void ClearListenTracking(WebSocketTurnState turnState)
+    {
+        turnState.SawListen = false;
+        turnState.SawContext = false;
+        turnState.ListenOpenedUtc = null;
+    }
+
+    private static void UpdateGlsmPhaseMarker(CloudSession session)
+    {
+        session.Metadata[GlsmPhaseMetadataKey] = ResolveGlsmPhase(session);
+    }
+
+    private async Task TrackGlsmPhaseAsync(
+        CloudSession session,
+        WebSocketMessageEnvelope envelope,
+        string trigger,
+        CancellationToken cancellationToken)
+    {
+        var nextPhase = ResolveGlsmPhase(session);
+        var previousPhase = session.Metadata.TryGetValue(GlsmPhaseMetadataKey, out var rawPhase)
+            ? rawPhase?.ToString()
+            : null;
+        session.Metadata[GlsmPhaseMetadataKey] = nextPhase;
+
+        if (string.Equals(previousPhase, nextPhase, StringComparison.OrdinalIgnoreCase))
+        {
+            return;
+        }
+
+        try
+        {
+            await sink.RecordTurnDiagnosticAsync("glsm_phase_transition", BuildTurnDiagnosticSnapshot(session, envelope, new Dictionary<string, object?>
+            {
+                ["trigger"] = trigger,
+                ["previousState"] = previousPhase,
+                ["state"] = nextPhase,
+                ["listenOpenedUtc"] = session.TurnState.ListenOpenedUtc,
+                ["followUpOpen"] = session.FollowUpOpen,
+                ["listenRules"] = session.TurnState.ListenRules
+            }), cancellationToken);
+        }
+        catch
+        {
+            // Diagnostics should not interrupt turn handling.
+        }
+    }
+
     private static Dictionary<string, object?> BuildTurnDiagnosticSnapshot(
         CloudSession session,
         WebSocketMessageEnvelope envelope,
@@ -1534,6 +1654,7 @@ public sealed partial class WebSocketTurnFinalizationService(
         details["bufferedAudioChunks"] = session.TurnState.BufferedAudioChunkCount;
         details["sawListen"] = session.TurnState.SawListen;
         details["sawContext"] = session.TurnState.SawContext;
+        details["glsmState"] = ResolveGlsmPhase(session);
         return details;
     }

@@ -7,6 +7,7 @@ public sealed class WebSocketTurnState

     public string? TransId { get; set; }
     public string? ContextPayload { get; set; }
+    public DateTimeOffset? ListenOpenedUtc { get; set; }
     public bool ListenHotphrase { get; set; }
     public int HotphraseEmptyTurnCount { get; set; }
     public DateTimeOffset? IgnoreAdditionalAudioUntilUtc { get; set; }
@@ -101,4 +101,49 @@ public sealed class FileTurnTelemetrySinkTests
             s => s.RecordTranscriptError(It.IsAny<Exception>(), It.IsAny<string>(), It.IsAny<CancellationToken>()),
             Times.Once());
     }
+
+    [Fact]
+    public async Task HandleContext_EmitsGlsmPhaseTransitionDiagnostic()
+    {
+        var sink = new Mock<ITurnTelemetrySink>();
+        sink.Setup(s => s.RecordTurnDiagnosticAsync(It.IsAny<string>(), It.IsAny<IReadOnlyDictionary<string, object?>>(), It.IsAny<CancellationToken>()))
+            .Returns(Task.CompletedTask);
+        var turnService = new WebSocketTurnFinalizationService(
+            Mock.Of<IConversationBroker>(),
+            Mock.Of<ISttStrategySelector>(),
+            sink.Object);
+
+        var session = new CloudSession
+        {
+            Token = "glsm-phase-token",
+            TurnState =
+            {
+                TransId = "trans-glsm",
+                AwaitingTurnCompletion = true,
+                SawListen = true,
+                ListenOpenedUtc = DateTimeOffset.UtcNow - TimeSpan.FromSeconds(1)
+            }
+        };
+        session.Metadata["glsmPhase"] = "HJ_LISTENING";
+
+        await turnService.HandleContextAsync(
+            session,
+            new WebSocketMessageEnvelope
+            {
+                HostName = "neo-hub.jibo.com",
+                Path = "/listen",
+                Kind = "neo-hub-listen",
+                Text = """{"type":"CONTEXT","transID":"trans-glsm","data":{"topic":"conversation"}}"""
+            },
+            CancellationToken.None);
+
+        sink.Verify(
+            s => s.RecordTurnDiagnosticAsync(
+                "glsm_phase_transition",
+                It.Is<IReadOnlyDictionary<string, object?>>(details =>
+                    details.ContainsKey("state") &&
+                    string.Equals(details["state"] == null ? null : details["state"]!.ToString(), "LISTENING", StringComparison.OrdinalIgnoreCase)),
+                It.IsAny<CancellationToken>()),
+            Times.AtLeastOnce());
+    }
 }
@@ -2523,6 +2523,47 @@ public sealed class JiboWebSocketServiceTests
         Assert.Null(session.LastIntent);
     }

+    [Fact]
+    public async Task StaleListenSetup_IsRecoveredWhenNextHotphraseListenArrives()
+    {
+        await _service.HandleMessageAsync(new WebSocketMessageEnvelope
+        {
+            HostName = "neo-hub.jibo.com",
+            Path = "/listen",
+            Kind = "neo-hub-listen",
+            Token = "hub-stale-listen-token",
+            Text = """{"type":"LISTEN","transID":"trans-stale-listen","data":{"hotphrase":true,"rules":["launch","globals/global_commands_launch"]}}"""
+        });
+
+        var session = _store.FindSessionByToken("hub-stale-listen-token");
+        Assert.NotNull(session);
+        session.TurnState.ListenOpenedUtc = DateTimeOffset.UtcNow - TimeSpan.FromSeconds(12);
+        session.TurnState.AwaitingTurnCompletion = true;
+        session.TurnState.SawListen = true;
+        session.TurnState.SawContext = false;
+        session.TurnState.BufferedAudioBytes = 0;
+        session.TurnState.BufferedAudioChunkCount = 0;
+        session.TurnState.HotphraseEmptyTurnCount = 2;
+
+        var replies = await _service.HandleMessageAsync(new WebSocketMessageEnvelope
+        {
+            HostName = "neo-hub.jibo.com",
+            Path = "/listen",
+            Kind = "neo-hub-listen",
+            Token = "hub-stale-listen-token",
+            Text = """{"type":"LISTEN","transID":"trans-stale-listen","data":{"hotphrase":true,"rules":["launch","globals/global_commands_launch"]}}"""
+        });
+
+        Assert.Empty(replies);
+        Assert.True(session.TurnState.AwaitingTurnCompletion);
+        Assert.True(session.TurnState.SawListen);
+        Assert.False(session.TurnState.SawContext);
+        Assert.Equal(0, session.TurnState.BufferedAudioBytes);
+        Assert.Equal(0, session.TurnState.BufferedAudioChunkCount);
+        Assert.Equal(0, session.TurnState.HotphraseEmptyTurnCount);
+        Assert.True(session.TurnState.ListenOpenedUtc > DateTimeOffset.UtcNow - TimeSpan.FromSeconds(3));
+    }
+
     [Fact]
     public async Task BinaryAudio_AfterWordOfDayRightWordListen_IsIgnoredDuringCleanupWindow()
     {