version 18 fixes

Jacob Dubin
2026-04-26 07:37:31 -05:00
parent baf886097e
commit f3dbd6c7fd
18 changed files with 7888 additions and 35 deletions


@@ -19,6 +19,19 @@ Day-to-day feature sequencing lives in [feature-backlog.md](feature-backlog.md).
Release `1.0.18` is now in feature-hardening. Its main bug-fix theme is alarm and photo/gallery behavior on stock OS `1.9`, with a few small feature slices added while the test loop is warm.
## Latest Live Evidence
`jibo test 22` was captured against a robot that spoke `Open Jibo Cloud version 1 dot 0 dot 18.`
- Radio live validation passed.
- News routing was observed in websocket telemetry from the phrase `So, play the news.`, but the session did not give enough live confidence to call news complete because a backup notification/slowness path was active at the time.
- The backup notification came from stock `@be/surprises-ota` checking `jibo.scheduler.backupStatus`; no `Backup_*` HTTP operation appeared in the captured cloud traffic. The update-menu block therefore looks more like a robot-local scheduler/backup load issue than a cloud `Backup.List` response issue.
- The robot log showed high load, a `jibo-server-service` broken pipe, a settings error path for `Q4-Server_connection_lost`, and the stock backup prompt: `hey i'm sorry if I seem a little slow, I can be that way while i'm doing a backup.`
- Photo/gallery reached the local gallery/create path, but a missed short reply left repeated `create/is_it_a_keeper` listens and the visible blue-ring/listening state.
- Alarm attempts were dominated by collapsed transcripts such as `set and alarm`, `Set and Alonzo`, and `Set an alarm for...`; one path reached local alarm clarification but did not get a complete value-setting pass.
- The turn telemetry contained `ffmpeg` failures where local whisper tried to decode buffered Ogg/Opus turns that `ffmpeg` could not read.
- The websocket capture still contained `OPENJIBO_TURN_PENDING`, `OPENJIBO_CONTEXT_ACK`, and proactive `OPENJIBO_ACK` output in the deployed run. The current source has no references to those synthetic OpenJibo events, so the next deployment needs an artifact/build verification pass.
## Release Rhythm
This is the working pattern for each hosted-cloud release:
@@ -58,6 +71,7 @@ Current websocket scope:
- buffered Ogg/Opus audio preservation per turn
- synthetic transcript hint support for fixture-driven parity
- opt-in local `ffmpeg` plus `whisper.cpp` STT path for discovery
- local whisper only attempts external decoding when buffered audio contains an Opus identification header
- auto-finalize thresholds for buffered audio after a real listen phase
- late-audio ignore windows after completed turns
- no-input local completion for constrained prompts
@@ -89,7 +103,9 @@ The following behavior is present in source and covered by focused tests:
- media metadata persists across store recreation and `/media/{path}` can serve the current text-body placeholder payload
- constrained yes/no handling covers `create/is_it_a_keeper`, `shared/yes_no`, `settings/download_now_later`, `surprises-date/offer_date_fact`, `surprises-ota/want_to_download_now`, and `$YESNO` hints
- outbound constrained yes/no responses strip unrelated `globals/*` rules so stock OS stays local
- no-input fallback for constrained yes/no prompts emits local `LISTEN`/`EOS` instead of relaunching generic Nimbus speech
- no-input fallback for constrained yes/no prompts emits local `LISTEN`/`EOS` instead of relaunching generic Nimbus speech, including `shared/yes_no` after STT failure
- repeated empty `create/is_it_a_keeper` replies redirect to `@be/idle` after the second miss so the photo/create flow can settle instead of leaving a stale listening state
- local whisper skips buffered audio turns that do not contain `OpusHead`, so a known `ffmpeg` decode failure no longer dominates the failure logs
- Word of the Day launch, spoken guesses, structured `CLIENT_NLU` guesses, hint-order guesses, fuzzy hint matching, right-word cleanup, and late audio cleanup are covered in the websocket layer
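
The rule stripping described in the list above can be sketched as follows. This is a minimal illustration, not the service's actual mapper: `SelectOutboundRules` and `constrainedRules` are hypothetical names, and only the rule strings are taken from these notes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Constrained yes/no rule names listed in the notes above.
var constrainedRules = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "create/is_it_a_keeper",
    "shared/yes_no",
    "settings/download_now_later",
    "surprises-date/offer_date_fact",
    "surprises-ota/want_to_download_now",
};

// Hypothetical helper (not the service's API): keep only the active constrained
// rule so unrelated globals/* rules are not echoed back to the robot.
IReadOnlyList<string> SelectOutboundRules(IReadOnlyList<string> listenRules)
{
    var localRule = listenRules.FirstOrDefault(rule => constrainedRules.Contains(rule));

    // Fall back to the original rules when no constrained rule is present.
    return localRule is null ? listenRules : new[] { localRule };
}

var inbound = new[]
{
    "shared/yes_no",
    "globals/gui_nav",
    "globals/mim_repeat",
    "globals/global_commands_launch",
};

var outbound = SelectOutboundRules(inbound);
Console.WriteLine(string.Join(", ", outbound));
```

Selecting by a known rule set, rather than filtering on the `globals/` prefix, keeps the fallback explicit: a prompt that carries no constrained rule passes through unchanged.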
## Reference Sources
@@ -116,12 +132,13 @@ Before calling `1.0.18` complete, prove or explicitly defer these:
- Run the focused `.NET` cloud test suite after the last feature slice.
- Confirm the running robot build reports cloud version `1.0.18`.
- Regression test alarm flows: set with explicit time, set with compact/spoken time, clarify missing time, cancel alarm, and local cleanup prompts.
- Regression test photo/gallery flows: open gallery, answer the stock `shared/yes_no` prompt, hand into create, take one photo, and avoid blue-ring stale turns.
- Live-test radio launch: `open the radio` and `play country music`.
- Live-test first news path: `tell me the news` should use the Nimbus cloud-skill lane instead of generic chat.
- Regression test alarm flows again after the `jibo test 22` fixes: set with explicit time, set with compact/spoken time, clarify missing time, cancel alarm, and local cleanup prompts.
- Regression test photo/gallery flows again after the `jibo test 22` fixes: open gallery, answer the stock `shared/yes_no` prompt, hand into create, take one photo, and avoid blue-ring stale turns.
- Live-test radio launch: `open the radio` passed in `jibo test 22`; re-run `play country music` if that exact phrase was not captured.
- Live-test first news path again: `jibo test 22` reached the news intent, but the live behavior still needs a clean non-backup session.
- Recheck constrained yes/no prompts for update/backup/share/gallery without leaking global rules.
- Recheck that stock OS no longer logs OpenJibo-only websocket events such as synthetic pending/context/ack packets from the current build.
- Recheck backup/update behavior with explicit attention to robot-local `jibo.scheduler.backupStatus`, CPU/load, and whether the deployed cloud is involved at all.
- Treat remaining `ffmpeg` / `whisper.cpp` transcript failures as STT work unless the capture proves a separate turn-routing regression.
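
One way to approach the synthetic-event recheck is to count the event names in a saved websocket capture. The sketch below assumes one JSON message per line with the event name as a quoted string. That capture format and the `CountSyntheticEvents` helper are assumptions for illustration; only the three event names come from these notes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Synthetic OpenJibo event names that should be absent from a current build.
string[] syntheticEvents = { "OPENJIBO_TURN_PENDING", "OPENJIBO_CONTEXT_ACK", "OPENJIBO_ACK" };

// Hypothetical helper: count how many capture lines mention each synthetic event.
// Assumes each line is one JSON message containing the event name in quotes.
Dictionary<string, int> CountSyntheticEvents(IEnumerable<string> captureLines)
{
    var counts = syntheticEvents.ToDictionary(name => name, _ => 0);
    foreach (var line in captureLines)
    {
        foreach (var name in syntheticEvents)
        {
            if (line.Contains("\"" + name + "\"", StringComparison.Ordinal))
            {
                counts[name] += 1;
            }
        }
    }

    return counts;
}

// Illustrative sample lines, not real capture data.
var sampleCapture = new[]
{
    "{\"type\":\"LISTEN\",\"transID\":\"t1\"}",
    "{\"type\":\"OPENJIBO_TURN_PENDING\",\"transID\":\"t1\"}",
    "{\"type\":\"OPENJIBO_ACK\",\"transID\":\"t1\"}",
    "{\"type\":\"EOS\",\"transID\":\"t1\"}",
};

var counts = CountSyntheticEvents(sampleCapture);
foreach (var pair in counts)
{
    Console.WriteLine($"{pair.Key}: {pair.Value}");
}
```

Any non-zero count from a run against the deployed build would point at a stale artifact rather than a source regression, since current source is reported to have no references to these events.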
## Known Gaps
@@ -131,7 +148,8 @@ These are not blockers for calling `1.0.18` complete unless the live test shows
- local `whisper.cpp` STT remains a discovery seam, not production ASR
- media upload/body handling is not binary-safe enough for final gallery originals and thumbnails
- state persistence is local JSON, not Azure SQL / Blob Storage
- update, backup, and restore are not end-to-end proven
- update, backup, and restore are not end-to-end proven, and the `jibo test 22` sluggishness appears tied to robot-local backup status/load
- deployed-build verification needs to prove that synthetic OpenJibo websocket events are gone from the hosted artifact, not just from source
- news content is synthetic
- weather, calendar, commute, personal report, identity, memory, and proactivity are still mostly discovery or placeholder content paths
- volume, stop, robot age, and command-versus-question personality routing are not implemented yet


@@ -37,6 +37,7 @@ Current release theme:
- alarm and photo/gallery quirks have received the main bug-fix attention
- Word of the Day cleanup, constrained yes/no routing, unknown websocket event suppression, and local state persistence are already in the current code
- radio, ESML apostrophe cleanup, and first news are implemented in source/tests and need live confidence before the version is called complete
- `jibo test 22` validated radio, exposed backup/load interference, exposed a shared yes/no no-input gap, exposed repeated create keeper prompts after photo handoff, and showed local whisper `ffmpeg` failures on unusable buffered audio
## Immediate `1.0.18` Queue
@@ -52,6 +53,7 @@ Current release theme:
- Evidence:
- JiboOS `@be/radio` treats `menu` as a play launch and reads `result.nlu.entities.station`
- `Country` is a supported station key in the inspected genre metadata
- `jibo test 22` radio live validation passed
- Exit criteria:
- live `open the radio` resumes or opens radio without generic chat speech
- live `play country music` opens a country station
@@ -70,6 +72,7 @@ Current release theme:
- `SKILL_ACTION` uses skill id `news` and `mim_id = runtime-news`
- Evidence:
- JiboOS Nimbus checks `match.cloudSkill === "news"` and waits for a cloud response
- `jibo test 22` captured the phrase `So, play the news.` reaching the `news` intent, but live behavior was not cleanly confirmed
- Exit criteria:
- live `tell me the news` reaches a non-placeholder Nimbus path
- the robot behavior feels like a cloud skill response, not generic chat playback
@@ -86,15 +89,22 @@ Current release theme:
- covered prompt families include `settings/download_now_later`, `surprises-ota/want_to_download_now`, `surprises-date/offer_date_fact`, `shared/yes_no`, and `create/is_it_a_keeper`
- outbound replies strip global rules and keep the local rule
- no-input fallback for constrained prompts emits local `LISTEN`/`EOS`
- `shared/yes_no` now participates in the STT-failure no-input path instead of staying pending behind `$YESNO` hints
- repeated empty `create/is_it_a_keeper` replies redirect to `@be/idle` after the second miss
- Latest evidence:
- `jibo test 22` did not show `Backup_*` HTTP traffic during the backup complaint
- stock `@be/surprises-ota` drives the backup notification from robot-local `jibo.scheduler.backupStatus`
- a spoken `take a backup` command currently routes as generic chat and is not the same as proving the local backup scheduler path
- Exit criteria:
- spoken `yes` and `no` work on update, backup, share/offer, and gallery/create prompts
- empty or missed short replies retry locally instead of relaunching Nimbus or generic chat
- Next action:
- include these prompt families in the `1.0.18` live regression pass
- re-run these prompt families in the `1.0.18` live regression pass after the shared yes/no and create no-input fixes
- keep explicit backup creation as part of the update/backup/restore proof slice, not as an assumed yes/no prompt test
### 4. Alarm And Photo Gallery Release Regression
- Status: `ready`
- Status: `polish`
- Tags: `protocol`, `stt`
- Why now: this is the main bug-fix theme for `1.0.18`.
- Current code:
@@ -103,12 +113,17 @@ Current release theme:
- alarm cancel can reuse the last active clock domain
- gallery opens as `@be/gallery`; snapshot and photobooth open through `@be/create`
- passive gallery/create context no longer reopens stale cloud turns
- `shared/yes_no` no-input fallback and repeated create keeper cleanup were added after `jibo test 22`
- Latest evidence:
- gallery opened and handed into create, but repeated `create/is_it_a_keeper` prompts could leave the blue ring/listening state
- alarm recognition collapsed several attempts before a complete alarm value could be set
- `ffmpeg` failures were present during the same test window, so alarm/gallery retest should separate transcript quality from payload shape
- Exit criteria:
- gallery opens, offers to take a picture if empty, accepts `yes`, and hands into create
- alarm set, clarify, and cancel flows behave locally without blue-ring stale turns
- failures caused by collapsed STT transcripts are logged as STT issues rather than misdiagnosed as payload bugs
- Next action:
- run a stock OS `1.9` regression bundle before declaring `1.0.18` complete
- re-run a stock OS `1.9` regression bundle before declaring `1.0.18` complete
### 5. Optional Small Feature Before `1.0.18` Freeze
@@ -176,10 +191,22 @@ Current release theme:
- gallery, snapshot, and photobooth voice paths route to the correct local skills
- media metadata persists locally
- `/media/{path}` serves the current text-body placeholder payload
- repeated empty `create/is_it_a_keeper` turns redirect to `@be/idle` after the second miss
- Follow-up:
- live regression remains in the immediate queue
- binary-safe media storage remains future work
### Constrained Yes-No Cleanup
- Status: `implemented`
- Tags: `protocol`, `stt`
- Result:
- `shared/yes_no` is included in yes/no STT-failure detection
- local no-input replies strip global rules and keep the active constrained rule
- update, OTA, share/date-offer, gallery shared yes/no, and create keeper rules share the same no-input fallback machinery
- Follow-up:
- live update/backup/share/gallery prompts still need another clean pass
### Word Of The Day Cleanup
- Status: `implemented`
@@ -202,7 +229,7 @@ Current release theme:
- current websocket service drops unknown inbound message types silently
- synthetic `OPENJIBO_TURN_PENDING`, `OPENJIBO_CONTEXT_ACK`, and fallback `OPENJIBO_ACK` should no longer be emitted by current source
- Follow-up:
- if live logs show those event types, first verify the deployed process is actually the current build
- `jibo test 22` still captured those event types from the deployed run, so the next deployment must verify the artifact/build as well as source
### Update Phantom Manifest Fix
@@ -273,6 +300,7 @@ Current release theme:
- feature paths are now often correct when a transcript exists, but short replies and low-quality audio still block otherwise-correct flows
- Current evidence:
- live captures still show `ffmpeg` and `whisper.cpp` failures
- current source now skips local whisper when buffered audio does not contain an Opus identification header
- yes/no and alarm flows are especially sensitive to short or collapsed transcripts
- Implementation notes:
- add lightweight waveform or energy screening before transcription
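
A minimal sketch of what such energy screening could look like, assuming the audio has already been decoded to 16-bit mono PCM. The helper names and the `0.01` RMS threshold are hypothetical and would need tuning against real captures.

```csharp
using System;

// Root-mean-square level of 16-bit PCM samples, normalized to the range [0, 1].
static double ComputeRms(ReadOnlySpan<short> samples)
{
    if (samples.IsEmpty)
    {
        return 0.0;
    }

    double sumSquares = 0.0;
    foreach (var sample in samples)
    {
        double normalized = sample / 32768.0;
        sumSquares += normalized * normalized;
    }

    return Math.Sqrt(sumSquares / samples.Length);
}

// Hypothetical gate: only hand audio to transcription when it is loud enough.
static bool LooksLikeSpeech(ReadOnlySpan<short> samples, double minimumRms = 0.01)
{
    return ComputeRms(samples) >= minimumRms;
}

// Illustrative inputs: 100 ms of silence and 100 ms of a 440 Hz tone at 16 kHz.
var silence = new short[1600];
var tone = new short[1600];
for (var i = 0; i < tone.Length; i++)
{
    tone[i] = (short)(8000 * Math.Sin(2 * Math.PI * 440 * i / 16000.0));
}

Console.WriteLine(LooksLikeSpeech(silence));
Console.WriteLine(LooksLikeSpeech(tone));
```

An RMS gate is cheap but crude: it rejects empty or near-silent turns and does nothing about loud non-speech noise, so it complements the Opus header check rather than replacing it.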


@@ -413,11 +413,7 @@ public sealed class WebSocketTurnFinalizationService(
session.LastIntent = null;
session.LastListenType = "no-input";
var localRule = ReadPrimaryNoInputRule(finalizedTurn);
var noInputReplies = ResponsePlanToSocketMessagesMapper.MapNoInput(
turnState.TransId ?? session.LastTransId ?? string.Empty,
string.IsNullOrWhiteSpace(localRule) ? turnState.ListenRules : [localRule])
.Select(map => new WebSocketReply { Text = map.Text, DelayMs = map.DelayMs })
.ToArray();
var noInputReplies = BuildLocalNoInputReplies(session, turnState, localRule);
ResetBufferedAudio(session);
turnState.SawListen = false;
turnState.SawContext = false;
@@ -461,11 +457,7 @@ public sealed class WebSocketTurnFinalizationService(
session.LastIntent = null;
session.LastListenType = "no-input";
var localRule = ReadPrimaryYesNoRule(finalizedTurn);
var noInputReplies = ResponsePlanToSocketMessagesMapper.MapNoInput(
turnState.TransId ?? session.LastTransId ?? string.Empty,
string.IsNullOrWhiteSpace(localRule) ? turnState.ListenRules : [localRule])
.Select(map => new WebSocketReply { Text = map.Text, DelayMs = map.DelayMs })
.ToArray();
var noInputReplies = BuildLocalNoInputReplies(session, turnState, localRule);
ResetBufferedAudio(session);
return noInputReplies;
}
@@ -493,6 +485,8 @@ public sealed class WebSocketTurnFinalizationService(
session.LastTranscript = finalizedTurn.NormalizedTranscript ?? finalizedTurn.RawTranscript;
session.LastIntent = plan.IntentName;
session.LastListenType = listenAction?.Mode;
turnState.LastLocalNoInputRule = null;
turnState.LocalNoInputCount = 0;
if (plan.Actions.OfType<InvokeNativeSkillAction>().FirstOrDefault() is { SkillName: "@be/clock", Payload: not null } clockAction &&
clockAction.Payload.TryGetValue("domain", out var lastClockDomainValue) &&
lastClockDomainValue is not null)
@@ -720,12 +714,7 @@ public sealed class WebSocketTurnFinalizationService(
return ReadRules(turn, "listenRules")
.Concat(ReadRules(turn, "clientRules"))
.Concat(ReadRules(turn, "listenAsrHints"))
.Any(static rule =>
string.Equals(rule, "$YESNO", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "create/is_it_a_keeper", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "settings/download_now_later", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "surprises-date/offer_date_fact", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "surprises-ota/want_to_download_now", StringComparison.OrdinalIgnoreCase));
.Any(IsYesNoRule);
}
private static bool ShouldHandleAsLocalNoInput(TurnContext turn)
@@ -737,8 +726,7 @@ public sealed class WebSocketTurnFinalizationService(
return ReadRules(turn, "listenRules")
.Concat(ReadRules(turn, "clientRules"))
.Any(static rule =>
string.Equals(rule, "clock/alarm_timer_okay", StringComparison.OrdinalIgnoreCase));
.Any(IsLocalNoInputRule);
}
private static string? ReadPrimaryNoInputRule(TurnContext turn)
@@ -757,12 +745,65 @@ public sealed class WebSocketTurnFinalizationService(
{
return ReadRules(turn, "listenRules")
.Concat(ReadRules(turn, "clientRules"))
.FirstOrDefault(static rule =>
string.Equals(rule, "create/is_it_a_keeper", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "shared/yes_no", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "settings/download_now_later", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "surprises-date/offer_date_fact", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "surprises-ota/want_to_download_now", StringComparison.OrdinalIgnoreCase));
.FirstOrDefault(IsConstrainedYesNoRule);
}
private static IReadOnlyList<WebSocketReply> BuildLocalNoInputReplies(
CloudSession session,
WebSocketTurnState turnState,
string? localRule)
{
var transId = turnState.TransId ?? session.LastTransId ?? string.Empty;
var effectiveRule = string.IsNullOrWhiteSpace(localRule)
? turnState.ListenRules.FirstOrDefault(IsLocalNoInputRule)
: localRule;
IReadOnlyList<string> rules = string.IsNullOrWhiteSpace(effectiveRule) ? turnState.ListenRules : [effectiveRule];
var maps = ShouldRedirectRepeatedNoInputToIdle(turnState, effectiveRule)
? ResponsePlanToSocketMessagesMapper.MapNoInputAndRedirectToSkill(transId, rules, "@be/idle")
: ResponsePlanToSocketMessagesMapper.MapNoInput(transId, rules);
return maps
.Select(map => new WebSocketReply { Text = map.Text, DelayMs = map.DelayMs })
.ToArray();
}
private static bool ShouldRedirectRepeatedNoInputToIdle(WebSocketTurnState turnState, string? localRule)
{
if (string.IsNullOrWhiteSpace(localRule))
{
turnState.LastLocalNoInputRule = null;
turnState.LocalNoInputCount = 0;
return false;
}
turnState.LocalNoInputCount = string.Equals(turnState.LastLocalNoInputRule, localRule, StringComparison.OrdinalIgnoreCase)
? turnState.LocalNoInputCount + 1
: 1;
turnState.LastLocalNoInputRule = localRule;
return turnState.LocalNoInputCount >= 2 &&
string.Equals(localRule, "create/is_it_a_keeper", StringComparison.OrdinalIgnoreCase);
}
private static bool IsYesNoRule(string rule)
{
return string.Equals(rule, "$YESNO", StringComparison.OrdinalIgnoreCase) ||
IsConstrainedYesNoRule(rule);
}
private static bool IsLocalNoInputRule(string rule)
{
return string.Equals(rule, "clock/alarm_timer_okay", StringComparison.OrdinalIgnoreCase) ||
IsConstrainedYesNoRule(rule);
}
private static bool IsConstrainedYesNoRule(string rule)
{
return string.Equals(rule, "create/is_it_a_keeper", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "shared/yes_no", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "settings/download_now_later", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "surprises-date/offer_date_fact", StringComparison.OrdinalIgnoreCase) ||
string.Equals(rule, "surprises-ota/want_to_download_now", StringComparison.OrdinalIgnoreCase);
}
private static IEnumerable<string> ReadRules(TurnContext turn, string key)


@@ -18,6 +18,8 @@ public sealed class WebSocketTurnState
public int BufferedAudioBytes { get; set; }
public List<byte[]> BufferedAudioFrames { get; } = [];
public int FinalizeAttemptCount { get; set; }
public string? LastLocalNoInputRule { get; set; }
public int LocalNoInputCount { get; set; }
public bool AwaitingTurnCompletion { get; set; }
public bool SawListen { get; set; }
public bool SawContext { get; set; }


@@ -15,7 +15,7 @@ public sealed class LocalWhisperCppBufferedAudioSttStrategy(
IsConfiguredPathAvailable(options.FfmpegPath, checkFileExists: false) &&
IsConfiguredPathAvailable(options.WhisperCliPath, checkFileExists: true) &&
IsConfiguredPathAvailable(options.WhisperModelPath, checkFileExists: true) &&
ReadBufferedAudioFrames(turn).Count > 0;
ReadBufferedAudioFrames(turn).Any(ContainsOpusIdentificationHeader);
}
public async Task<SttResult> TranscribeAsync(TurnContext turn, CancellationToken cancellationToken = default)
@@ -26,6 +26,11 @@ public sealed class LocalWhisperCppBufferedAudioSttStrategy(
throw new InvalidOperationException("Local whisper.cpp STT requires buffered websocket audio frames.");
}
if (!frames.Any(ContainsOpusIdentificationHeader))
{
throw new InvalidOperationException("Local whisper.cpp STT requires buffered Ogg/Opus audio with an Opus identification header.");
}
var tempDirectory = options.TempDirectory;
if (string.IsNullOrWhiteSpace(tempDirectory))
{
@@ -116,6 +121,11 @@ public sealed class LocalWhisperCppBufferedAudioSttStrategy(
: 0;
}
private static bool ContainsOpusIdentificationHeader(byte[] frame)
{
return frame.AsSpan().IndexOf("OpusHead"u8) >= 0;
}
private static string ExtractTranscript(string standardOutput)
{
var lines = standardOutput


@@ -1124,6 +1124,117 @@ public sealed class JiboWebSocketServiceTests
Assert.Equal("surprises-ota/want_to_download_now", rules[0].GetString());
}
[Fact]
public async Task BufferedAudio_SharedYesNoPromptWithSttFailure_AutoFinalizesAsLocalNoInput()
{
await _service.HandleMessageAsync(new WebSocketMessageEnvelope
{
HostName = "neo-hub.jibo.com",
Path = "/listen",
Kind = "neo-hub-listen",
Token = "hub-shared-yesno-noinput-token",
Text = """{"type":"LISTEN","transID":"trans-shared-yesno-noinput","data":{"rules":["shared/yes_no","globals/gui_nav","globals/mim_repeat","globals/global_commands_launch"],"asr":{"hints":["$YESNO"]}}}"""
});
await _service.HandleMessageAsync(new WebSocketMessageEnvelope
{
HostName = "neo-hub.jibo.com",
Path = "/listen",
Kind = "neo-hub-listen",
Token = "hub-shared-yesno-noinput-token",
Text = """{"type":"CONTEXT","transID":"trans-shared-yesno-noinput","data":{"topic":"conversation"}}"""
});
for (var index = 0; index < 4; index += 1)
{
var interimReplies = await _service.HandleMessageAsync(new WebSocketMessageEnvelope
{
HostName = "neo-hub.jibo.com",
Path = "/listen",
Kind = "neo-hub-listen",
Token = "hub-shared-yesno-noinput-token",
Binary = new byte[3000]
});
Assert.Empty(interimReplies);
}
var session = _store.FindSessionByToken("hub-shared-yesno-noinput-token");
Assert.NotNull(session);
session.TurnState.FirstAudioReceivedUtc = DateTimeOffset.UtcNow - TimeSpan.FromSeconds(2);
session.TurnState.LastSttError = "ffmpeg decode failed";
var replies = await _service.HandleMessageAsync(new WebSocketMessageEnvelope
{
HostName = "neo-hub.jibo.com",
Path = "/listen",
Kind = "neo-hub-listen",
Token = "hub-shared-yesno-noinput-token",
Binary = new byte[3000]
});
Assert.Equal(2, replies.Count);
Assert.Equal("LISTEN", ReadReplyType(replies[0]));
Assert.Equal("EOS", ReadReplyType(replies[1]));
using var listenPayload = JsonDocument.Parse(replies[0].Text!);
var rules = listenPayload.RootElement.GetProperty("data").GetProperty("nlu").GetProperty("rules");
Assert.Single(rules.EnumerateArray());
Assert.Equal("shared/yes_no", rules[0].GetString());
}
[Fact]
public async Task ClientAsr_CreateKeeperRepeatedNoInput_RedirectsToIdle()
{
await _service.HandleMessageAsync(new WebSocketMessageEnvelope
{
HostName = "neo-hub.jibo.com",
Path = "/listen",
Kind = "neo-hub-listen",
Token = "hub-create-noinput-token",
Text = """{"type":"LISTEN","transID":"trans-create-noinput-1","data":{"rules":["create/is_it_a_keeper","globals/gui_nav","globals/mim_repeat","globals/global_commands_launch"]}}"""
});
var firstReplies = await _service.HandleMessageAsync(new WebSocketMessageEnvelope
{
HostName = "neo-hub.jibo.com",
Path = "/listen",
Kind = "neo-hub-listen",
Token = "hub-create-noinput-token",
Text = """{"type":"CLIENT_ASR","transID":"trans-create-noinput-1","data":{}}"""
});
Assert.Equal(2, firstReplies.Count);
Assert.Equal("LISTEN", ReadReplyType(firstReplies[0]));
Assert.Equal("EOS", ReadReplyType(firstReplies[1]));
await _service.HandleMessageAsync(new WebSocketMessageEnvelope
{
HostName = "neo-hub.jibo.com",
Path = "/listen",
Kind = "neo-hub-listen",
Token = "hub-create-noinput-token",
Text = """{"type":"LISTEN","transID":"trans-create-noinput-2","data":{"rules":["create/is_it_a_keeper","globals/gui_nav","globals/mim_repeat","globals/global_commands_launch"]}}"""
});
var secondReplies = await _service.HandleMessageAsync(new WebSocketMessageEnvelope
{
HostName = "neo-hub.jibo.com",
Path = "/listen",
Kind = "neo-hub-listen",
Token = "hub-create-noinput-token",
Text = """{"type":"CLIENT_ASR","transID":"trans-create-noinput-2","data":{}}"""
});
Assert.Equal(3, secondReplies.Count);
Assert.Equal("LISTEN", ReadReplyType(secondReplies[0]));
Assert.Equal("EOS", ReadReplyType(secondReplies[1]));
Assert.Equal("SKILL_REDIRECT", ReadReplyType(secondReplies[2]));
using var redirectPayload = JsonDocument.Parse(secondReplies[2].Text!);
Assert.Equal("@be/idle", redirectPayload.RootElement.GetProperty("data").GetProperty("match").GetProperty("skillID").GetString());
}
[Fact]
public async Task ClientAsr_SurprisesDateOfferPrompt_MapsYesWithoutGlobalRuleLeak()
{


@@ -53,6 +53,30 @@ public sealed class LocalWhisperCppBufferedAudioSttStrategyTests
Assert.False(strategy.CanHandle(turn));
}
[Fact]
public void CanHandle_ReturnsFalse_WhenBufferedAudioHasNoOpusIdentificationHeader()
{
var strategy = new LocalWhisperCppBufferedAudioSttStrategy(
new BufferedAudioSttOptions
{
EnableLocalWhisperCpp = true,
FfmpegPath = "ffmpeg",
WhisperCliPath = "whisper-cli",
WhisperModelPath = "model.bin"
},
new FakeExternalProcessRunner());
var turn = new TurnContext
{
Attributes = new Dictionary<string, object?>
{
["bufferedAudioFrames"] = new[] { BuildMinimalOggPageWithoutOpusHead() }
}
};
Assert.False(strategy.CanHandle(turn));
}
[Fact]
public async Task TranscribeAsync_UsesFfmpegAndWhisperCpp_WhenConfigured()
{
@@ -119,6 +143,13 @@ public sealed class LocalWhisperCppBufferedAudioSttStrategyTests
];
}
private static byte[] BuildMinimalOggPageWithoutOpusHead()
{
var page = BuildMinimalOggPage();
"NotAudio"u8.CopyTo(page.AsSpan(28, 8));
return page;
}
private sealed class FakeExternalProcessRunner : IExternalProcessRunner
{
public List<(string FileName, IReadOnlyList<string> Arguments)> Calls { get; } = [];