This plan is the repeatable live regression checklist for OpenJibo Cloud releases.
Use [live-jibo-test-runbook.md](live-jibo-test-runbook.md) for the environment setup and capture mechanics. Use this file for what to test once the robot is connected and the hosted `.NET` cloud is running.
The goal is to reduce trial-and-error cycles: every live pass should prove the release theme, keep prior working paths warm, and produce enough evidence to separate payload bugs, local robot behavior, and STT quality issues.
## When To Run
Run this plan:
- after the last code change before calling a release complete
- after any fix that touches websocket turn finalization, local skill redirects, constrained yes/no, or STT
- before moving from `1.0.18` bug-fix closeout into `1.0.19` feature work
- after the Test 26 and Test 27 fixes; at minimum, run the focused cloud-version, alarm/timer, photo/gallery, stop, volume, and blue-ring cleanup sections before deciding whether `1.0.18` is ready to freeze
For small feature slices, run the automated `.NET` tests plus the smoke checks and only the live sections that share the same machinery. Before release closeout, run the full current-release suite.
## Required Evidence
For each live pass, keep these artifacts together under a named test folder such as `artifact-output/jibo-test-N`:
- `.NET` console logs
- websocket captures and fixture exports
- HTTP captures when startup, update, backup, media, or upload paths are involved
- robot runtime logs pulled after the session
- operator notes with exact phrases attempted and visible robot/menu state
Record failures with the observed transcript, active listen rules, emitted websocket response shape, and whether the robot menu state agreed with the cloud response.
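One way to keep failure notes consistent across passes is a small structured record. The sketch below is only an illustration: the `FailureRecord` class, its field names, and the sample values are assumptions made for this example, not an existing project schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class FailureRecord:
    """One failed live step. Field names are illustrative, not a project schema."""

    phrase: str                       # exact phrase the operator attempted
    observed_transcript: str          # what STT actually produced
    active_listen_rules: List[str] = field(default_factory=list)
    response_shape: List[str] = field(default_factory=list)  # message types, in order
    menu_agreed_with_cloud: Optional[bool] = None  # None means "not observed"
    notes: str = ""

    def to_json(self) -> str:
        """Serialize the record so it can be saved next to the other artifacts."""
        return json.dumps(asdict(self), indent=2)


# Illustrative values only; replace with what was actually observed.
record = FailureRecord(
    phrase="set alarm for 743",
    observed_transcript="seven",
    active_listen_rules=["clock/alarm_set_value"],
    response_shape=["LISTEN", "EOS"],
    menu_agreed_with_cloud=False,
)
print(record.to_json())
```

Saving one such JSON file per failure inside the test folder keeps the transcript, rules, response shape, and menu state together for later comparison.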
## Release Gates
A release is not ready until these are true or explicitly deferred in [development-plan.md](development-plan.md):
- focused `.NET` cloud tests pass
- the running robot reports the expected cloud version both by voice and through `/health`
3. Confirm the robot is not in a local connection-lost state; if logs show `Q4-Server_connection_lost` or a fresh `jibo-server-service` reconnect, wait for it to clear before scoring voice behavior.
4. Ask `cloud version`; confirm Jibo speaks the same version using `Cloud version ...` wording and does not stop itself, follow with `Cloudford`, `I heard...`, a local `@be/surprises` handoff, or another generic tail reply.
Stop and fix environment issues if startup, websocket connection, or capture output is not clean.
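The `cloud version` reply check can be scored mechanically once the spoken text is copied out of the logs. This is a minimal sketch: the helper name `check_cloud_version_reply` is hypothetical, and it assumes the logged reply contains the version in the same dotted form as the expected string. It only covers the wording and the text tails; the `@be/surprises` handoff still has to be checked in the captures.

```python
from typing import List

# Tail phrases the plan says must not follow the version reply.
FORBIDDEN_TAILS = ("cloudford", "i heard")


def check_cloud_version_reply(spoken: str, expected_version: str) -> List[str]:
    """Return a list of problems found in the spoken reply (empty means clean)."""
    problems = []
    text = spoken.strip().lower()
    if not text.startswith("cloud version"):
        problems.append("reply does not use 'Cloud version ...' wording")
    if expected_version.lower() not in text:
        problems.append(f"expected version {expected_version} not spoken")
    for tail in FORBIDDEN_TAILS:
        if tail in text:
            problems.append(f"unexpected tail phrase: {tail!r}")
    return problems


# Example with an illustrative reply string.
print(check_cloud_version_reply("Cloud version 1.0.18.", "1.0.18"))
```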
## Current `1.0.18` Regression Suite
### Radio
Goal: keep the local radio redirect path proven.
- Say `open the radio`.
- Say `play country music`.
- Expected: Jibo opens or resumes the radio locally, and the country phrase carries a `Country` station entity.
- Capture check: websocket output should be local `SKILL_REDIRECT` plus silent completion, not generic chat speech.
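
The radio capture check could be scripted against an exported websocket capture. The sketch below assumes each captured message has already been parsed into a dict with a `type` key, and that cloud speech arrives as `SKILL_ACTION`; the real capture layout may differ, and `is_silent_local_redirect` is a hypothetical helper name.

```python
from typing import Dict, List


def is_silent_local_redirect(messages: List[Dict]) -> bool:
    """True when the turn redirected locally and emitted no cloud speech.

    Assumes messages are dicts with a "type" key (an assumption about the
    capture format, not a documented schema).
    """
    types = [message.get("type") for message in messages]
    return "SKILL_REDIRECT" in types and "SKILL_ACTION" not in types


# Illustrative turn: redirect followed by a silent end-of-stream.
turn = [{"type": "LISTEN"}, {"type": "SKILL_REDIRECT"}, {"type": "EOS"}]
print(is_silent_local_redirect(turn))
```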
### News
Goal: keep the Nimbus-shaped cloud skill path proven.
- Say `tell me the news`.
- Expected: Jibo plays the current synthetic quick brief.
- Capture check: `LISTEN` match includes `cloudSkill = news`, followed by a `news` `SKILL_ACTION`.
- Current limitation: provider-backed and category-expanded headlines are deferred unless selected as the optional feature slice.
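
The ordering in the news capture check (a `LISTEN` whose match names the news cloud skill, then a `SKILL_ACTION`) can be verified with a short scan. This sketch assumes parsed messages shaped like `{"type": ..., "match": {"cloudSkill": ...}}`; that nesting is a guess at the capture layout, and `news_path_ok` is a hypothetical helper. It does not confirm that the `SKILL_ACTION` itself belongs to news, so that part still needs a manual look.

```python
from typing import Dict, List


def news_path_ok(messages: List[Dict]) -> bool:
    """True when a news-matched LISTEN is later followed by a SKILL_ACTION."""
    listen_index = None
    for index, message in enumerate(messages):
        match = message.get("match") or {}
        if message.get("type") == "LISTEN" and match.get("cloudSkill") == "news":
            listen_index = index
            break
    if listen_index is None:
        return False
    # Only messages after the matching LISTEN count as "followed by".
    later = messages[listen_index + 1:]
    return any(message.get("type") == "SKILL_ACTION" for message in later)


capture = [
    {"type": "LISTEN", "match": {"cloudSkill": "news"}},
    {"type": "SKILL_ACTION"},
]
print(news_path_ok(capture))
```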
### Backup, OTA, And Share Yes/No
Goal: prove constrained yes/no prompts stay local and do not leak global launch rules.
- Trigger the update menu path when available and answer one short `yes` or `no` prompt.
- Exercise any available share/date/offer yes-no prompt and answer both `yes` and `no` across runs when practical.
- Observe backup-in-progress behavior separately from explicit voice commands.
- Do not treat a spoken `take a backup` failure as proof of the backup scheduler path; that command is not currently wired as a hosted-cloud voice feature.
- If the update menu reports backup-in-progress, record whether HTTP captures include any `Backup_*` targets; current evidence points to robot-local scheduler/status or log/upload load unless those calls appear.
- If Jibo announces backup-in-progress without update-menu interaction, note the local skill in robot logs; Tests 26 and 27 showed `@be/surprises-ota`, Test 28 showed the preceding `@be/surprises` router opening after Nimbus, and Test 30 showed gallery settling into `@be/surprises` -> `@be/surprises-ota`.
- Test 31 added a startup `Backup_20170222.List` request before the first voice turn, so if the warning returns, capture that startup backup-status traffic alongside the later surprise handoff.
- If the warning appears soon after startup or update, check for local `jibo-server-service` restart, notification reconnect, or `Q4-Server_connection_lost` before scoring it as a hosted backup defect.
- Expected: short `yes`/`no` replies map locally, empty replies are handled locally as no-input, and backup/download notifications are not repeatedly re-announced once acknowledged.
- Capture check: active rule remains the constrained rule such as `surprises-ota/want_to_download_now`, `settings/download_now_later`, `shared/yes_no`, or another stock prompt rule; ordinary Nimbus/cloud/local turns should not transition into `@be/surprises` after completion.
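
To spot leaked global launch rules during a constrained prompt, the active rules recorded for a turn can be compared against the constrained rules named above. This is a sketch under assumptions: `unexpected_rules` is a hypothetical helper, and because other stock prompt rules are also acceptable, the allow-list can be extended per run through `extra_allowed`.

```python
from typing import Iterable, List

# Constrained rules named in this plan; other stock prompt rules may also be valid.
KNOWN_CONSTRAINED_RULES = {
    "surprises-ota/want_to_download_now",
    "settings/download_now_later",
    "shared/yes_no",
}


def unexpected_rules(active_rules: Iterable[str], extra_allowed: Iterable[str] = ()) -> List[str]:
    """Return active rules that are not on the constrained allow-list."""
    allowed = KNOWN_CONSTRAINED_RULES | set(extra_allowed)
    return sorted(rule for rule in active_rules if rule not in allowed)


# "launch" is a made-up rule name standing in for a leaked global rule.
print(unexpected_rules(["shared/yes_no", "launch"]))
```

An empty result means only constrained rules were active; anything listed deserves a note in the operator log.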
Goal: prove the clock skill behaves locally and menu state agrees after the `jibo test 24` fixes.
Start from a known state. If an alarm already exists, record it and clear it through the menu or a controlled voice delete before beginning.
Test these paths:
- explicit set: `set an alarm for 7:43 AM`, adjusted to a near-future time during the actual run
- compact set: `set alarm for 743`, adjusted to a near-future time during the actual run
- clarification: `set an alarm`, then answer the value prompt with a short time such as `7 44` or `7, 44`
- replacement: with an alarm already set, set a different alarm and answer the replacement prompt; verify whether the answer kept or replaced the old alarm
- value-prompt cancel: `set an alarm`, then say `cancel`
- voice delete: `delete my alarm` or `cancel alarm`
- voice delete variants from Test 26: `delete the alarm`, `delete alarm`, and, if ASR mishears it, record whether `delete along` maps to local clock delete
- timer sanity: `set a timer for 10 seconds`, let it fire or record the exact remaining state, then verify a second timer request does not report a stale already-running timer
- STT sanity: if a short alarm time collapses to a shorter transcript such as `seven`, capture that as STT loss; Test 31's `7:11 AM` attempt collapsed to `7:00 PM`
- value replies under `clock/alarm_set_value` or `clock/timer_set_value` also return local `LISTEN`/`EOS` only; a delayed `@be/clock` relaunch after the local clock skill consumes the reply is a regression
- after a delete/replacement `No`, the robot should not remain in a continuous listen loop or open `@be/surprises` unless the stock OS explicitly takes that route
- when gallery is empty and asks whether to take a picture, verify whether a local `shared/yes_no` or equivalent `LISTEN` appears and whether the blue ring visually opens for voice input
- absolute volume emits `nlu.intent = volumeToValue` and `entities.volumeLevel` matching the requested value, including the observed `Set Volume 2-6.` homophone shape, with no `SKILL_ACTION` cloud speech
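
The absolute-volume expectation can be reduced to three comparisons once the intent, the `volumeLevel` entity, and the emitted message types have been read from the capture. The sketch below takes those values as plain arguments because the exact message nesting is not specified here; `volume_turn_problems` is a hypothetical helper name.

```python
from typing import List


def volume_turn_problems(intent, volume_level, requested_level, message_types) -> List[str]:
    """Return problems with an absolute-volume turn (empty means it passed)."""
    problems = []
    if intent != "volumeToValue":
        problems.append(f"intent was {intent!r}, expected 'volumeToValue'")
    # Compare as strings so 6 and "6" are treated as the same level.
    if str(volume_level) != str(requested_level):
        problems.append(f"volumeLevel {volume_level!r} does not match requested {requested_level!r}")
    if "SKILL_ACTION" in message_types:
        problems.append("cloud speech (SKILL_ACTION) was emitted")
    return problems


# Illustrative values for a request to set the volume to 6.
print(volume_turn_problems("volumeToValue", "6", 6, ["LISTEN", "EOS"]))
```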
Goal: catch the Test 26 no-`LISTEN` buffering regression, the Test 27 diagnostic speech-tail regression, and the Test 28 unsuppressed end-of-skill surprise handoff quickly.
- After `cloud version`, wait five to ten seconds and confirm there is no fresh no-transcript hotphrase launch `LISTEN` that turns speech tail into generic chat.
- Confirm ordinary hosted replies and local redirects carry `match.skipSurprises = true`.
- Expected: binary audio for an existing transID is ignored until a fresh valid `LISTEN` appears; blank hotphrase turns clear instead of buffering indefinitely; diagnostic speech tails do not reopen launch listens; settled turns do not open `@be/surprises` / `@be/surprises-ota`; a delete/replacement `No` should not strand the robot in a blue-ring listen loop.
- Expected: a proactive yes/no prompt such as Word of the Day should consume `yes`/`no` without echoing the answer back or leaving the robot listening.
- Capture check: long-running context-only transactions should not accumulate buffered audio chunks or stay `AwaitingTurnCompletion = true`; a late ignored diagnostic `LISTEN` may appear as cleanup telemetry but should not set `SawListen` or buffer audio; normal cloud/local completions should not be followed by a BE surprise router request.
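
The `match.skipSurprises = true` expectation lends itself to a quick scan over a capture. This sketch assumes parsed messages are dicts and that the flag sits inside a `match` object on the message; messages without a `match` object are skipped rather than flagged. `missing_skip_surprises` is a hypothetical helper name.

```python
from typing import Dict, List


def missing_skip_surprises(messages: List[Dict]) -> List[int]:
    """Return indexes of messages whose match object lacks skipSurprises = true."""
    flagged = []
    for index, message in enumerate(messages):
        match = message.get("match")
        if isinstance(match, dict) and match.get("skipSurprises") is not True:
            flagged.append(index)
    return flagged


# Illustrative capture: the second matched message is missing the flag.
capture = [
    {"type": "LISTEN", "match": {"skipSurprises": True}},
    {"type": "LISTEN", "match": {"cloudSkill": "news"}},
    {"type": "EOS"},
]
print(missing_skip_surprises(capture))
```

Any flagged index is a candidate cause when a settled turn is followed by a `@be/surprises` or `@be/surprises-ota` request.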