Files
JiboExperiments/OpenJibo/src/Jibo.Cloud/dotnet/README.md

110 lines
3.8 KiB
Markdown
Raw Normal View History

# Jibo.Cloud.DotNet
## Summary
`Jibo.Cloud.DotNet` is the stable hosted implementation of the OpenJibo cloud.
This is the production-oriented path for restoring device connectivity and creating a foundation for future runtime, AI, and OTA work.
## Architecture
The first implementation is a modular monolith:
```text
Api -> Application -> Domain -> Infrastructure
```
This keeps deployment simple while preserving clean boundaries.
## Azure Direction
The target Azure footprint is:
- Azure App Service for HTTP and WebSocket traffic
- Azure SQL for relational persistence
- Azure Blob Storage for uploads and update artifacts
- Azure Key Vault for secrets and certificates
- Application Insights for observability
Azure SQL is the primary system of record for:
- accounts
- devices
- sessions
- update metadata
- host mappings
- bootstrap and provisioning records
## Compatibility Goal
The first compatibility milestone is `core revive`.
That means the .NET cloud should handle:
- token and session issuance
- account and robot identity flows needed for startup
- core `X-Amz-Target` dispatch
- listen and proactive WebSocket paths
- basic media and update metadata responses
- handoff into normalized `TurnContext` and `ResponsePlan` contracts
## Relationship To The Node Prototype
The Node server remains the discovery harness and fixture source.
The .NET implementation should:
- copy observed behavior where needed
- use fixtures captured from Node and real robots
- avoid speculative protocol design
2026-04-11 21:50:26 -05:00
- separate HTTP parity, websocket parity, and future discovery work so coverage stays honest
## Current State
This folder now contains the first hosted scaffold, not just a README.
The intent is to grow from a runnable dev monolith into the real Azure deployment target without abandoning the existing abstractions work.
2026-04-11 21:50:26 -05:00
Current websocket scope is still intentionally narrow:
- token-backed socket sessions
2026-04-12 08:31:33 -05:00
- explicit websocket turn-state tracking separate from long-lived cloud session state
2026-04-11 21:50:26 -05:00
- synthetic `LISTEN` result shaping for `LISTEN`, `CLIENT_NLU`, and `CLIENT_ASR`
2026-04-11 22:11:08 -05:00
- buffered audio state tracking behind a dedicated turn-finalization layer
2026-04-15 14:33:43 -05:00
- raw audio auto-finalization once `LISTEN` + `CONTEXT` + minimum buffered audio thresholds are present
2026-04-11 22:11:08 -05:00
- synthetic STT strategy selection for fixture-driven audio turn completion
- structured websocket telemetry and live-run fixture export
2026-04-11 21:50:26 -05:00
- `CONTEXT` capture and follow-up turn state
- `EOS` completion
- first skill vertical for joke/chat `SKILL_ACTION` playback
2026-04-15 11:58:58 -05:00
- repo-root live-run capture support for both `captures/http/` and `captures/websocket/`
2026-04-11 21:50:26 -05:00
Not yet covered:
- real binary audio / ASR finalization parity
2026-04-11 22:11:08 -05:00
- provider-backed ASR integration
2026-04-12 08:31:33 -05:00
- timed finalize/fallback behavior matching richer Node turn-state semantics
2026-04-11 21:50:26 -05:00
- upstream Nimbus or broader skill lifecycle behavior
- animation / expression command families
- ESML feature parity beyond the narrow synthetic playback payloads used in the current scaffold
2026-04-15 11:58:58 -05:00
## Live Capture Status
The first real `.NET` robot test has confirmed:
- startup HTTP traffic reaches the `.NET` cloud
- `Notification.NewRobotToken` is in the active startup path
- `api-socket.jibo.com` connections are being accepted live
It has not yet confirmed:
- full startup parity with the successful Node run cadence
- consistent eye-open / wake completion on the robot
- the later health/log upload sequence currently seen in the working Node run
2026-04-15 14:33:43 -05:00
Current raw-audio behavior is still a compatibility bridge:
- if buffered audio has a synthetic transcript hint, the server now auto-finalizes the turn and emits `LISTEN` + `EOS` + `SKILL_ACTION`
- if buffered audio crosses the finalize threshold without a usable transcript, the server now emits a Node-style fallback completion with `EOS` instead of hanging the turn forever
- this is intentionally not a claim of real ASR parity