Stub in framework for new .net Open Jibo cloud
@@ -1,439 +1,89 @@
|
||||
# Hybrid Jibo Runtime Plan
|
||||
# OpenJibo
|
||||
|
||||
## Goal
|
||||
## Summary
|
||||
|
||||
Build a **modern local-first Jibo runtime** while preserving the parts of native Jibo that are still useful:
|
||||
OpenJibo is the working revival track for Jibo.
|
||||
|
||||
* native wake/turn plumbing where helpful
|
||||
* native skills where helpful
|
||||
* native embodiment and rendering
|
||||
* fast experimentation in **.NET 10** off-robot
|
||||
The near-term plan is intentionally concrete:
|
||||
|
||||
Jibo’s native runtime already exposes a layered service model centered around **Jetstream** for turn/event flow, **GlobalManagerService** for routing, **SkillsService** for skill lifecycle, and **ExpressionService** for embodiment/rendering. The SSM startup is config-driven and mode-driven, which suggests a hybrid mode is a viable path.
|
||||
1. Build a stable replacement cloud on Azure.
|
||||
2. Use the existing Node prototype as the protocol oracle and capture harness.
|
||||
3. Port the hosted implementation to .NET as a modular monolith.
|
||||
4. Bring real robots online first through RCM plus controlled DNS/TLS patching.
|
||||
5. Use OTA later to reduce setup friction once the hosted cloud is proven.
|
||||
|
||||
---
|
||||
This keeps the project grounded in what is already working while moving toward a maintainable hosted platform.
|
||||
|
||||
## Architecture Direction
|
||||
## Current Truth
|
||||
|
||||
We will keep the **main experimental runtime in .NET 10** and treat Jibo as an embodied endpoint with a thin bridge layer.
|
||||
The repo now has three distinct lanes:
|
||||
|
||||
That means:
|
||||
- `src/Jibo.Cloud/node`
|
||||
The discovery server. This is the best source of observed protocol behavior today.
|
||||
- `src/Jibo.Cloud/dotnet`
|
||||
The long-term hosted implementation. This is where the stable cloud is being built.
|
||||
- `src/Jibo.Runtime.Abstractions`
|
||||
The normalized runtime seam between robot/cloud traffic and modern conversation logic.
|
||||
|
||||
* **off-robot**: conversation logic, planning, AI routing, capabilities
|
||||
* **on-robot**: thin adapter/bridge to native Jibo services
|
||||
* **native Jibo**: reuse rendering, skill hosting, and useful event seams
|
||||
|
||||
---
|
||||
|
||||
## High-Level ASCII Flowchart
|
||||
The key architectural idea is:
|
||||
|
||||
```text
+--------------------------------------------------------------+
|                      NATIVE JIBO LAYER                       |
|--------------------------------------------------------------|
| Wake / Turn Events                                           |
|   - Jetstream                                                |
|   - hjHeard / turn started / turn result                     |
|                                                              |
| Native Services                                              |
|   - GlobalManagerService                                     |
|   - SkillsService                                            |
|   - ExpressionService                                        |
|   - TTS / Body / Visual / Motion services                    |
+------------------------------+-------------------------------+
                               |
                               | events / hooks / commands
                               v
+--------------------------------------------------------------+
|                      JIBO BRIDGE LAYER                       |
|--------------------------------------------------------------|
| Thin adapter between Jibo and modern runtime                 |
|                                                              |
| Responsibilities:                                            |
|   - receive turn/wake events                                 |
|   - receive skill context / native state                     |
|   - forward normalized events to .NET runtime                |
|   - accept ResponsePlans / commands from .NET runtime        |
|   - invoke native skills / expression / TTS / visuals        |
+------------------------------+-------------------------------+
                               |
                               | normalized turn context
                               v
+--------------------------------------------------------------+
|                    MODERN .NET 10 RUNTIME                    |
|--------------------------------------------------------------|
| Conversation Broker                                          |
|   - session state                                            |
|   - follow-up windows                                        |
|   - topic/context tracking                                   |
|                                                              |
| STT Strategy Selector                                        |
|   - native transcript                                        |
|   - local STT                                                |
|   - cloud STT                                                |
|                                                              |
| Brain Strategy Selector                                      |
|   - skill/rules path                                         |
|   - local AI                                                 |
|   - cloud AI                                                 |
|   - hybrid routing                                           |
|                                                              |
| Action / Orchestration Planner                               |
|   - gestures / visuals / ESML / delegation                   |
|   - capability/tool calls                                    |
|   - build final ResponsePlan                                 |
|                                                              |
| Capability Registry                                          |
|   - weather / time / reminders / tools                       |
|   - native skill delegation                                  |
|   - robot expression helpers                                 |
+------------------------------+-------------------------------+
                               |
                               | ResponsePlan / commands
                               v
+--------------------------------------------------------------+
|                      EXECUTION TARGETS                       |
|--------------------------------------------------------------|
| - Native SkillsService                                       |
| - Native ExpressionService                                   |
| - Native TTS / visuals / motion                              |
| - Local AI backends                                          |
| - Cloud AI backends                                          |
| - External APIs / tools                                      |
+--------------------------------------------------------------+

Jibo device -> OpenJibo cloud -> normalized runtime contracts -> capabilities and planning
```
|
||||
|
||||
---
|
||||
## First Supported Device Path
|
||||
|
||||
## Runtime Flow
|
||||
The first supported recovery path is enthusiast-friendly, not zero-touch:
|
||||
|
||||
```text
        [Wake Word / Turn / Follow-up]
                       |
                       v
             [Jibo Native Events]
                       |
                       v
              [Jibo Bridge Layer]
                       |
                       v
         [Conversation Broker (.NET)]
                       |
                       v
           [STT Strategy Selection]
                       |
                       v
          [Brain Strategy Selection]
                 /     |     \
            /          |          \
   [Skill/Rules]  [Local AI]   [Cloud AI]
            \          |          /
                 \     |     /
                   [Planner]
                       |
                       v
             [ResponsePlan Built]
                       |
                       v
              [Jibo Bridge Layer]
                       |
                       v
[Skills / Expression / TTS / Motion / Visuals]
                       |
                       v
         [Follow-up Window or Timeout]

QR Wi-Fi -> controlled router/DNS -> redirect legacy Jibo hosts ->
RCM/device patch for TLS and host acceptance -> OpenJibo cloud on Azure
```
|
||||
|
||||
---
|
||||
That path is documented in [docs/device-bootstrap.md](/docs/device-bootstrap.md).
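
The redirect step in this path comes down to answering DNS queries for the legacy hostnames with the address of the replacement cloud. The sketch below generates override lines in dnsmasq's `address=/host/ip` form. It rests on several assumptions: that the controlled router runs dnsmasq (the plan does not say), that the hostnames and the `203.0.113.10` address are placeholders rather than captured values, and that the function name is invented for illustration. It is not a description of what `Generate-JiboDnsOverrides.ps1` does.

```python
import ipaddress

# Hypothetical legacy hostnames. The real list would come from protocol captures.
LEGACY_HOSTS = [
    "api.legacy-jibo.example",
    "ws.legacy-jibo.example",
    "API.legacy-jibo.example",  # duplicate in different case, to show de-duplication
]


def dnsmasq_overrides(hosts, target_ip):
    """Return dnsmasq 'address=' lines that resolve each host to target_ip."""
    ip = ipaddress.ip_address(target_ip)  # raises ValueError on a malformed address
    # Normalize case, drop blanks, de-duplicate, and sort for stable output.
    unique_hosts = sorted({h.strip().lower() for h in hosts if h.strip()})
    return [f"address=/{host}/{ip}" for host in unique_hosts]


if __name__ == "__main__":
    for line in dnsmasq_overrides(LEGACY_HOSTS, "203.0.113.10"):
        print(line)
```

In dnsmasq, an `address=/domain/ip` entry also answers for subdomains of the named domain, so a real override list would likely need one entry per exact host rather than per parent domain.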
|
||||
|
||||
## Planned Hybrid Mode

Jibo’s startup and service composition are mode-driven and config-driven, so the long-term plan is to add a **new custom mode** rather than replacing stock behavior outright.

### Candidate mode names

* `hybrid`
* `openjibo`
* `revival`
* `local-first`

### Intent of the mode

The custom mode should:

* preserve normal mode for stock behavior
* preserve developer mode for native debugging
* enable the bridge/runtime path for hybrid experiments
* allow selective routing between old and new Jibo behavior

---

## Design Principles

### 1. Keep Jibo-specific code at the edges

The .NET runtime should know about:

* turns
* sessions
* plans
* capabilities
* render actions

It should **not** depend directly on:

* Electron internals
* SSM implementation quirks
* old Linux deployment constraints

### 2. Reuse native embodiment

Native Jibo rendering is valuable. ExpressionService appears to own animation, attention, DOF arbitration, and embodied output, so it should be reused as long as possible.

### 3. Replace cognition before replacing embodiment

The first thing to modernize is:

* routing
* planning
* AI selection
* follow-up conversation behavior

Not necessarily:

* body motion
* TTS
* expression plumbing

### 4. Favor thin robot-side code

The bridge on Jibo should stay small and stable. Fast-moving logic belongs in .NET 10.

### 5. Everything should converge to a ResponsePlan

Regardless of source:

* skill
* rules engine
* local AI
* cloud AI

the result should become a single normalized response/output plan.

---

## Native Jibo Mapping
|
||||
|
||||
Based on current reverse engineering, the native service boundaries map roughly like this: Jetstream is the turn/event seam, GlobalManagerService performs routing and skill-launch logic, SkillsService manages skill lifecycle, and ExpressionService handles embodiment/rendering.
|
||||
## Repo Map
|
||||
|
||||
```text
|
||||
Our Concept Native Jibo Equivalent
|
||||
---------------------------- --------------------------------
|
||||
Wake / Turn Source Jetstream
|
||||
Conversation Broker split across Jetstream + routing
|
||||
Brain Selection GlobalManagerService + skills
|
||||
Skill Execution SkillsService
|
||||
Renderer / Embodiment ExpressionService
|
||||
OpenJibo/
|
||||
docs/
|
||||
device-bootstrap.md
|
||||
protocol-inventory.md
|
||||
public-site-plan.md
|
||||
support-tiers.md
|
||||
|
||||
scripts/bootstrap/
|
||||
Discover-JiboHosts.ps1
|
||||
Generate-JiboDnsOverrides.ps1
|
||||
Test-OpenJiboRouting.ps1
|
||||
|
||||
src/
|
||||
Jibo.Cloud/
|
||||
node/
|
||||
dotnet/
|
||||
Jibo.Runtime.Abstractions/
|
||||
Playground/
|
||||
OpenJibo.Site/
|
||||
```
|
||||
|
||||
---
|
||||
## Decisions Locked In
|
||||
|
||||
## Proposed Project Layout
|
||||
- The first milestone is `core revive`, not full protocol parity.
|
||||
- Azure SQL is the relational system of record for the hosted cloud.
|
||||
- Billing and donations are future-compatible concerns, not phase-one delivery requirements.
|
||||
- OTA is a phase-two simplification strategy, not the initial dependency.
|
||||
|
||||
```text
|
||||
/src
|
||||
/Jibo.Runtime
|
||||
Core runtime orchestration
|
||||
- ConversationBroker
|
||||
- Session state
|
||||
- Turn pipeline
|
||||
- ResponsePlan builder
|
||||
## Near-Term Work
|
||||
|
||||
/Jibo.Runtime.Abstractions
|
||||
Interfaces and models
|
||||
- ITurnSource
|
||||
- ISttStrategy
|
||||
- IBrainStrategy
|
||||
- IResponsePlanner
|
||||
- IRobotAdapter
|
||||
- TurnContext
|
||||
- ResponsePlan
|
||||
- port required endpoint and WebSocket behavior from Node to .NET
|
||||
- keep protocol captures and replay fixtures current
|
||||
- harden device bootstrap documentation and scripts
|
||||
- stand up the initial `openjibo.com` information site
|
||||
|
||||
/Jibo.Bridge
|
||||
Jibo adapter / compatibility layer
|
||||
- robot event ingestion
|
||||
- command dispatch back to Jibo
|
||||
- native hook integration
|
||||
## Important Docs
|
||||
|
||||
/Jibo.Brain.Rules
|
||||
deterministic routing / skills / decision tree
|
||||
|
||||
/Jibo.Brain.Local
|
||||
local AI experiments
|
||||
|
||||
/Jibo.Brain.Cloud
|
||||
cloud AI experiments
|
||||
|
||||
/Jibo.Capabilities
|
||||
tools and callable capabilities
|
||||
- weather
|
||||
- time
|
||||
- reminders
|
||||
- skill delegation
|
||||
- expression helpers
|
||||
|
||||
/Jibo.Simulator
|
||||
fake robot target for testing ResponsePlans
|
||||
|
||||
/docs
|
||||
architecture
|
||||
notes
|
||||
traces
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Initial Build Plan

### Phase 1: Contracts and runtime skeleton

Build the core models and interfaces first:

* `TurnContext`
* `ConversationSession`
* `SttResult`
* `BrainDecision`
* `ResponsePlan`
* `RenderAction`
* `FollowupPolicy`
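
As a rough illustration of how the Phase 1 models might relate to one another, here is a minimal sketch. It is written in Python only to keep it short; the plan targets .NET. The type names come from the list above, but every field, default, and string label is an assumption, and `plan_from_decision` is a hypothetical helper. `ConversationSession` is left out because it belongs with the broker.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class SttResult:
    text: str
    confidence: float  # assumed 0.0-1.0 scale
    source: str        # assumed labels: "native", "local", "cloud"


@dataclass(frozen=True)
class TurnContext:
    session_id: str
    turn_id: int
    stt: Optional[SttResult] = None


@dataclass(frozen=True)
class BrainDecision:
    strategy: str      # assumed labels: "rules", "local_ai", "cloud_ai"
    reply_text: str
    expects_followup: bool = False


@dataclass(frozen=True)
class RenderAction:
    kind: str          # e.g. "speak", "gesture", "visual"
    payload: str


@dataclass(frozen=True)
class FollowupPolicy:
    keep_listening: bool = False
    timeout_seconds: float = 0.0


@dataclass(frozen=True)
class ResponsePlan:
    turn_id: int
    actions: Tuple[RenderAction, ...] = ()
    followup: FollowupPolicy = field(default_factory=FollowupPolicy)


def plan_from_decision(ctx: TurnContext, decision: BrainDecision,
                       followup_seconds: float = 8.0) -> ResponsePlan:
    """Normalize a decision from any brain into the single ResponsePlan shape."""
    timeout = followup_seconds if decision.expects_followup else 0.0
    policy = FollowupPolicy(decision.expects_followup, timeout)
    return ResponsePlan(ctx.turn_id, (RenderAction("speak", decision.reply_text),), policy)
```

Keeping the models immutable means a plan cannot change between the planner and the bridge, which makes captured plans easier to replay in tests.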

### Phase 2: Minimal broker

Implement:

* session open/close
* follow-up timeout
* topic/context tracking
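
The three broker responsibilities above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the planned .NET implementation: the 8-second window, the field names, and the rule that a turn without a wake word is ignored when no window is open are all illustrative choices. The clock is injected so the timeout can be tested without sleeping.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ConversationSession:
    session_id: str
    topic: Optional[str] = None
    followup_deadline: float = 0.0
    turns: int = 0


class ConversationBroker:
    def __init__(self, followup_seconds: float = 8.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._followup_seconds = followup_seconds
        self._clock = clock
        self._session: Optional[ConversationSession] = None
        self._next_id = 1

    def on_turn(self, wake_word: bool,
                topic: Optional[str] = None) -> Optional[ConversationSession]:
        """Route one turn to a session, or return None if it should be ignored."""
        now = self._clock()
        # Close the session once its follow-up window has expired.
        if self._session is not None and now > self._session.followup_deadline:
            self._session = None
        if self._session is None:
            if not wake_word:
                return None  # no open window and no wake word
            self._session = ConversationSession(session_id=f"s{self._next_id}")
            self._next_id += 1
        session = self._session
        session.turns += 1
        if topic is not None:
            session.topic = topic  # latest topic wins; earlier topic is kept otherwise
        session.followup_deadline = now + self._followup_seconds
        return session
```

Each accepted turn pushes the deadline forward, so a conversation stays open as long as follow-ups keep arriving inside the window.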

### Phase 3: Bridge skeleton

Create the adapter boundary for:

* inbound Jibo events
* outbound robot commands

Even if the first version is mocked, keep the interface stable.
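
One way to keep the boundary stable while the first version is mocked is to code the runtime against a small interface and swap the implementation later. The sketch below is loosely modeled on the `IRobotAdapter` name from the proposed layout. The method names, the dictionary event and command shapes, and the `echo_runtime` function are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Protocol

Event = Dict[str, str]
Command = Dict[str, str]


class RobotAdapter(Protocol):
    """Boundary between the runtime and the robot: events in, commands out."""

    def subscribe(self, handler: Callable[[Event], None]) -> None: ...

    def send_command(self, command: Command) -> None: ...


class MockRobotAdapter:
    """In-memory stand-in for the robot; records every command it receives."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[Event], None]] = []
        self.sent: List[Command] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._handlers.append(handler)

    def send_command(self, command: Command) -> None:
        self.sent.append(command)

    def emit(self, event: Event) -> None:
        """Test helper: simulate an inbound robot event."""
        for handler in self._handlers:
            handler(event)


def echo_runtime(adapter: RobotAdapter) -> None:
    """Toy runtime: answer each turn result with a speak command."""

    def on_event(event: Event) -> None:
        if event.get("type") == "turn_result":
            text = "You said: " + event.get("transcript", "")
            adapter.send_command({"type": "speak", "text": text})

    adapter.subscribe(on_event)
```

Because `echo_runtime` depends only on the two interface methods, a real bridge can later replace `MockRobotAdapter` without changes to the runtime side.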

### Phase 4: First working path

Implement a narrow vertical slice:

* input turn
* decision/rules path
* weather example
* TTS response
* follow-up window

### Phase 5: Native integration expansion

Add native delegation for:

* skills
* expression
* visuals
* gestures
* local turn/open follow-up behavior

### Phase 6: Hybrid AI routing

Add:

* local AI path
* cloud AI path
* confidence/routing policy
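
A confidence/routing policy for Phase 6 could start as a plain threshold cascade that tries the cheapest local option first. The sketch below is one possible shape, not a settled design. The thresholds, the idea that each path reports a confidence score, and the strategy labels are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingPolicy:
    rules_threshold: float = 0.8  # assumed cut-off for the deterministic path
    local_threshold: float = 0.5  # assumed cut-off for the local AI path
    cloud_allowed: bool = True    # False models an offline or privacy-first setup


def choose_brain(rules_conf: float, local_conf: float,
                 policy: RoutingPolicy = RoutingPolicy()) -> str:
    """Pick a strategy label, preferring local options before the cloud."""
    if rules_conf >= policy.rules_threshold:
        return "rules"
    if local_conf >= policy.local_threshold:
        return "local_ai"
    if policy.cloud_allowed:
        return "cloud_ai"
    # Cloud is unavailable: fall back to whichever local option scored higher.
    return "local_ai" if local_conf >= rules_conf else "rules"
```

For example, `choose_brain(0.2, 0.3)` falls through both local thresholds and selects the cloud path, while the same scores with `RoutingPolicy(cloud_allowed=False)` stay on the robot's side of the network.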

---

## First Vertical Slice

Recommended first demonstration:

### Example

User says:

> Hey Jibo, what’s the weather?

System flow:

1. Jibo event arrives through bridge
2. .NET broker opens a session
3. transcript enters routing
4. weather capability is called
5. planner builds a `ResponsePlan`
6. bridge sends speech + visual action back to Jibo
7. follow-up window stays open

Then:

> What about the low tonight?

The same session stays active without wake word if the follow-up window is still open.

---

## Near-Term Questions to Answer

* What is the cleanest robot-side bridge seam:

  * Jetstream hook
  * skill hook
  * local service calls
  * mixed approach

* What is the smallest command set needed to drive Jibo usefully:

  * speak
  * gesture
  * visual
  * launch skill
  * keep listening

* Which pieces should remain native the longest:

  * expression
  * skill hosting
  * turn engine
  * wake-word flow

* How should custom mode selection activate the hybrid path

---

## Practical Strategy

For now:

* **develop fast in .NET 10**
* **use Jibo as an embodied endpoint**
* **keep the robot-side integration thin**
* **delay deep on-robot porting until architecture proves itself**

This keeps experimentation fast while preserving a path toward deeper integration later.

---

## Current Working Hypothesis

The best long-term shape is:

```text
stock Jibo embodiment + modern external cognition + thin hybrid bridge
```

That gives us:

* rapid iteration
* local-first experiments
* preserved native robot personality/expression
* reduced dependence on brittle legacy cloud paths
|
||||
- [Cloud overview](/src/Jibo.Cloud/README.md)
|
||||
- [Protocol inventory](/docs/protocol-inventory.md)
|
||||
- [Support tiers](/docs/support-tiers.md)
|
||||
- [Device bootstrap path](/docs/device-bootstrap.md)
|
||||
- [Public site plan](/docs/public-site-plan.md)
|
||||