Stub in framework for new .net Open Jibo cloud

This commit is contained in:
Jacob Dubin
2026-04-11 07:12:57 -05:00
parent 0c040d1348
commit 8f838787a0
54 changed files with 1933 additions and 897 deletions

@@ -1,439 +1,89 @@
# Hybrid Jibo Runtime Plan
# OpenJibo
## Goal
## Summary
Build a **modern local-first Jibo runtime** while preserving the parts of native Jibo that are still useful:
OpenJibo is the working revival track for Jibo.
* native wake/turn plumbing where helpful
* native skills where helpful
* native embodiment and rendering
* fast experimentation in **.NET 10** off-robot
The near-term plan is intentionally concrete:
Jibo's native runtime already exposes a layered service model centered around **Jetstream** for turn/event flow, **GlobalManagerService** for routing, **SkillsService** for skill lifecycle, and **ExpressionService** for embodiment/rendering. The SSM startup is config-driven and mode-driven, which suggests a hybrid mode is a viable path.
1. Build a stable replacement cloud on Azure.
2. Use the existing Node prototype as the protocol oracle and capture harness.
3. Port the hosted implementation to .NET as a modular monolith.
4. Bring real robots online first through RCM plus controlled DNS/TLS patching.
5. Use OTA later to reduce setup friction once the hosted cloud is proven.
---
This keeps the project grounded in what is already working while moving toward a maintainable hosted platform.
## Architecture Direction
## Current Truth
We will keep the **main experimental runtime in .NET 10** and treat Jibo as an embodied endpoint with a thin bridge layer.
The repo now has three distinct lanes:
That means:
- `src/Jibo.Cloud/node`
The discovery server. This is the best source of observed protocol behavior today.
- `src/Jibo.Cloud/dotnet`
The long-term hosted implementation. This is where the stable cloud is being built.
- `src/Jibo.Runtime.Abstractions`
The normalized runtime seam between robot/cloud traffic and modern conversation logic.
* **off-robot**: conversation logic, planning, AI routing, capabilities
* **on-robot**: thin adapter/bridge to native Jibo services
* **native Jibo**: reuse rendering, skill hosting, and useful event seams
---
## High-Level ASCII Flowchart
The key architectural idea is:
```text
+--------------------------------------------------------------+
| NATIVE JIBO LAYER |
|--------------------------------------------------------------|
| Wake / Turn Events |
| - Jetstream |
| - hjHeard / turn started / turn result |
| |
| Native Services |
| - GlobalManagerService |
| - SkillsService |
| - ExpressionService |
| - TTS / Body / Visual / Motion services |
+------------------------------+-------------------------------+
|
| events / hooks / commands
v
+--------------------------------------------------------------+
| JIBO BRIDGE LAYER |
|--------------------------------------------------------------|
| Thin adapter between Jibo and modern runtime |
| |
| Responsibilities: |
| - receive turn/wake events |
| - receive skill context / native state |
| - forward normalized events to .NET runtime |
| - accept ResponsePlans / commands from .NET runtime |
| - invoke native skills / expression / TTS / visuals |
+------------------------------+-------------------------------+
|
| normalized turn context
v
+--------------------------------------------------------------+
| MODERN .NET 10 RUNTIME |
|--------------------------------------------------------------|
| Conversation Broker |
| - session state |
| - follow-up windows |
| - topic/context tracking |
| |
| STT Strategy Selector |
| - native transcript |
| - local STT |
| - cloud STT |
| |
| Brain Strategy Selector |
| - skill/rules path |
| - local AI |
| - cloud AI |
| - hybrid routing |
| |
| Action / Orchestration Planner |
| - gestures / visuals / ESML / delegation |
| - capability/tool calls |
| - build final ResponsePlan |
| |
| Capability Registry |
| - weather / time / reminders / tools |
| - native skill delegation |
| - robot expression helpers |
+------------------------------+-------------------------------+
|
| ResponsePlan / commands
v
+--------------------------------------------------------------+
| EXECUTION TARGETS |
|--------------------------------------------------------------|
| - Native SkillsService |
| - Native ExpressionService |
| - Native TTS / visuals / motion |
| - Local AI backends |
| - Cloud AI backends |
| - External APIs / tools |
+--------------------------------------------------------------+
Jibo device -> OpenJibo cloud -> normalized runtime contracts -> capabilities and planning
```
---
## First Supported Device Path
## Runtime Flow
The first supported recovery path is enthusiast-friendly, not zero-touch:
```text
[Wake Word / Turn / Follow-up]
|
v
[Jibo Native Events]
|
v
[Jibo Bridge Layer]
|
v
[Conversation Broker (.NET)]
|
v
[STT Strategy Selection]
|
v
[Brain Strategy Selection]
/ | \
/ | \
[Skill/Rules] [Local AI] [Cloud AI]
\ | /
\ | /
[Planner]
|
v
[ResponsePlan Built]
|
v
[Jibo Bridge Layer]
|
v
[Skills / Expression / TTS / Motion / Visuals]
|
v
[Follow-up Window or Timeout]
QR Wi-Fi -> controlled router/DNS -> redirect legacy Jibo hosts ->
RCM/device patch for TLS and host acceptance -> OpenJibo cloud on Azure
```
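The DNS redirect step in the device path can be pictured as generating override records that point legacy hostnames at the replacement cloud. The following is a minimal TypeScript sketch under stated assumptions: the hostnames use the reserved `.example` domain, the address comes from a documentation-only range, and `buildDnsOverrides` and the hosts-file style output are hypothetical. The repo's own bootstrap scripts may produce a different format.

```typescript
// Hypothetical sketch: emit hosts-file style overrides that point legacy
// hostnames at a replacement cloud address. Every name and value here is a
// placeholder, not an observed Jibo host.
function buildDnsOverrides(legacyHosts: string[], targetIp: string): string[] {
  const lines: string[] = [];
  for (const raw of legacyHosts) {
    const host = raw.trim().toLowerCase();
    if (host.length === 0) continue; // skip blank entries
    const line = `${targetIp} ${host}`;
    // Skip duplicates so the output stays stable across runs.
    if (lines.indexOf(line) === -1) lines.push(line);
  }
  return lines;
}

const overrides = buildDnsOverrides(
  ["api.legacy-jibo.example", "API.legacy-jibo.example", " ws.legacy-jibo.example "],
  "203.0.113.10",
);
```

Normalizing case and whitespace before deduplicating keeps the override list predictable when host names are collected from several captures.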
---
That path is documented in [docs/device-bootstrap.md](/docs/device-bootstrap.md).
## Planned Hybrid Mode
Jibo's startup and service composition are mode-driven and config-driven, so the long-term plan is to add a **new custom mode** rather than replacing stock behavior outright.
### Candidate mode names
* `hybrid`
* `openjibo`
* `revival`
* `local-first`
### Intent of the mode
The custom mode should:
* preserve normal mode for stock behavior
* preserve developer mode for native debugging
* enable the bridge/runtime path for hybrid experiments
* allow selective routing between old and new Jibo behavior
---
## Design Principles
### 1. Keep Jibo-specific code at the edges
The .NET runtime should know about:
* turns
* sessions
* plans
* capabilities
* render actions
It should **not** depend directly on:
* Electron internals
* SSM implementation quirks
* old Linux deployment constraints
### 2. Reuse native embodiment
Native Jibo rendering is valuable. ExpressionService appears to own animation, attention, DOF arbitration, and embodied output, so it should be reused as long as possible.
### 3. Replace cognition before replacing embodiment
The first thing to modernize is:
* routing
* planning
* AI selection
* follow-up conversation behavior
Not necessarily:
* body motion
* TTS
* expression plumbing
### 4. Favor thin robot-side code
The bridge on Jibo should stay small and stable. Fast-moving logic belongs in .NET 10.
### 5. Everything should converge to a ResponsePlan
Regardless of source:
* skill
* rules engine
* local AI
* cloud AI
the result should become a single normalized response/output plan.
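One way to picture this convergence is a single function that folds every source's result into the same plan shape. This is a TypeScript sketch for brevity, not the .NET contract: the `ResponsePlan`, `RenderAction`, and `BrainResult` shapes and the `toResponsePlan` helper are all hypothetical, and the real models may carry different fields.

```typescript
// Hypothetical shapes; the actual ResponsePlan contract may differ.
type RenderAction =
  | { kind: "speak"; text: string }
  | { kind: "gesture"; gestureName: string }
  | { kind: "launchSkill"; skillId: string };

interface ResponsePlan {
  source: "skill" | "rules" | "localAi" | "cloudAi";
  actions: RenderAction[];
  keepListening: boolean;
}

// Each source produces its own result shape...
type BrainResult =
  | { source: "skill"; skillId: string }
  | { source: "rules"; reply: string; gestureName?: string }
  | { source: "localAi" | "cloudAi"; completion: string };

// ...and one function folds all of them into a single ResponsePlan.
function toResponsePlan(result: BrainResult): ResponsePlan {
  switch (result.source) {
    case "skill":
      return {
        source: "skill",
        actions: [{ kind: "launchSkill", skillId: result.skillId }],
        keepListening: false,
      };
    case "rules": {
      const actions: RenderAction[] = [{ kind: "speak", text: result.reply }];
      if (result.gestureName !== undefined) {
        actions.push({ kind: "gesture", gestureName: result.gestureName });
      }
      return { source: "rules", actions, keepListening: true };
    }
    case "localAi":
    case "cloudAi":
      return {
        source: result.source,
        actions: [{ kind: "speak", text: result.completion.trim() }],
        keepListening: true,
      };
  }
}
```

Because everything downstream only sees `ResponsePlan`, adding a new source would mean adding one case here rather than touching the execution side.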
---
## Native Jibo Mapping
Based on current reverse engineering, the native service boundaries map roughly like this: Jetstream is the turn/event seam, GlobalManagerService performs routing and skill-launch logic, SkillsService manages skill lifecycle, and ExpressionService handles embodiment/rendering.
## Repo Map
```text
Our Concept Native Jibo Equivalent
---------------------------- --------------------------------
Wake / Turn Source Jetstream
Conversation Broker split across Jetstream + routing
Brain Selection GlobalManagerService + skills
Skill Execution SkillsService
Renderer / Embodiment ExpressionService
OpenJibo/
docs/
device-bootstrap.md
protocol-inventory.md
public-site-plan.md
support-tiers.md
scripts/bootstrap/
Discover-JiboHosts.ps1
Generate-JiboDnsOverrides.ps1
Test-OpenJiboRouting.ps1
src/
Jibo.Cloud/
node/
dotnet/
Jibo.Runtime.Abstractions/
Playground/
OpenJibo.Site/
```
---
## Decisions Locked In
## Proposed Project Layout
- The first milestone is `core revive`, not full protocol parity.
- Azure SQL is the relational system of record for the hosted cloud.
- Billing and donations are future-compatible concerns, not phase-one delivery requirements.
- OTA is a phase-two simplification strategy, not the initial dependency.
```text
/src
/Jibo.Runtime
Core runtime orchestration
- ConversationBroker
- Session state
- Turn pipeline
- ResponsePlan builder
## Near-Term Work
/Jibo.Runtime.Abstractions
Interfaces and models
- ITurnSource
- ISttStrategy
- IBrainStrategy
- IResponsePlanner
- IRobotAdapter
- TurnContext
- ResponsePlan
- port required endpoint and WebSocket behavior from Node to .NET
- keep protocol captures and replay fixtures current
- harden device bootstrap documentation and scripts
- stand up the initial `openjibo.com` information site
/Jibo.Bridge
Jibo adapter / compatibility layer
- robot event ingestion
- command dispatch back to Jibo
- native hook integration
## Important Docs
/Jibo.Brain.Rules
deterministic routing / skills / decision tree
/Jibo.Brain.Local
local AI experiments
/Jibo.Brain.Cloud
cloud AI experiments
/Jibo.Capabilities
tools and callable capabilities
- weather
- time
- reminders
- skill delegation
- expression helpers
/Jibo.Simulator
fake robot target for testing ResponsePlans
/docs
architecture
notes
traces
```
---
## Initial Build Plan
### Phase 1 — Contracts and runtime skeleton
Build the core models and interfaces first:
* `TurnContext`
* `ConversationSession`
* `SttResult`
* `BrainDecision`
* `ResponsePlan`
* `RenderAction`
* `FollowupPolicy`
### Phase 2 — Minimal broker
Implement:
* session open/close
* follow-up timeout
* topic/context tracking
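A minimal broker along these lines could look like the TypeScript sketch below. It is an illustration under assumptions, not the planned .NET implementation: the class shape, the `handleTurn` signature, and the choice to pass the clock in as a parameter are all hypothetical.

```typescript
// Hypothetical minimal broker. Time is passed in explicitly so the
// open/close and timeout logic can be exercised without real timers.
interface ConversationSession {
  id: number;
  topic: string | null;
  followUpDeadlineMs: number;
}

class ConversationBroker {
  private session: ConversationSession | null = null;
  private nextId = 1;
  private readonly followUpWindowMs: number;

  constructor(followUpWindowMs: number) {
    this.followUpWindowMs = followUpWindowMs;
  }

  // Returns the session that owns this turn, or null when the turn arrives
  // without a wake word and no follow-up window is open.
  handleTurn(nowMs: number, wakeWord: boolean, topic: string | null): ConversationSession | null {
    let current = this.session;
    const windowOpen = current !== null && nowMs <= current.followUpDeadlineMs;
    if (current === null || !windowOpen) {
      if (!wakeWord) {
        this.session = null; // close the stale session
        return null;
      }
      current = { id: this.nextId++, topic: null, followUpDeadlineMs: nowMs };
    }
    if (topic !== null) current.topic = topic; // track the latest topic
    current.followUpDeadlineMs = nowMs + this.followUpWindowMs; // extend the window
    this.session = current;
    return current;
  }
}
```

Each accepted turn extends the window, so a conversation stays open as long as the user keeps responding within the timeout.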
### Phase 3 — Bridge skeleton
Create the adapter boundary for:
* inbound Jibo events
* outbound robot commands
Even if the first version is mocked, keep the interface stable.
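As a rough illustration of that boundary, the TypeScript sketch below pairs an adapter interface with a mock that records outbound commands. All names (`RobotAdapter`, `MockRobotAdapter`, `TurnEvent`, `RobotCommand`, `attachEcho`) are hypothetical and do not reflect any observed Jibo protocol.

```typescript
// Hypothetical adapter boundary; method and field names are illustrative only.
interface TurnEvent {
  transcript: string;
  wakeWord: boolean;
}

type RobotCommand =
  | { kind: "speak"; text: string }
  | { kind: "keepListening"; windowMs: number };

interface RobotAdapter {
  onTurn(handler: (turn: TurnEvent) => void): void; // inbound Jibo events
  send(command: RobotCommand): void; // outbound robot commands
}

// Mock used until a real robot-side bridge exists. It records outbound
// commands and lets tests inject inbound turns.
class MockRobotAdapter implements RobotAdapter {
  readonly sent: RobotCommand[] = [];
  private handlers: Array<(turn: TurnEvent) => void> = [];

  onTurn(handler: (turn: TurnEvent) => void): void {
    this.handlers.push(handler);
  }

  send(command: RobotCommand): void {
    this.sent.push(command);
  }

  simulateTurn(turn: TurnEvent): void {
    for (const handler of this.handlers) handler(turn);
  }
}

// Runtime code depends only on the RobotAdapter interface, so swapping the
// mock for a real bridge should not require changes here.
function attachEcho(adapter: RobotAdapter): void {
  adapter.onTurn((turn) => {
    adapter.send({ kind: "speak", text: `You said: ${turn.transcript}` });
  });
}
```

Keeping the mock behind the same interface is what lets the first version be replaced later without disturbing the runtime code that consumes it.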
### Phase 4 — First working path
Implement a narrow vertical slice:
* input turn
* decision/rules path
* weather example
* TTS response
* follow-up window
### Phase 5 — Native integration expansion
Add native delegation for:
* skills
* expression
* visuals
* gestures
* local turn/open follow-up behavior
### Phase 6 — Hybrid AI routing
Add:
* local AI path
* cloud AI path
* confidence/routing policy
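A confidence-based routing policy could be as simple as trying strategies in order and escalating only when confidence is too low. The TypeScript sketch below is one possible shape, not a decided design: `BrainDecision` here is a hypothetical model, and `routeTurn`, the strategy signature, and the threshold semantics are assumptions.

```typescript
// Hypothetical routing policy: try the cheapest path first and escalate
// only when confidence falls below a threshold.
interface BrainDecision {
  route: "rules" | "localAi" | "cloudAi";
  reply: string;
  confidence: number; // 0..1, as reported by the strategy itself
}

type BrainStrategy = (transcript: string) => BrainDecision | null;

function routeTurn(
  transcript: string,
  strategies: BrainStrategy[],
  minConfidence: number,
): BrainDecision | null {
  let best: BrainDecision | null = null;
  for (const strategy of strategies) {
    const decision = strategy(transcript);
    if (decision === null) continue; // this strategy declined the turn
    // Accept the first decision that clears the threshold.
    if (decision.confidence >= minConfidence) return decision;
    // Otherwise remember the strongest fallback seen so far.
    if (best === null || decision.confidence > best.confidence) best = decision;
  }
  return best;
}
```

Ordering the strategy list as rules, then local AI, then cloud AI would keep the policy local-first: the cloud path is reached only when the earlier paths decline or report low confidence.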
---
## First Vertical Slice
Recommended first demonstration:
### Example
User says:
> Hey Jibo, what's the weather?
System flow:
1. Jibo event arrives through bridge
2. .NET broker opens a session
3. transcript enters routing
4. weather capability is called
5. planner builds a `ResponsePlan`
6. bridge sends speech + visual action back to Jibo
7. follow-up window stays open
Then:
> What about the low tonight?
The same session stays active without a wake word if the follow-up window is still open.
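The follow-up only makes sense because the session remembers the topic. The TypeScript sketch below illustrates that carryover on a rules path; it is a toy under stated assumptions. `handleUtterance`, `SessionState`, and `getForecast` are hypothetical names, the keyword matching is deliberately naive, and the forecast values are hard-coded stand-ins for a real weather capability.

```typescript
// Hypothetical rules path for the weather slice. The forecast values are
// hard-coded stand-ins, not real data.
interface Forecast {
  summary: string;
  lowTonightF: number;
}

interface SessionState {
  topic: string | null;
}

function getForecast(): Forecast {
  return { summary: "sunny with a high of 72", lowTonightF: 55 };
}

// Returns the spoken reply, or null when the rules path cannot handle the turn.
function handleUtterance(session: SessionState, utterance: string): string | null {
  const text = utterance.toLowerCase();
  if (text.indexOf("weather") >= 0) {
    session.topic = "weather"; // remember the topic for follow-ups
    return `Today looks ${getForecast().summary}.`;
  }
  // "What about the low tonight?" has no weather keyword, so it resolves
  // only because the session still carries the weather topic.
  if (session.topic === "weather" && text.indexOf("low") >= 0) {
    return `Tonight's low is ${getForecast().lowTonightF} degrees.`;
  }
  return null;
}
```

In a fuller version the returned string would become the speech action of a `ResponsePlan`, and a null result would hand the turn to the next strategy.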
---
## Near-Term Questions to Answer
* What is the cleanest robot-side bridge seam:
* Jetstream hook
* skill hook
* local service calls
* mixed approach
* What is the smallest command set needed to drive Jibo usefully:
* speak
* gesture
* visual
* launch skill
* keep listening
* Which pieces should remain native the longest:
* expression
* skill hosting
* turn engine
* wake-word flow
* How should custom mode selection activate the hybrid path?
---
## Practical Strategy
For now:
* **develop fast in .NET 10**
* **use Jibo as an embodied endpoint**
* **keep the robot-side integration thin**
* **delay deep on-robot porting until architecture proves itself**
This keeps experimentation fast while preserving a path toward deeper integration later.
---
## Current Working Hypothesis
The best long-term shape is:
```text
stock Jibo embodiment + modern external cognition + thin hybrid bridge
```
That gives us:
* rapid iteration
* local-first experiments
* preserved native robot personality/expression
* reduced dependence on brittle legacy cloud paths
- [Cloud overview](/src/Jibo.Cloud/README.md)
- [Protocol inventory](/docs/protocol-inventory.md)
- [Support tiers](/docs/support-tiers.md)
- [Device bootstrap path](/docs/device-bootstrap.md)
- [Public site plan](/docs/public-site-plan.md)