JiboExperiments/OpenJibo
2026-03-23 07:51:32 -05:00

Hybrid Jibo Runtime Plan

Goal

Build a modern local-first Jibo runtime that keeps the parts of native Jibo that are still useful and supports fast off-robot iteration:

  • native wake/turn plumbing where helpful
  • native skills where helpful
  • native embodiment and rendering
  • fast experimentation in .NET 10 off-robot

Jibo's native runtime already exposes a layered service model centered around Jetstream for turn/event flow, GlobalManagerService for routing, SkillsService for skill lifecycle, and ExpressionService for embodiment/rendering. The SSM startup is config-driven and mode-driven, which suggests a hybrid mode is a viable path.


Architecture Direction

We will keep the main experimental runtime in .NET 10 and treat Jibo as an embodied endpoint with a thin bridge layer.

That means:

  • off-robot: conversation logic, planning, AI routing, capabilities
  • on-robot: thin adapter/bridge to native Jibo services
  • native Jibo: reuse rendering, skill hosting, and useful event seams

High-Level ASCII Flowchart

+--------------------------------------------------------------+
|                     NATIVE JIBO LAYER                        |
|--------------------------------------------------------------|
| Wake / Turn Events                                           |
|  - Jetstream                                                 |
|  - hjHeard / turn started / turn result                      |
|                                                              |
| Native Services                                              |
|  - GlobalManagerService                                      |
|  - SkillsService                                             |
|  - ExpressionService                                         |
|  - TTS / Body / Visual / Motion services                     |
+------------------------------+-------------------------------+
                               |
                               | events / hooks / commands
                               v
+--------------------------------------------------------------+
|                     JIBO BRIDGE LAYER                        |
|--------------------------------------------------------------|
| Thin adapter between Jibo and modern runtime                 |
|                                                              |
| Responsibilities:                                            |
|  - receive turn/wake events                                  |
|  - receive skill context / native state                      |
|  - forward normalized events to .NET runtime                 |
|  - accept ResponsePlans / commands from .NET runtime         |
|  - invoke native skills / expression / TTS / visuals         |
+------------------------------+-------------------------------+
                               |
                               | normalized turn context
                               v
+--------------------------------------------------------------+
|                  MODERN .NET 10 RUNTIME                      |
|--------------------------------------------------------------|
|  Conversation Broker                                         |
|    - session state                                           |
|    - follow-up windows                                       |
|    - topic/context tracking                                  |
|                                                              |
|  STT Strategy Selector                                       |
|    - native transcript                                       |
|    - local STT                                               |
|    - cloud STT                                               |
|                                                              |
|  Brain Strategy Selector                                     |
|    - skill/rules path                                        |
|    - local AI                                                |
|    - cloud AI                                                |
|    - hybrid routing                                          |
|                                                              |
|  Action / Orchestration Planner                              |
|    - gestures / visuals / ESML / delegation                  |
|    - capability/tool calls                                   |
|    - build final ResponsePlan                                |
|                                                              |
|  Capability Registry                                         |
|    - weather / time / reminders / tools                      |
|    - native skill delegation                                 |
|    - robot expression helpers                                |
+------------------------------+-------------------------------+
                               |
                               | ResponsePlan / commands
                               v
+--------------------------------------------------------------+
|                    EXECUTION TARGETS                         |
|--------------------------------------------------------------|
|  - Native SkillsService                                      |
|  - Native ExpressionService                                  |
|  - Native TTS / visuals / motion                             |
|  - Local AI backends                                         |
|  - Cloud AI backends                                         |
|  - External APIs / tools                                     |
+--------------------------------------------------------------+

Runtime Flow

[Wake Word / Turn / Follow-up]
              |
              v
      [Jibo Native Events]
              |
              v
        [Jibo Bridge Layer]
              |
              v
     [Conversation Broker (.NET)]
              |
              v
     [STT Strategy Selection]
              |
              v
    [Brain Strategy Selection]
      /          |           \
     /           |            \
[Skill/Rules] [Local AI] [Cloud AI]
      \           |            /
       \          |           /
              [Planner]
                 |
                 v
         [ResponsePlan Built]
                 |
                 v
          [Jibo Bridge Layer]
                 |
                 v
 [Skills / Expression / TTS / Motion / Visuals]
                 |
                 v
      [Follow-up Window or Timeout]
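The STT Strategy Selection step in the flow above could be sketched as an ordered fallback chain. This is a minimal sketch, not the project's implementation: the trimmed `TurnContext` and `SttResult` shapes, the `StubSttStrategy` helper, the native-then-local-then-cloud ordering, and the 0.6 confidence floor are all assumptions made for illustration.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, trimmed-down models for this sketch only.
public sealed record TurnContext(string? NativeTranscript, byte[]? Audio);
public sealed record SttResult(string Text, double Confidence, string Source);

public interface ISttStrategy
{
    // Returns null when this strategy cannot produce a transcript.
    Task<SttResult?> TranscribeAsync(TurnContext turn, CancellationToken ct = default);
}

// Uses the transcript the robot already produced, if there is one.
public sealed class NativeTranscriptStrategy : ISttStrategy
{
    public Task<SttResult?> TranscribeAsync(TurnContext turn, CancellationToken ct = default)
    {
        SttResult? result = string.IsNullOrWhiteSpace(turn.NativeTranscript)
            ? null
            : new SttResult(turn.NativeTranscript.Trim(), 1.0, "native");
        return Task.FromResult(result);
    }
}

// Stand-in for a local or cloud engine; returns a canned result.
public sealed class StubSttStrategy : ISttStrategy
{
    private readonly SttResult? _result;
    public StubSttStrategy(SttResult? result) => _result = result;

    public Task<SttResult?> TranscribeAsync(TurnContext turn, CancellationToken ct = default) =>
        Task.FromResult(_result);
}

public sealed class SttStrategySelector
{
    private readonly IReadOnlyList<ISttStrategy> _ordered;
    private readonly double _minConfidence;

    public SttStrategySelector(IReadOnlyList<ISttStrategy> ordered, double minConfidence = 0.6)
    {
        _ordered = ordered;
        _minConfidence = minConfidence;
    }

    // Tries each strategy in order and returns the first confident result.
    public async Task<SttResult?> SelectAsync(TurnContext turn, CancellationToken ct = default)
    {
        foreach (var strategy in _ordered)
        {
            var result = await strategy.TranscribeAsync(turn, ct);
            if (result is not null && result.Confidence >= _minConfidence)
                return result;
        }
        return null;
    }
}
```

Ordering the list is the whole policy here, so swapping in a different preference (for example cloud before local) would not require touching the selector.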

Planned Hybrid Mode

Jibo's startup and service composition are mode-driven and config-driven, so the long-term plan is to add a new custom mode rather than replacing stock behavior outright.

Candidate mode names

  • hybrid
  • openjibo
  • revival
  • local-first

Intent of the mode

The custom mode should:

  • preserve normal mode for stock behavior
  • preserve developer mode for native debugging
  • enable the bridge/runtime path for hybrid experiments
  • allow selective routing between old and new Jibo behavior
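On the .NET side, the intent above might reduce to a small mode switch like the following. This is a sketch under assumptions: the `JiboMode` enum, the `ModeSelector` helper, the string values, and the per-intent allow-list are hypothetical, and `hybrid` is only one of the candidate names. How the native SSM configuration actually expresses modes is not shown here.

```csharp
#nullable enable
using System.Collections.Generic;

// Normal and Developer stand for the stock modes; Hybrid is the proposed custom mode.
public enum JiboMode { Normal, Developer, Hybrid }

public static class ModeSelector
{
    // Unknown or missing values fall back to Normal so stock behavior is preserved.
    public static JiboMode Parse(string? configured) =>
        configured?.Trim().ToLowerInvariant() switch
        {
            "developer" => JiboMode.Developer,
            "hybrid" => JiboMode.Hybrid,
            _ => JiboMode.Normal,
        };

    // The bridge/runtime path is only active in the custom mode.
    public static bool BridgeEnabled(JiboMode mode) => mode == JiboMode.Hybrid;

    // Selective routing: even in hybrid mode, only allow-listed intents
    // go to the new runtime; everything else stays on native Jibo.
    public static bool UseRuntime(JiboMode mode, string intent, ISet<string> runtimeIntents) =>
        BridgeEnabled(mode) && runtimeIntents.Contains(intent);
}
```

Defaulting to `Normal` on unrecognized input is a deliberate choice in this sketch: a typo in configuration should leave the robot in stock behavior rather than in an experimental path.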

Design Principles

1. Keep Jibo-specific code at the edges

The .NET runtime should know about:

  • turns
  • sessions
  • plans
  • capabilities
  • render actions

It should not depend directly on:

  • Electron internals
  • SSM implementation quirks
  • old Linux deployment constraints

2. Reuse native embodiment

Native Jibo rendering is valuable. ExpressionService appears to own animation, attention, DOF arbitration, and embodied output, so it should be reused as long as possible.

3. Replace cognition before replacing embodiment

The first thing to modernize is:

  • routing
  • planning
  • AI selection
  • follow-up conversation behavior

Not necessarily:

  • body motion
  • TTS
  • expression plumbing

4. Favor thin robot-side code

The bridge on Jibo should stay small and stable. Fast-moving logic belongs in .NET 10.

5. Everything should converge to a ResponsePlan

Regardless of source:

  • skill
  • rules engine
  • local AI
  • cloud AI

the result should become a single normalized response/output plan.
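One way to read this principle: the plan type carries no source-specific fields, so the bridge never needs to know which path produced it. The sketch below assumes hypothetical shapes for `BrainDecision`, `RenderAction`, and `ResponsePlan` (with the follow-up policy collapsed to a single bool) and a hypothetical `"speak"` action kind.

```csharp
#nullable enable
using System.Collections.Generic;

public enum BrainSource { Skill, Rules, LocalAi, CloudAi }

// Hypothetical output of any brain strategy.
public sealed record BrainDecision(BrainSource Source, string ReplyText, double Confidence);

// Hypothetical render primitive: a kind ("speak", "visual", ...) plus a payload.
public sealed record RenderAction(string Kind, string Payload);

// The normalized plan deliberately has no Source field.
public sealed record ResponsePlan(IReadOnlyList<RenderAction> Actions, bool KeepListening);

public static class ResponsePlanner
{
    // Every decision, whatever produced it, becomes the same plan shape.
    public static ResponsePlan Build(BrainDecision decision) =>
        new(new[] { new RenderAction("speak", decision.ReplyText) }, KeepListening: true);
}
```

Because records compare by value, two decisions that differ only in `Source` yield plans with identical actions, which is the property the bridge relies on.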


Native Jibo Mapping

Based on current reverse engineering, the native service boundaries map roughly like this: Jetstream is the turn/event seam, GlobalManagerService performs routing and skill-launch logic, SkillsService manages skill lifecycle, and ExpressionService handles embodiment/rendering.

Our Concept                    Native Jibo Equivalent
----------------------------  --------------------------------
Wake / Turn Source            Jetstream
Conversation Broker           split across Jetstream + routing
Brain Selection               GlobalManagerService + skills
Skill Execution               SkillsService
Renderer / Embodiment         ExpressionService

Proposed Project Layout

/src
  /Jibo.Runtime
    Core runtime orchestration
    - ConversationBroker
    - Session state
    - Turn pipeline
    - ResponsePlan builder

  /Jibo.Runtime.Abstractions
    Interfaces and models
    - ITurnSource
    - ISttStrategy
    - IBrainStrategy
    - IResponsePlanner
    - IRobotAdapter
    - TurnContext
    - ResponsePlan

  /Jibo.Bridge
    Jibo adapter / compatibility layer
    - robot event ingestion
    - command dispatch back to Jibo
    - native hook integration

  /Jibo.Brain.Rules
    deterministic routing / skills / decision tree

  /Jibo.Brain.Local
    local AI experiments

  /Jibo.Brain.Cloud
    cloud AI experiments

  /Jibo.Capabilities
    tools and callable capabilities
    - weather
    - time
    - reminders
    - skill delegation
    - expression helpers

  /Jibo.Simulator
    fake robot target for testing ResponsePlans

/docs
  architecture
  notes
  traces

Initial Build Plan

Phase 1 — Contracts and runtime skeleton

Build the core models and interfaces first:

  • TurnContext
  • ConversationSession
  • SttResult
  • BrainDecision
  • ResponsePlan
  • RenderAction
  • FollowupPolicy
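A possible first cut of these contracts, together with the interfaces named in the project layout, is sketched below. The type and interface names come from this plan; every property, parameter, and method signature is an assumption to be revised once real turn data is available.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public enum BrainSource { Skill, Rules, LocalAi, CloudAi }

// Models (all shapes hypothetical).
public sealed record ConversationSession(Guid Id, string? Topic, DateTimeOffset FollowupDeadline);
public sealed record TurnContext(ConversationSession Session, string? NativeTranscript, byte[]? Audio, bool WakeWordHeard);
public sealed record SttResult(string Text, double Confidence, string Source);
public sealed record BrainDecision(BrainSource Source, string ReplyText, double Confidence);
public sealed record RenderAction(string Kind, string Payload);
public sealed record FollowupPolicy(bool KeepListening, TimeSpan Window);
public sealed record ResponsePlan(IReadOnlyList<RenderAction> Actions, FollowupPolicy Followup);

// Interfaces (signatures hypothetical).
public interface ITurnSource
{
    // Raised when the bridge forwards a normalized turn.
    event Action<TurnContext>? TurnReceived;
}

public interface ISttStrategy
{
    Task<SttResult?> TranscribeAsync(TurnContext turn, CancellationToken ct = default);
}

public interface IBrainStrategy
{
    Task<BrainDecision?> DecideAsync(TurnContext turn, SttResult transcript, CancellationToken ct = default);
}

public interface IResponsePlanner
{
    ResponsePlan Build(TurnContext turn, BrainDecision decision);
}

public interface IRobotAdapter
{
    Task ExecuteAsync(ResponsePlan plan, CancellationToken ct = default);
}
```

Immutable records keep the models cheap to copy and compare, for example `session with { Topic = "weather" }` produces an updated session without mutating the original.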

Phase 2 — Minimal broker

Implement:

  • session open/close
  • follow-up timeout
  • topic/context tracking
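The three broker responsibilities above fit in a small class. The following is a minimal sketch, assuming a hypothetical `Accept` method, a hypothetical `ConversationSession` shape, and an injected clock so the timeout can be tested without waiting; it is single-threaded and ignores concurrent turns.

```csharp
#nullable enable
using System;

public sealed record ConversationSession(Guid Id, string? Topic, DateTimeOffset FollowupDeadline);

public sealed class ConversationBroker
{
    private readonly TimeSpan _followupWindow;
    private readonly Func<DateTimeOffset> _now;
    private ConversationSession? _current;

    public ConversationBroker(TimeSpan followupWindow, Func<DateTimeOffset>? now = null)
    {
        _followupWindow = followupWindow;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    // Returns the session this turn belongs to, or null if the turn should be
    // ignored (no wake word and no open follow-up window).
    public ConversationSession? Accept(bool wakeWordHeard, string? topic = null)
    {
        var now = _now();

        // Follow-up timeout: an expired session is closed before anything else.
        if (_current is not null && now > _current.FollowupDeadline)
            _current = null;

        if (_current is null && !wakeWordHeard)
            return null;

        // Open a new session or extend the existing one, keeping the last topic
        // unless the caller supplies a new one.
        var id = _current?.Id ?? Guid.NewGuid();
        var keptTopic = topic ?? _current?.Topic;
        _current = new ConversationSession(id, keptTopic, now + _followupWindow);
        return _current;
    }

    public void Close() => _current = null;
}
```

Each accepted turn pushes the deadline forward, so a conversation stays open as long as the user keeps replying within the window.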

Phase 3 — Bridge skeleton

Create the adapter boundary for:

  • inbound Jibo events
  • outbound robot commands

Even if the first version is mocked, keep the interface stable.
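A mocked first version of that boundary could look like the sketch below. The `ITurnSource` and `IRobotAdapter` names come from the project layout; their members, the trimmed model shapes, and `MockJiboBridge` itself are assumptions. Nothing here talks to a real robot.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Trimmed, hypothetical models for this sketch.
public sealed record TurnContext(string? NativeTranscript, bool WakeWordHeard);
public sealed record RenderAction(string Kind, string Payload);
public sealed record ResponsePlan(IReadOnlyList<RenderAction> Actions, bool KeepListening);

// Inbound side: Jibo events normalized into turns.
public interface ITurnSource
{
    event Action<TurnContext>? TurnReceived;
}

// Outbound side: plans turned into robot commands.
public interface IRobotAdapter
{
    Task ExecuteAsync(ResponsePlan plan, CancellationToken ct = default);
}

// In-memory bridge: tests push turns in and inspect what would have been sent out.
public sealed class MockJiboBridge : ITurnSource, IRobotAdapter
{
    private readonly List<RenderAction> _executed = new();

    public event Action<TurnContext>? TurnReceived;

    public IReadOnlyList<RenderAction> Executed => _executed;

    public void SimulateTurn(TurnContext turn) => TurnReceived?.Invoke(turn);

    public Task ExecuteAsync(ResponsePlan plan, CancellationToken ct = default)
    {
        _executed.AddRange(plan.Actions);
        return Task.CompletedTask;
    }
}
```

Keeping the two directions as separate interfaces means the runtime can be tested against the mock while the real on-robot adapter is still undecided.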

Phase 4 — First working path

Implement a narrow vertical slice:

  • input turn
  • decision/rules path
  • weather example
  • TTS response
  • follow-up window

Phase 5 — Native integration expansion

Add native delegation for:

  • skills
  • expression
  • visuals
  • gestures
  • local turn/open follow-up behavior

Phase 6 — Hybrid AI routing

Add:

  • local AI path
  • cloud AI path
  • confidence/routing policy
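A confidence-based policy for these paths might look like the following sketch. The ordering (rules first, then local AI, then cloud AI), the 0.7 threshold, the `HybridRoutingPolicy` name, and the delegate-based strategy shape are all assumptions chosen to match the local-first goal, not a settled design.

```csharp
#nullable enable
using System;

public enum BrainSource { Skill, Rules, LocalAi, CloudAi }
public sealed record BrainDecision(BrainSource Source, string ReplyText, double Confidence);

public sealed class HybridRoutingPolicy
{
    private readonly Func<string, BrainDecision?> _rules;
    private readonly Func<string, BrainDecision?> _localAi;
    private readonly Func<string, BrainDecision?> _cloudAi;
    private readonly double _minLocalConfidence;

    public HybridRoutingPolicy(
        Func<string, BrainDecision?> rules,
        Func<string, BrainDecision?> localAi,
        Func<string, BrainDecision?> cloudAi,
        double minLocalConfidence = 0.7)
    {
        _rules = rules;
        _localAi = localAi;
        _cloudAi = cloudAi;
        _minLocalConfidence = minLocalConfidence;
    }

    public BrainDecision? Route(string transcript, bool cloudAllowed)
    {
        // 1. Deterministic rules win whenever they match.
        var byRule = _rules(transcript);
        if (byRule is not null) return byRule;

        // 2. Local AI is accepted only when it is confident enough.
        var local = _localAi(transcript);
        if (local is not null && local.Confidence >= _minLocalConfidence) return local;

        // 3. Cloud AI is an optional escalation.
        if (cloudAllowed)
        {
            var cloud = _cloudAi(transcript);
            if (cloud is not null) return cloud;
        }

        // 4. Otherwise fall back to the low-confidence local answer (may be null).
        return local;
    }
}
```

The `cloudAllowed` flag keeps the cloud path strictly opt-in per turn, so the runtime still produces an answer when offline.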

First Vertical Slice

Recommended first demonstration:

Example

User says:

Hey Jibo, what's the weather?

System flow:

  1. Jibo event arrives through bridge
  2. .NET broker opens a session
  3. transcript enters routing
  4. weather capability is called
  5. planner builds a ResponsePlan
  6. bridge sends speech + visual action back to Jibo
  7. follow-up window stays open

Then:

What about the low tonight?

The same session stays active without a wake word if the follow-up window is still open.
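The slice above can be prototyped end to end in a single class before the real broker, planner, and bridge exist. This is a rough sketch under assumptions: `WeatherSlice`, `Forecast`, the keyword matching, the reply wording, and the `"weather-card"` visual payload are all hypothetical, and the weather capability is an injected delegate rather than a real API call.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public sealed record Forecast(int HighF, int LowF, string Summary);
public sealed record RenderAction(string Kind, string Payload);
public sealed record ResponsePlan(IReadOnlyList<RenderAction> Actions, bool KeepListening);

public sealed class WeatherSlice
{
    private readonly Func<Forecast> _weather;      // hypothetical weather capability
    private readonly Func<DateTimeOffset> _now;    // injected clock for testability
    private readonly TimeSpan _window;
    private DateTimeOffset? _followupDeadline;
    private string? _topic;

    public WeatherSlice(Func<Forecast> weather, TimeSpan window, Func<DateTimeOffset> now)
    {
        _weather = weather;
        _window = window;
        _now = now;
    }

    // Returns null when the turn is ignored or no rule matches.
    public ResponsePlan? HandleTurn(string transcript, bool wakeWordHeard)
    {
        var now = _now();
        bool sessionOpen = _followupDeadline.HasValue && now <= _followupDeadline.Value;
        if (!sessionOpen)
        {
            if (!wakeWordHeard) return null;   // stray speech outside a session
            _topic = null;                     // a fresh session starts with no topic
        }

        // Rules path: naive keyword routing with topic carry-over.
        var text = transcript.ToLowerInvariant();
        string? reply = null;
        if (text.Contains("weather"))
        {
            _topic = "weather";
            var f = _weather();
            reply = $"Today is {f.Summary} with a high of {f.HighF}.";
        }
        else if (_topic == "weather" && text.Contains("low"))
        {
            reply = $"Tonight's low is {_weather().LowF}.";
        }
        if (reply is null) return null;

        // Planner: speech plus a visual, and keep the follow-up window open.
        _followupDeadline = now + _window;
        return new ResponsePlan(
            new[] { new RenderAction("speak", reply), new RenderAction("visual", "weather-card") },
            KeepListening: true);
    }
}
```

The follow-up question only resolves because the topic from the first turn is still held by the open session, which is the behavior the demonstration is meant to prove.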


Near-Term Questions to Answer

  • What is the cleanest robot-side bridge seam?
    • Jetstream hook
    • skill hook
    • local service calls
    • mixed approach
  • What is the smallest command set needed to drive Jibo usefully?
    • speak
    • gesture
    • visual
    • launch skill
    • keep listening
  • Which pieces should remain native the longest?
    • expression
    • skill hosting
    • turn engine
    • wake-word flow
  • How should custom mode selection activate the hybrid path?


Practical Strategy

For now:

  • develop fast in .NET 10
  • use Jibo as an embodied endpoint
  • keep the robot-side integration thin
  • delay deep on-robot porting until the architecture proves itself

This keeps experimentation fast while preserving a path toward deeper integration later.


Current Working Hypothesis

The best long-term shape is:

stock Jibo embodiment + modern external cognition + thin hybrid bridge

That gives us:

  • rapid iteration
  • local-first experiments
  • preserved native robot personality/expression
  • reduced dependence on brittle legacy cloud paths