Jibo-Revival-Group/JiboExperiments

Kevin bf81fadd62

More original server design and communications documentation

2026-05-23 01:20:55 +03:00

Skill Framework Design Document

Overview

The Skill Framework provides the foundation for building cloud-based skills for the Jibo robot. It consists of a base class for all skills, a graph-based state machine for complex conversational flows, and a system for generating JCP (Jibo Command Protocol) actions that are sent to the robot.

Location

packages/baseskill/src/

Core Components

BaseSkill (`BaseSkill.ts`)

Abstract base class that all cloud skills must extend.

Purpose: Provides common HTTP handling and error handling for all skills.

Key Features:

Extends BaseHttpHandler from @jibo/utils
Registers POST handler at / endpoint
Validates request structure
Tracks timing for each request
Provides error response builder

Constructor:

constructor(public name: string)

Abstract Method:

protected abstract handle(request: PegasusRequest<SkillRequest>): Promise<SkillResponse>;

Lifecycle Methods:

init(): Promise<void> - Override to initialize resources (load files, connect to services)
buildErrorResponse(err: Error): ErrorResponse - Builds standardized error response

HTTP Handler:

Accepts POST requests at /
Logs request type
Calls handle() method
Adds timing information
Catches errors and returns error response
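The handler steps above can be sketched as follows. This is a minimal, self-contained illustration rather than the actual BaseSkill source: `SketchSkill`, `onPost`, the `timings` field, and the simplified request and response shapes are all assumptions made for the example.

```typescript
// Simplified stand-ins for the real request/response types (assumptions).
interface SketchRequest { type: string; data?: unknown }
interface SketchResponse {
  type: string;
  data: Record<string, unknown>;
  timings?: { handleMs: number };
}

abstract class SketchSkill {
  constructor(public name: string) {}

  // Subclasses implement the skill logic here.
  protected abstract handle(request: SketchRequest): Promise<SketchResponse>;

  // Builds an error response shaped like the ERROR response in this document.
  // Using the skill name as the skill ID is an assumption.
  buildErrorResponse(err: Error): SketchResponse {
    return { type: "ERROR", data: { message: err.message, skill: { id: this.name } } };
  }

  // Hypothetical POST handler: validate, log, call handle(), add timing, catch errors.
  async onPost(request: SketchRequest): Promise<SketchResponse> {
    const started = Date.now();
    try {
      if (!request || typeof request.type !== "string") {
        throw new Error("Malformed request: missing type");
      }
      console.log(`[${this.name}] request type: ${request.type}`);
      const response = await this.handle(request);
      response.timings = { handleMs: Date.now() - started };
      return response;
    } catch (err) {
      return this.buildErrorResponse(err instanceof Error ? err : new Error(String(err)));
    }
  }
}

// Example subclass that fails on one request type to exercise the error path.
class EchoSkill extends SketchSkill {
  protected async handle(request: SketchRequest): Promise<SketchResponse> {
    if (request.type === "BOOM") throw new Error("handler failed");
    return { type: "SKILL_ACTION", data: { action: null, final: true } };
  }
}
```

Keeping the try/catch in the base class means a subclass only has to throw; it never builds error responses itself.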

GraphSkill (`GraphSkill.ts`)

Extends BaseSkill with a graph-based state machine for complex conversational flows.

Purpose: Enables skills to define their logic as a series of interconnected nodes (states) with transitions.

Key Features:

Implements GraphFactory interface
Manages graph execution via GraphManager singleton
Supports skill redirects
Tracks analytics events
Supports supplemental behaviors (parallel/sequence)
Handles both launch and update requests

Constructor:

constructor(name: string)

Abstract Method:

abstract createGraph(): Graph<ExitTransition>

Request Handling:

Launch Requests (LISTEN_LAUNCH or PROACTIVE_LAUNCH):

Validates request data (accountID, robotID, skill ID)
Initializes skill session data
Tracks SKILL_ENTRY analytics event
Calls GraphManager.instance.start(graph, data) to begin graph execution
Returns SKILL_ACTION or SKILL_REDIRECT response

Update Requests (LISTEN_UPDATE):

Validates request data
Calls GraphManager.instance.exitNode(data) to process action results
Returns next SKILL_ACTION or final response
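The launch/update split described above could be sketched like this. The type names, `missingFields`, and `routeRequest` are hypothetical, and the rule that an update must carry a session is an assumption rather than documented behavior.

```typescript
type MessageTypeSketch = "LISTEN_LAUNCH" | "LISTEN_UPDATE" | "PROACTIVE_LAUNCH";

// Trimmed-down request shape; the full structure has more fields.
interface RequestSketch {
  type: MessageTypeSketch;
  data: {
    general: { accountID: string; robotID: string };
    skill: { id: string; session?: { id: string; nodeID: number } };
    result?: unknown;
  };
}

// Returns the names of missing fields; an empty list means the request is usable.
function missingFields(req: RequestSketch): string[] {
  const missing: string[] = [];
  if (!req.data.general.accountID) missing.push("general.accountID");
  if (!req.data.general.robotID) missing.push("general.robotID");
  if (!req.data.skill.id) missing.push("skill.id");
  // Assumption: an update can only resume a session created by a launch.
  if (req.type === "LISTEN_UPDATE" && !req.data.skill.session) missing.push("skill.session");
  return missing;
}

// Decides which GraphManager entry point a valid request maps to.
function routeRequest(req: RequestSketch): "start" | "exitNode" {
  const missing = missingFields(req);
  if (missing.length > 0) throw new Error(`Invalid request, missing: ${missing.join(", ")}`);
  return req.type === "LISTEN_UPDATE" ? "exitNode" : "start";
}

const launchReq: RequestSketch = {
  type: "PROACTIVE_LAUNCH",
  data: { general: { accountID: "acct-1", robotID: "robot-1" }, skill: { id: "chitchat" } },
};
console.log(routeRequest(launchReq)); // start
```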

Response Types:

SKILL_REDIRECT - Redirects to another skill

{
  type: "SKILL_REDIRECT",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    skillID: string,
    nlu?: NLUResult,
    asr?: ASRResult,
    memo?: any
  }
}

SKILL_ACTION - Returns JCP action for robot to execute

{
  type: "SKILL_ACTION",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    action: JCPAction,
    analytics: AnalyticsData,
    final: boolean,
    fireAndForget: boolean
  }
}

Final Response - No action, transaction complete

{
  type: "SKILL_ACTION",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    action: null,
    analytics: AnalyticsData,
    final: true,
    fireAndForget: true
  }
}

Convenience Methods:

track(data, event, properties) - Track analytics event
overrideSpeaker(data, id) - Override current speaker in context
addParallelBehavior(data, behavior) - Add behavior to execute in parallel
addSequenceBehavior(data, behavior) - Add behavior to execute in sequence

Supplemental Behaviors Injection:

When a skill returns a JCP action, the framework injects any supplemental behaviors that were added during execution:

If sequence behaviors exist, wraps main action in a Sequence
If parallel behaviors exist, wraps result in a Parallel
Final JCP action is sent to robot

Example:

// Skill adds parallel behavior
this.addParallelBehavior(data, SetPresentPersonBehavior);

// Skill returns main action
return { action: SayTextBehavior };

// Framework injects: Parallel([SetPresentPersonBehavior, SayTextBehavior])
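The wrapping order (sequence first, then parallel around the result) can be sketched as below. The `Behavior` union and the `injectSupplemental` and `show` helpers are illustrative stand-ins; the real JCP behavior types are assumed to be richer.

```typescript
// Simplified behavior shapes (assumption, not the real JCP types).
type Behavior =
  | { kind: "leaf"; name: string }
  | { kind: "Sequence"; children: Behavior[] }
  | { kind: "Parallel"; children: Behavior[] };

interface Supplemental { parallel: Behavior[]; sequence: Behavior[] }

// Wraps the main behavior: sequence extras first, then parallel extras around the result.
function injectSupplemental(main: Behavior, extra: Supplemental): Behavior {
  let result: Behavior = main;
  if (extra.sequence.length > 0) {
    result = { kind: "Sequence", children: [...extra.sequence, result] };
  }
  if (extra.parallel.length > 0) {
    result = { kind: "Parallel", children: [...extra.parallel, result] };
  }
  return result;
}

// Renders a behavior tree in the notation used in this document.
function show(b: Behavior): string {
  return b.kind === "leaf" ? b.name : `${b.kind}([${b.children.map(show).join(", ")}])`;
}

const leaf = (name: string): Behavior => ({ kind: "leaf", name });
const combined = injectSupplemental(leaf("SayTextBehavior"), {
  sequence: [leaf("LookAtBehavior")],
  parallel: [leaf("SetPresentPersonBehavior")],
});
console.log(show(combined));
// Parallel([SetPresentPersonBehavior, Sequence([LookAtBehavior, SayTextBehavior])])
```

With no supplemental behaviors registered, the main action passes through unwrapped.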

Graph System

Graph (`graph/Graph.ts`)

Represents a directed graph of connected nodes (states).

Purpose: Defines the structure of a skill's conversation flow.

Key Properties:

name: string - Graph name
initial: Node - Starting node
nodes: Set<Node> - All nodes in graph
exitTransitions: Map<ExitTransition, TransitionContainer[]> - Exit transition mappings

Constructor:

constructor(name: string, exitTransitionNames: ExitTransition[])

Methods:

setInitialNode(node) - Sets the starting node
addNode(node, transitionMapping) - Adds a node and connects its transitions
addSubGraph(subGraph, transitionMapping) - Adds a subgraph and connects its exits
finalize() - Validates graph and locks it for execution
writeDotFile(filePath) - Generates GraphViz dot file for visualization

Transition Mapping:

[
  [TransitionName, DestinationNode],  // Transition to another node
  [TransitionName, ExitTransition]   // Exit from graph
]

Validation (in finalize):

All nodes must be reachable from initial node
All exit transitions must be connected
All transitions must have valid destinations
No duplicate transition names
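A rough sketch of the first three checks follows, assuming a much simpler node model than the real `Graph` and `Node` classes. `SketchNode`, `validateGraph`, and the `exit:` prefix convention are invented for the example. Duplicate transition names cannot occur here because each node keeps its transitions in a `Map`.

```typescript
// Each node maps transition names to a destination node name, or to an exit
// marker written as "exit:<name>" (an illustrative convention).
interface SketchNode { name: string; transitions: Map<string, string> }

// Returns a list of problems; an empty list means the graph passes these checks.
function validateGraph(initial: string, nodes: SketchNode[], exits: string[]): string[] {
  const byName = new Map<string, SketchNode>();
  nodes.forEach((n) => byName.set(n.name, n));
  const problems: string[] = [];
  const usedExits = new Set<string>();
  const seen = new Set<string>([initial]);
  const queue: string[] = [initial];

  // Breadth-first walk from the initial node.
  while (queue.length > 0) {
    const currentName = queue.shift();
    const current = currentName === undefined ? undefined : byName.get(currentName);
    if (!current) continue;
    const nodeName = current.name;
    current.transitions.forEach((dest, transition) => {
      if (dest.indexOf("exit:") === 0) {
        usedExits.add(dest.slice("exit:".length));
      } else if (!byName.has(dest)) {
        problems.push(`${nodeName}.${transition} has no valid destination`);
      } else if (!seen.has(dest)) {
        seen.add(dest);
        queue.push(dest);
      }
    });
  }
  nodes.forEach((n) => {
    if (!seen.has(n.name)) problems.push(`${n.name} is unreachable`);
  });
  exits.forEach((e) => {
    if (!usedExits.has(e)) problems.push(`exit ${e} is not connected`);
  });
  return problems;
}

const graphProblems = validateGraph(
  "Start",
  [
    { name: "Start", transitions: new Map<string, string>([["Done", "End"]]) },
    { name: "End", transitions: new Map<string, string>([["Done", "exit:Done"]]) },
    { name: "Orphan", transitions: new Map<string, string>() },
  ],
  ["Done", "Retry"],
);
console.log(graphProblems);
```

In this example the checks report that `Orphan` is unreachable and that the `Retry` exit is never connected.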

Subgraphs:

Graphs can be nested within other graphs
Subgraph exit transitions connect to parent graph nodes
Enables hierarchical organization of complex flows
Nodes can belong to multiple graphs (for subgraph sharing)

GraphViz Visualization:

Generates .dot files for graph visualization
Color-codes initial node, regular nodes, and exit states
Shows hierarchical structure with clusters
Labels transitions with their names
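To make the output format concrete, here is a minimal sketch of dot generation. It is not the `writeDotFile` implementation: `toDot` and `DotEdge` are hypothetical, the colors and shapes are arbitrary choices, and subgraph clusters are left out.

```typescript
interface DotEdge { from: string; to: string; label: string }

// Produces GraphViz dot text for a flat graph (no clusters).
function toDot(name: string, initial: string, exits: string[], edges: DotEdge[]): string {
  const lines: string[] = [`digraph "${name}" {`];
  // Highlight the initial node.
  lines.push(`  "${initial}" [style=filled, fillcolor=lightgreen];`);
  // Draw exit states differently from regular nodes.
  exits.forEach((e) => {
    lines.push(`  "${e}" [shape=doublecircle, style=filled, fillcolor=lightgray];`);
  });
  // One labeled edge per transition.
  edges.forEach((e) => {
    lines.push(`  "${e.from}" -> "${e.to}" [label="${e.label}"];`);
  });
  lines.push("}");
  return lines.join("\n");
}

const dot = toDot("My Skill", "Start", ["Done"], [
  { from: "Start", to: "End", label: "Done" },
  { from: "End", to: "Done", label: "Done" },
]);
console.log(dot);
```

The resulting text can be rendered with the standard GraphViz command line, for example `dot -Tpng graph.dot -o graph.png`.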

GraphManager (`graph/GraphManager.ts`)

Singleton that manages graph execution and skill sessions.

Purpose: Coordinates node execution and maintains session state.

Singleton Pattern:

GraphManager.instance  // Access singleton

Key Responsibilities:

Assigns unique IDs to all nodes
Maps node IDs to node instances
Manages skill session lifecycle
Executes node enter/exit lifecycle
Handles transitions between nodes

Session Structure:

{
  id: string,           // Session UUID
  nodeID: number,      // Current node ID
  data: any,           // Skill-specific session data
  trace: [             // History of transitions
    { nodeID: number, transition: string }
  ]
}

Execution Flow:

Start Graph (launch request):

start(graph, data)
  → Creates new session
  → Sets initial node
  → Calls enterNode()

Enter Node:

enterNode(data)
  → Fetches current node
  → Calls node.enter(data)
  → Updates trace
  → If action returned: return action
  → Else: call exitNode()

Exit Node:

exitNode(data)
  → Fetches current node
  → Calls node.exit(data)
  → If transition returned: executeTransition()
  → Else: return (terminal)

Execute Transition:

executeTransition(node, result, data)
  → Validates transition exists
  → Updates trace with transition name
  → If terminal: return null
  → Else: update nodeID, call enterNode()
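The enter/exit loop above can be condensed into a small runnable sketch. It makes several simplifications that are assumptions, not the real `GraphManager` behavior: everything is synchronous, actions are plain strings, the trace is only written when a transition is taken, and `FlowManager` and `FlowNode` are invented names.

```typescript
interface Session { id: string; nodeID: number; trace: { nodeID: number; transition: string }[] }

interface FlowNode {
  enter(): string | null;                 // action for the robot, or null for none
  exit(result: unknown): string | null;   // transition name, or null if terminal
  next: Map<string, number | null>;       // transition -> next node ID (null = graph exit)
}

class FlowManager {
  constructor(private nodes: Map<number, FlowNode>) {}

  // Launch: create a session at the initial node and enter it.
  start(initialID: number): { session: Session; action: string | null } {
    const session: Session = { id: "session-1", nodeID: initialID, trace: [] };
    return { session, action: this.enterNode(session) };
  }

  enterNode(session: Session): string | null {
    const node = this.nodes.get(session.nodeID);
    if (!node) throw new Error(`Unknown node ${session.nodeID}`);
    const action = node.enter();
    // An action pauses the graph until the robot reports a result.
    return action !== null ? action : this.exitNode(session, undefined);
  }

  // Update: feed the action result to the current node and follow its transition.
  exitNode(session: Session, result: unknown): string | null {
    const node = this.nodes.get(session.nodeID);
    if (!node) throw new Error(`Unknown node ${session.nodeID}`);
    const transition = node.exit(result);
    if (transition === null) return null; // terminal node
    if (!node.next.has(transition)) throw new Error(`Invalid transition ${transition}`);
    session.trace.push({ nodeID: session.nodeID, transition });
    const nextID = node.next.get(transition);
    if (nextID === null || nextID === undefined) return null; // leaves the graph
    session.nodeID = nextID;
    return this.enterNode(session);
  }
}

const flow = new FlowManager(new Map<number, FlowNode>([
  [0, { enter: () => "say-hello", exit: () => "Done", next: new Map<string, number | null>([["Done", 1]]) }],
  [1, { enter: () => null, exit: () => "Done", next: new Map<string, number | null>([["Done", null]]) }],
]));
const launch = flow.start(0);                                    // "say-hello" goes to the robot
const update = flow.exitNode(launch.session, { success: true }); // null: graph finished
console.log(launch.action, update, launch.session.trace.length);
```

Node 1 returns no action, so the manager passes straight through it in the same call instead of waiting for another update.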

Node ID Assignment:

Counter starts at 0, increments for each node
Bidirectional mapping: node ↔ ID
Enables serialization of session state
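A small sketch of why the two-way mapping matters: a session only needs to store a number, and the node instance can be looked up again on the next request. `NodeRegistry` is a hypothetical helper, not part of the framework.

```typescript
// Assigns sequential IDs starting at 0 and keeps lookups in both directions.
class NodeRegistry<T extends object> {
  private nextID = 0;
  private idByNode = new Map<T, number>();
  private nodeByID = new Map<number, T>();

  register(node: T): number {
    const existing = this.idByNode.get(node);
    if (existing !== undefined) return existing; // a shared node keeps a single ID
    const id = this.nextID++;
    this.idByNode.set(node, id);
    this.nodeByID.set(id, node);
    return id;
  }

  lookup(id: number): T | undefined {
    return this.nodeByID.get(id);
  }
}

const registry = new NodeRegistry<{ name: string }>();
const startNode = { name: "Start" };
const endNode = { name: "End" };
registry.register(startNode); // 0
registry.register(endNode);   // 1

// Only the numeric ID travels in the serialized session.
const saved = JSON.stringify({ nodeID: registry.register(endNode) });
const restored = registry.lookup(JSON.parse(saved).nodeID);
console.log(restored === endNode); // true
```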

Node (`graph/nodes/Node.ts`)

Abstract base class for all graph nodes.

Purpose: Defines a state in the skill's conversation flow.

Key Properties:

id: number - Unique ID assigned by GraphManager
name: string - Node name
transitionNames: Transition[] - Valid exit transitions
graphs: Graph[] - Graphs this node belongs to
transitions: Map<Transition, TransitionContainer> - Transition destinations

Constructor:

constructor(name: string, transitionNames: Transition[])

Abstract Methods:

abstract enter(data: Data): Promise<EnterResponse>

Called when node is entered
Returns action to execute, redirect, or nothing

abstract exit(data: Data): Promise<ExitResponse>

Called with action results (if action was issued)
Returns next transition or nothing (terminal)

Data Structure:

Data = {
  // From request
  general: { accountID, robotID, lang, release },
  runtime: { character, location, loop, perception, dialog },
  skill: { id, session },
  result?: any,  // Action results for UPDATE
  
  // Added by framework
  req: PegasusRequest,
  log: Log,
  local: any,           // Skill-local data
  analytics: {},        // Analytics events
  behaviors: {          // Supplemental behaviors
    parallel: [],
    sequence: []
  }
}

Response Types:

EnterResponse:

{
  action?: Action,      // JCP action to execute
  redirect?: RedirectData,  // Redirect to another skill
  final?: boolean       // Is this the final response?
}

ExitResponse:

{
  transition?: string,  // Next transition to take
  result?: any,         // Result to pass to next node
  redirect?: RedirectData
}

Built-in Node Types:

DefaultNode - Simple terminal node
  - Returns no action
  - Transitions to Done
NoOpNode - No operation node
  - Returns no action
  - Can have custom transitions
JCPNode - Returns a JCP action
  - Returns specified JCP behavior
  - Can be terminal or continue
TrueFalseNode - Conditional branching
  - Evaluates condition
  - Transitions based on true/false
SetLooperIDNode - Sets speaker ID
  - Updates perception.speaker in context
  - Useful for multi-turn conversations

Node Traversal:

forEachDescendent(handler) - BFS traversal of all descendant nodes
Used for graph validation and analysis

Skill Request/Response Protocol

Skill Request Types

Location: packages/interfaces/src/skill/request.ts

MessageType:

LISTEN_LAUNCH - Launch skill from listen interaction
LISTEN_UPDATE - Update skill with action results
PROACTIVE_LAUNCH - Launch skill proactively

Request Structure:

{
  type: MessageType,
  msgID: "uuid",
  ts: 1234567890,
  data: {
    general: {
      accountID: string,
      robotID: string,
      lang: string,
      release: string
    },
    runtime: {
      character: { emotion, motivation },
      location: { city, state, country, lat, lng },
      loop: { users, jibo, owner, loopId },
      perception: { speaker, peoplePresent },
      dialog: { referent }
    },
    skill: {
      id: string,
      session?: {
        id: string,
        nodeID: number,
        data: any,
        trace: [{ nodeID, transition }]
      }
    },
    result?: any,  // Action results for UPDATE
    nlu?: NLUResult,
    asr?: ASRResult,
    memo?: any
  }
}

Skill Response Types

Location: packages/interfaces/src/skill/response.ts

ResponseType:

SKILL_ACTION - Returns action to execute
SKILL_REDIRECT - Redirects to another skill
ERROR - Error response

SKILL_ACTION Response:

{
  type: "SKILL_ACTION",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    action: JCPAction | null,
    analytics: AnalyticsData,
    final: boolean,
    fireAndForget: boolean
  }
}

SKILL_REDIRECT Response:

{
  type: "SKILL_REDIRECT",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    skillID: string,
    nlu?: NLUResult,
    asr?: ASRResult,
    memo?: any
  }
}

ERROR Response:

{
  type: "ERROR",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    message: string,
    skill: { id: string }
  }
}

JCP Actions

Location: packages/interfaces/src/skill/action.ts

Purpose: Defines behaviors that the robot should execute.

ActionType:

JCP - Jibo Command Protocol action

JCPAction Structure:

{
  type: "JCP",
  config: {
    version: "1.0.0",
    jcp: SupportedBehaviors
  }
}

SupportedBehaviors:

SLIM - Single behavior execution
Sequence - Sequential behavior execution
Parallel - Parallel behavior execution
SetPresentPerson - Set focused person
ImpactEmotion - Modify Jibo's emotional state

Helper Function:

generateJCPAction(behavior): JCPAction

Wraps a behavior as a JCP action with version "1.0.0", matching the JCPAction structure above.
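A helper of this kind might look like the sketch below. The names `generateJCPActionSketch` and `JCPActionSketch` and the behavior payload are hypothetical; only the envelope shape (`type`, `config.version`, `config.jcp`) comes from the structure documented above.

```typescript
interface JCPActionSketch<B> {
  type: "JCP";
  config: { version: string; jcp: B };
}

// Version string taken from the JCPAction structure shown in this document.
const JCP_VERSION = "1.0.0";

// Wraps any behavior object in the JCP action envelope.
function generateJCPActionSketch<B>(behavior: B): JCPActionSketch<B> {
  return { type: "JCP", config: { version: JCP_VERSION, jcp: behavior } };
}

// The behavior payload here is made up; real behaviors come from SupportedBehaviors.
const sayHello = generateJCPActionSketch({ kind: "SayText", text: "Hello!" });
console.log(JSON.stringify(sayHello));
```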

MIM (Motion Interaction Model) System

Location: packages/baseskill/src/graph/mims/

Purpose: Provides pre-built graph structures for playing MIM animations.

MIM Files:

.mim files contain animation definitions
Organized in directories:
  - scripted-responses - Pre-scripted responses
  - emotion-responses - Emotion-based responses
  - core-responses - Fallback responses

MIM Factories:

ANFactory - Animation Node Factory

Creates graph for playing a single MIM
Supports prompt data injection
Can be final or continue

MANFactory - Multiple Animation Node Factory

Creates graph for playing multiple MIMs
Supports random selection
Can be final or continue

MIMFactory - General MIM Factory

Creates graph for MIM playback
Supports semi-specific responses
Handles category-based selection

QNFactory - Question Node Factory

Creates graph for asking questions
Supports opt-in flows
Handles user responses

OptInFactory - Opt-In Node Factory

Creates graph for opt-in offers
Tracks user acceptance/rejection
Handles analytics

MIM Factory Options:

{
  mimDataProvider: (data) => string[],  // Function to get MIM paths
  promptDataProvider?: (data) => any,   // Function to get prompt data
  final: boolean                        // Is this the final action?
}

Example Usage (Chitchat Skill):

const doMIMOptions: MimFactoryOptions = {
  mimDataProvider: (data) => data.local.path,
  promptDataProvider: (data) => data.local.promptData,
  final: true
};
const doMIM = new ANFactory('Do MIM', doMIMOptions).createGraph();

Semi-Specific Responses:

MIMs with _SS_ suffix are semi-specific
Match specific categories (e.g., time, weather)
CSV files define category members
Enables context-aware responses

SkillService (`SkillService.ts`)

Service wrapper that hosts a skill as an HTTP service.

Purpose: Provides the service infrastructure for running a skill.

Constructor:

constructor(private skillV1: BaseSkill)

HTTP Handler:

Registers skill at /v1/main endpoint
No authentication required (handled by Hub)

Initialization:

async init(port: number)
  → Starts HTTP server
  → Calls skillV1.init()

Analytics

Location: packages/interfaces/src/skill/analytics.ts

Purpose: Track skill events for analysis.

AnalyticsData Structure:

{
  [skillName: string]: [
    {
      event: string,
      properties: any
    }
  ]
}

Built-in Events:

SKILL_ENTRY - Skill launched
SKILL_OFFER - Opt-in offer presented

Skill Entry Analytics:

{
  initial_intent: string,
  domain: string,
  was_hey_jibo_launch: boolean,
  user_initiated: boolean,
  last_skill: string
}

Tracking:

this.track(data, 'CustomEvent', { key: value });

Events are automatically included in SKILL_ACTION responses.
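How events might accumulate into the AnalyticsData shape is sketched below. `trackEvent` and the type names are hypothetical, and the real `track(data, event, properties)` method is assumed to do something similar against `data.analytics`.

```typescript
interface AnalyticsEvent { event: string; properties: Record<string, unknown> }
type AnalyticsSketch = Record<string, AnalyticsEvent[]>;

// Appends an event under the skill's name, creating the list on first use.
function trackEvent(
  analytics: AnalyticsSketch,
  skillName: string,
  event: string,
  properties: Record<string, unknown> = {},
): void {
  const events = analytics[skillName] ?? [];
  events.push({ event, properties });
  analytics[skillName] = events;
}

const analyticsData: AnalyticsSketch = {};
trackEvent(analyticsData, "chitchat", "SKILL_ENTRY", { user_initiated: true });
trackEvent(analyticsData, "chitchat", "CustomEvent", { key: "value" });
console.log(JSON.stringify(analyticsData));
```

Keying by skill name keeps events from different skills separate if a transaction crosses a redirect.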

Server-to-Robot Communication Flow

Skill Response to Hub

When a skill returns a response, the Hub forwards it to the robot:

SKILL_ACTION Response:

Skill returns SKILL_ACTION with JCP behavior
Hub adds timing information
Hub sends SKILL_ACTION to robot via WebSocket
Robot executes JCP behavior
Robot sends CMD_RESULT back to Hub
Hub sends LISTEN_UPDATE to skill
Skill processes result, returns next action

Final SKILL_ACTION:

Skill returns SKILL_ACTION with final: true
Hub sends to robot
Robot executes (if action present)
Transaction complete

SKILL_REDIRECT:

Skill returns SKILL_REDIRECT
Hub emits SKILL_REDIRECT notification to robot
Hub launches new skill
New skill proceeds normally
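The action/result loop can be simulated in a few lines. This sketch models only the SKILL_ACTION path, with the skill and the robot reduced to plain functions; `runTransaction`, `SkillFn`, and `RobotFn` are invented for illustration, and the real Hub communicates over HTTP and WebSocket.

```typescript
type SkillReply = { type: "SKILL_ACTION"; action: string | null; final: boolean };
type SkillFn = (msgType: "LISTEN_LAUNCH" | "LISTEN_UPDATE", result?: string) => SkillReply;
type RobotFn = (action: string) => string; // executes a behavior, returns a CMD_RESULT payload

// Relays actions to the robot and results back to the skill until a final reply.
function runTransaction(skill: SkillFn, robot: RobotFn, maxTurns = 10): string[] {
  const log: string[] = [];
  let reply = skill("LISTEN_LAUNCH");
  for (let turn = 0; turn < maxTurns; turn++) {
    let result: string | undefined;
    if (reply.action !== null) {
      log.push(`robot <- ${reply.action}`);
      result = robot(reply.action);
    }
    if (reply.final) {
      log.push("transaction complete");
      return log;
    }
    log.push(`skill <- LISTEN_UPDATE(${result ?? "no result"})`);
    reply = skill("LISTEN_UPDATE", result);
  }
  throw new Error("Skill did not finish within maxTurns");
}

// A two-step skill: ask something on launch, then finish on the first update.
const twoStep: SkillFn = (msgType) =>
  msgType === "LISTEN_LAUNCH"
    ? { type: "SKILL_ACTION", action: "ask-name", final: false }
    : { type: "SKILL_ACTION", action: "say-goodbye", final: true };

const transcript = runTransaction(twoStep, (action) => `${action}:ok`);
console.log(transcript.join("\n"));
```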

JCP Action Execution

Single Behavior (SLIM):

{
  type: "JCP",
  config: {
    version: "1.0.0",
    jcp: SayTextBehavior
  }
}

Robot executes single behavior immediately.

Sequence Behavior:

{
  type: "JCP",
  config: {
    version: "1.0.0",
    jcp: Sequence([
      LookAtBehavior,
      SayTextBehavior,
      GestureBehavior
    ])
  }
}

Robot executes behaviors in order.

Parallel Behavior:

{
  type: "JCP",
  config: {
    version: "1.0.0",
    jcp: Parallel([
      SetPresentPersonBehavior,
      SayTextBehavior
    ])
  }
}

Robot executes behaviors simultaneously.

Supplemental Behaviors

Skills can add behaviors that execute alongside the main action:

Parallel Supplemental:

this.addParallelBehavior(data, SetPresentPersonBehavior);
// Main action: SayTextBehavior
// Result: Parallel([SetPresentPersonBehavior, SayTextBehavior])

Sequence Supplemental:

this.addSequenceBehavior(data, LookAtBehavior);
// Main action: SayTextBehavior
// Result: Sequence([LookAtBehavior, SayTextBehavior])

Combined:

this.addSequenceBehavior(data, LookAtBehavior);
this.addParallelBehavior(data, SetPresentPersonBehavior);
// Result: Parallel([SetPresentPersonBehavior, Sequence([LookAtBehavior, SayTextBehavior])])

Example Skill Implementation

Chitchat Skill

Location: packages/chitchat-skill/src/Chitchat.ts

Purpose: Handles conversational interactions with the robot.

Graph Structure:

IntentSplitNode - Splits based on intent type
ProcessQueryNode - Processes user query, selects response
DoMIM (ANFactory) - Plays selected MIM animation
Complete (DefaultNode) - Terminates skill

Initialization:

Loads MIM files from directories
Builds semi-specific mappings
Reads category CSV files

Response Selection:

Scripted responses for common queries
Emotion responses for emotional queries
Semi-specific responses for context-aware queries
Fallback responses for unknown queries

MIM Selection:

Based on intent and entities
Considers semi-specific categories
Falls back to core responses

Skill Development Guide

Creating a Simple Skill

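import { BaseSkill } from '@jibo/baseskill';
import { skill } from '@jibo/interfaces';
// PegasusRequest, SkillRequest, SkillResponse, generateJCPAction,
// SayTextBehavior, and getUUID must also be in scope; this document does not
// show which modules export them.

export class MySkill extends BaseSkill {
  constructor() {
    super('my-skill');
  }

  protected async handle(req: PegasusRequest<SkillRequest>): Promise<SkillResponse> {
    // Request payload (general, runtime, skill, and optional nlu/asr/memo)
    const data = req.body.data;

    // Build a single spoken response
    const action = generateJCPAction(SayTextBehavior("Hello!"));

    return {
      type: skill.response.ResponseType.SKILL_ACTION,
      data: {
        action: action,
        analytics: {},  // Listed in the SKILL_ACTION structure; empty when nothing is tracked
        final: true,
        fireAndForget: true
      },
      ts: Date.now(),
      msgID: getUUID()
    };
  }
}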

Creating a Graph Skill

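import { GraphSkill, graph } from '@jibo/baseskill';
// generateTransitions and MyStartNode are assumed to be in scope. MyStartNode
// is a custom node that declares a Done transition.

enum Transition {
  Done = 'Done',
  Retry = 'Retry'
}

export class MyGraphSkill extends GraphSkill<Transition> {
  constructor() {
    super('my-graph-skill');
  }

  createGraph(): graph.Graph<Transition> {
    const g = new graph.Graph('My Skill', generateTransitions(Transition));

    const startNode = new MyStartNode('Start');
    const endNode = new graph.nodes.dn.DefaultNode('End');

    // Mark the entry point so finalize() can check reachability from it
    g.setInitialNode(startNode);

    // Start --Done--> End --Done--> graph exit "Done"
    g.addNode(startNode, [[Transition.Done, endNode]]);
    g.addNode(endNode, [[graph.nodes.dn.Transition.Done, Transition.Done]]);

    // Note: finalize() requires every exit transition to be connected, so the
    // Retry exit must also be wired up (or dropped from the enum) in a real skill.
    g.finalize();
    return g;
  }
}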

Creating a Custom Node

import { Node, Data, EnterResponse, ExitResponse } from '@jibo/baseskill';
// generateJCPAction and SayTextBehavior are assumed to be in scope.

enum MyTransition {
  Success = 'Success',
  Failure = 'Failure'
}

class MyNode extends Node<MyTransition> {
  constructor() {
    super('MyNode', [MyTransition.Success, MyTransition.Failure]);
  }

  async enter(data: Data): Promise<EnterResponse> {
    // Return an action; the graph pauses here until the robot reports a result
    const action = generateJCPAction(SayTextBehavior("Processing..."));
    return { action };
  }

  async exit(data: Data): Promise<ExitResponse> {
    // data.result is optional, so treat a missing result as a failure
    if (data.result?.success) {
      return { transition: MyTransition.Success };
    } else {
      return { transition: MyTransition.Failure };
    }
  }
}

Key Design Principles

State Machine - Graph-based state machine for complex flows
Single Responsibility - Each node handles one piece of logic
Reusability - Subgraphs and node types can be reused
Testability - Nodes can be tested independently
Visualization - GraphViz generation for debugging
Analytics - Built-in event tracking
Flexibility - Supports both simple and complex skills
Supplemental Behaviors - Easy to add parallel/sequence actions
