Skill Framework Design Document
Overview
The Skill Framework provides the foundation for building cloud-based skills for the Jibo robot. It consists of a base class for all skills, a graph-based state machine for complex conversational flows, and a system for generating JCP (Jibo Command Protocol) actions that are sent to the robot.
Location
packages/baseskill/src/
Core Components
BaseSkill (BaseSkill.ts)
Abstract base class that all cloud skills must extend.
Purpose: Provides common HTTP handling and error handling for all skills.
Key Features:
- Extends BaseHttpHandler from @jibo/utils
- Registers POST handler at / endpoint
- Validates request structure
- Tracks timing for each request
- Provides error response builder
Constructor:
constructor(public name: string)
Abstract Method:
protected abstract handle(request: PegasusRequest<SkillRequest>): Promise<SkillResponse>;
Lifecycle Methods:
init(): Promise<void> - Override to initialize resources (load files, connect to services)
buildErrorResponse(err: Error): ErrorResponse - Builds standardized error response
HTTP Handler:
- Accepts POST requests at /
- Logs request type
- Calls handle() method
- Adds timing information
- Catches errors and returns error response
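The handler steps above can be sketched as follows. This is a minimal illustration, not the BaseSkill source: SketchSkill, SketchRequest, SketchResponse, onPost, and the timings field are hypothetical stand-ins, and the real class delegates HTTP plumbing to BaseHttpHandler.

```typescript
// Hypothetical stand-ins for the framework's request and response types.
interface SketchRequest { type: string; msgID: string; data?: unknown }
interface SketchResponse { type: string; data: unknown; timings?: { total: number } }

abstract class SketchSkill {
  constructor(public name: string) {}

  // Subclasses implement the skill logic here, mirroring BaseSkill.handle().
  protected abstract handle(request: SketchRequest): Promise<SketchResponse>;

  // Builds a standardized error payload from a thrown error.
  buildErrorResponse(err: Error): SketchResponse {
    return { type: "ERROR", data: { message: err.message, skill: { id: this.name } } };
  }

  // Mirrors the POST handler steps: validate, log, call handle(), catch, time.
  async onPost(request: SketchRequest): Promise<SketchResponse> {
    const started = Date.now();
    let response: SketchResponse;
    try {
      if (typeof request.type !== "string") {
        throw new Error("Malformed request: missing type");
      }
      console.log(`[${this.name}] request type: ${request.type}`);
      response = await this.handle(request);
    } catch (err) {
      response = this.buildErrorResponse(err instanceof Error ? err : new Error(String(err)));
    }
    // Timing is attached to successful and error responses alike.
    response.timings = { total: Date.now() - started };
    return response;
  }
}

// A trivial skill that only accepts launch requests.
class EchoSkill extends SketchSkill {
  protected async handle(request: SketchRequest): Promise<SketchResponse> {
    if (request.type !== "LISTEN_LAUNCH") {
      throw new Error(`Unsupported type ${request.type}`);
    }
    return { type: "SKILL_ACTION", data: { action: null, final: true } };
  }
}
```

Because the try/catch wraps handle(), a skill that throws still yields a well-formed response rather than an unhandled rejection.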
GraphSkill (GraphSkill.ts)
Extends BaseSkill with a graph-based state machine for complex conversational flows.
Purpose: Enables skills to define their logic as a series of interconnected nodes (states) with transitions.
Key Features:
- Implements GraphFactory interface
- Manages graph execution via GraphManager singleton
- Supports skill redirects
- Tracks analytics events
- Supports supplemental behaviors (parallel/sequence)
- Handles both launch and update requests
Constructor:
constructor(name: string)
Abstract Method:
abstract createGraph(): Graph<ExitTransition>
Request Handling:
Launch Requests (LISTEN_LAUNCH or PROACTIVE_LAUNCH):
- Validates request data (accountID, robotID, skill ID)
- Initializes skill session data
- Tracks SKILL_ENTRY analytics event
- Calls GraphManager.instance.start(graph, data) to begin graph execution
- Returns SKILL_ACTION or SKILL_REDIRECT response
Update Requests (LISTEN_UPDATE):
- Validates request data
- Calls GraphManager.instance.exitNode(data) to process action results
- Returns next SKILL_ACTION or final response
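The validation step shared by launch and update requests might look like the sketch below. RequestDataSketch and findMissingFields are hypothetical names; the document only states that accountID, robotID, and skill ID are checked, so the exact rules and error handling are assumptions.

```typescript
// Hypothetical subset of the request data that the validation step inspects.
interface RequestDataSketch {
  general?: { accountID?: string; robotID?: string };
  skill?: { id?: string };
}

// Returns the dotted paths of required fields that are absent or empty.
function findMissingFields(data: RequestDataSketch): string[] {
  const missing: string[] = [];
  if (!data.general || !data.general.accountID) missing.push("general.accountID");
  if (!data.general || !data.general.robotID) missing.push("general.robotID");
  if (!data.skill || !data.skill.id) missing.push("skill.id");
  return missing;
}
```

A caller could reject the request with an ERROR response whenever the returned list is non-empty.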
Response Types:
- SKILL_REDIRECT - Redirects to another skill
  {
    type: "SKILL_REDIRECT",
    msgID: "uuid",
    ts: 1234567890,
    data: { skillID: string, nlu?: NLUResult, asr?: ASRResult, memo?: any }
  }
- SKILL_ACTION - Returns JCP action for robot to execute
  {
    type: "SKILL_ACTION",
    msgID: "uuid",
    ts: 1234567890,
    data: { action: JCPAction, analytics: AnalyticsData, final: boolean, fireAndForget: boolean }
  }
- Final Response - No action, transaction complete
  {
    type: "SKILL_ACTION",
    msgID: "uuid",
    ts: 1234567890,
    data: { action: null, analytics: AnalyticsData, final: true, fireAndForget: true }
  }
Convenience Methods:
track(data, event, properties) - Track analytics event
overrideSpeaker(data, id) - Override current speaker in context
addParallelBehavior(data, behavior) - Add behavior to execute in parallel
addSequenceBehavior(data, behavior) - Add behavior to execute in sequence
Supplemental Behaviors Injection:
When a skill returns a JCP action, the framework injects any supplemental behaviors that were added during execution:
- If sequence behaviors exist, wraps main action in a Sequence
- If parallel behaviors exist, wraps result in a Parallel
- Final JCP action is sent to robot
Example:
// Skill adds parallel behavior
this.addParallelBehavior(data, SetPresentPersonBehavior);
// Skill returns main action
return { action: SayTextBehavior };
// Framework injects: Parallel([SetPresentPersonBehavior, SayTextBehavior])
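The wrapping order described above (Sequence first, then Parallel) can be sketched as a small pure function. The Behavior union, Supplemental, injectSupplemental, and render are hypothetical names for illustration; the real JCP behavior types are not shown in this document.

```typescript
// Hypothetical behavior shapes standing in for real JCP behaviors.
type Behavior =
  | { kind: "Leaf"; name: string }
  | { kind: "Sequence"; children: Behavior[] }
  | { kind: "Parallel"; children: Behavior[] };

interface Supplemental { parallel: Behavior[]; sequence: Behavior[] }

// Wraps the main action in a Sequence if sequence behaviors exist,
// then wraps that result in a Parallel if parallel behaviors exist.
function injectSupplemental(main: Behavior, extra: Supplemental): Behavior {
  let result = main;
  if (extra.sequence.length > 0) {
    result = { kind: "Sequence", children: [...extra.sequence, result] };
  }
  if (extra.parallel.length > 0) {
    result = { kind: "Parallel", children: [...extra.parallel, result] };
  }
  return result;
}

// Renders a behavior tree in the notation used by the examples in this document.
function render(b: Behavior): string {
  if (b.kind === "Leaf") return b.name;
  return `${b.kind}([${b.children.map(render).join(", ")}])`;
}
```

With one sequence behavior (LookAtBehavior) and one parallel behavior (SetPresentPersonBehavior) around SayTextBehavior, render produces Parallel([SetPresentPersonBehavior, Sequence([LookAtBehavior, SayTextBehavior])]).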
Graph System
Graph (graph/Graph.ts)
Represents a directed graph of connected nodes (states).
Purpose: Defines the structure of a skill's conversation flow.
Key Properties:
name: string - Graph name
initial: Node - Starting node
nodes: Set<Node> - All nodes in graph
exitTransitions: Map<ExitTransition, TransitionContainer[]> - Exit transition mappings
Constructor:
constructor(name: string, exitTransitionNames: ExitTransition[])
Methods:
setInitialNode(node) - Sets the starting node
addNode(node, transitionMapping) - Adds a node and connects its transitions
addSubGraph(subGraph, transitionMapping) - Adds a subgraph and connects its exits
finalize() - Validates graph and locks it for execution
writeDotFile(filePath) - Generates GraphViz dot file for visualization
Transition Mapping:
[
  [TransitionName, DestinationNode], // Transition to another node
  [TransitionName, ExitTransition]   // Exit from graph
]
Validation (in finalize):
- All nodes must be reachable from initial node
- All exit transitions must be connected
- All transitions must have valid destinations
- No duplicate transition names
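A minimal sketch of some of these checks, assuming a simplified node shape, is shown below. SketchNode, Destination, and validateGraph are hypothetical; the sketch covers reachability, missing destinations, and duplicate transition names, and omits the exit-transition check because the document does not describe how exits are tracked.

```typescript
// Hypothetical minimal node: a name, its declared transitions, and their destinations.
type Destination = SketchNode | "EXIT";
interface SketchNode {
  name: string;
  transitionNames: string[];
  transitions: Map<string, Destination>;
}

// Walks the graph breadth-first from the initial node and collects validation errors.
function validateGraph(initial: SketchNode, nodes: SketchNode[]): string[] {
  const errors: string[] = [];
  const reached = new Set<SketchNode>();
  reached.add(initial);
  const queue: SketchNode[] = [initial];

  while (queue.length > 0) {
    const node = queue.shift() as SketchNode;
    if (new Set(node.transitionNames).size !== node.transitionNames.length) {
      errors.push(`${node.name}: duplicate transition names`);
    }
    node.transitionNames.forEach((t) => {
      const dest = node.transitions.get(t);
      if (dest === undefined) {
        errors.push(`${node.name}: transition "${t}" has no destination`);
      } else if (dest !== "EXIT" && !reached.has(dest)) {
        reached.add(dest);
        queue.push(dest);
      }
    });
  }

  // Any node never visited by the walk is unreachable from the initial node.
  nodes.forEach((n) => {
    if (!reached.has(n)) errors.push(`${n.name}: unreachable from initial node`);
  });
  return errors;
}
```

Returning a list of errors rather than throwing on the first one is a design choice for the sketch; it lets a caller report every problem in a single pass.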
Subgraphs:
- Graphs can be nested within other graphs
- Subgraph exit transitions connect to parent graph nodes
- Enables hierarchical organization of complex flows
- Nodes can belong to multiple graphs (for subgraph sharing)
GraphViz Visualization:
- Generates .dot files for graph visualization
- Color-codes initial node, regular nodes, and exit states
- Shows hierarchical structure with clusters
- Labels transitions with their names
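As a rough illustration of the kind of output writeDotFile might produce, the sketch below renders a flat edge list as DOT text. Edge and toDot are hypothetical, the colors are arbitrary, and subgraph clusters are left out for brevity.

```typescript
// Hypothetical edge list; the real Graph derives this from its nodes and transitions.
interface Edge { from: string; to: string; label: string }

// Builds a GraphViz DOT document with a highlighted initial node and exit states.
function toDot(name: string, initial: string, exits: string[], edges: Edge[]): string {
  const lines: string[] = [`digraph "${name}" {`];
  lines.push(`  "${initial}" [style=filled, fillcolor=lightgreen];`);
  exits.forEach((e) => {
    lines.push(`  "${e}" [shape=doublecircle, style=filled, fillcolor=lightgray];`);
  });
  edges.forEach((e) => {
    lines.push(`  "${e.from}" -> "${e.to}" [label="${e.label}"];`);
  });
  lines.push("}");
  return lines.join("\n");
}
```

The resulting string can be written to a .dot file and rendered with the GraphViz dot tool.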
GraphManager (graph/GraphManager.ts)
Singleton that manages graph execution and skill sessions.
Purpose: Coordinates node execution and maintains session state.
Singleton Pattern:
GraphManager.instance // Access singleton
Key Responsibilities:
- Assigns unique IDs to all nodes
- Maps node IDs to node instances
- Manages skill session lifecycle
- Executes node enter/exit lifecycle
- Handles transitions between nodes
Session Structure:
{
  id: string,        // Session UUID
  nodeID: number,    // Current node ID
  data: any,         // Skill-specific session data
  trace: [           // History of transitions
    { nodeID: number, transition: string }
  ]
}
Execution Flow:
Start Graph (launch request):
start(graph, data)
→ Creates new session
→ Sets initial node
→ Calls enterNode()
Enter Node:
enterNode(data)
→ Fetches current node
→ Calls node.enter(data)
→ Updates trace
→ If action returned: return action
→ Else: call exitNode()
Exit Node:
exitNode(data)
→ Fetches current node
→ Calls node.exit(data)
→ If transition returned: executeTransition()
→ Else: return (terminal)
Execute Transition:
executeTransition(node, result, data)
→ Validates transition exists
→ Updates trace with transition name
→ If terminal: return null
→ Else: update nodeID, call enterNode()
Node ID Assignment:
- Counter starts at 0, increments for each node
- Bidirectional mapping: node ↔ ID
- Enables serialization of session state
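The execution flow and ID assignment above can be sketched in a few dozen lines. This is a simplified illustration, not the GraphManager source: SketchManager, SketchNode, Session, and register are hypothetical, actions are plain strings, and the trace is only updated when a transition is taken.

```typescript
// Hypothetical, simplified shapes; real nodes receive the full Data object.
interface Session { nodeID: number; trace: { nodeID: number; transition: string }[] }
interface SketchNode {
  enter(s: Session): Promise<{ action?: string }>;
  exit(s: Session): Promise<{ transition?: string }>;
  transitions: Map<string, SketchNode | null>; // null marks an exit from the graph
}

class SketchManager {
  private nodes: SketchNode[] = [];            // ID -> node (index is the ID)
  private ids = new Map<SketchNode, number>(); // node -> ID

  // Assigns the next ID, starting at 0, and records both directions of the mapping.
  register(node: SketchNode): number {
    const id = this.nodes.length;
    this.nodes.push(node);
    this.ids.set(node, id);
    return id;
  }

  // Launch: create a session positioned at the initial node, then enter it.
  async start(initial: SketchNode): Promise<{ session: Session; action: string | null }> {
    const session: Session = { nodeID: this.idOf(initial), trace: [] };
    const action = await this.enterNode(session);
    return { session, action };
  }

  // Enter: return the node's action if it has one, otherwise exit immediately.
  async enterNode(session: Session): Promise<string | null> {
    const res = await this.nodeAt(session.nodeID).enter(session);
    if (res.action !== undefined) return res.action;
    return this.exitNode(session);
  }

  // Exit: follow the returned transition, or stop if the node is terminal.
  async exitNode(session: Session): Promise<string | null> {
    const node = this.nodeAt(session.nodeID);
    const res = await node.exit(session);
    if (res.transition === undefined) return null;
    const dest = node.transitions.get(res.transition);
    if (dest === undefined) throw new Error(`Unknown transition "${res.transition}"`);
    session.trace.push({ nodeID: session.nodeID, transition: res.transition });
    if (dest === null) return null;
    session.nodeID = this.idOf(dest);
    return this.enterNode(session);
  }

  private nodeAt(id: number): SketchNode {
    const node = this.nodes[id];
    if (node === undefined) throw new Error(`No node with ID ${id}`);
    return node;
  }

  private idOf(node: SketchNode): number {
    const id = this.ids.get(node);
    if (id === undefined) throw new Error("Node was never registered");
    return id;
  }
}
```

Because the session stores only a numeric nodeID and a trace, it can be serialized into a response and restored on the next update request, which is what the ID mapping is meant to allow.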
Node (graph/nodes/Node.ts)
Abstract base class for all graph nodes.
Purpose: Defines a state in the skill's conversation flow.
Key Properties:
id: number - Unique ID assigned by GraphManager
name: string - Node name
transitionNames: Transition[] - Valid exit transitions
graphs: Graph[] - Graphs this node belongs to
transitions: Map<Transition, TransitionContainer> - Transition destinations
Constructor:
constructor(name: string, transitionNames: Transition[])
Abstract Methods:
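abstract enter(data: Data): Promise<EnterResponse>
- Called when node is entered
- Returns action to execute, redirect, or nothing
abstract exit(data: Data): Promise<ExitResponse>
- Called with action results (if action was issued)
- Returns next transition or nothing (terminal)
Note: TypeScript does not allow the async modifier on abstract methods, so the signatures declare a Promise return type; implementations may still be written as async methods.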
Data Structure:
Data = {
  // From request
  general: { accountID, robotID, lang, release },
  runtime: { character, location, loop, perception, dialog },
  skill: { id, session },
  result?: any,            // Action results for UPDATE
  // Added by framework
  req: PegasusRequest,
  log: Log,
  local: any,              // Skill-local data
  analytics: {},           // Analytics events
  behaviors: {             // Supplemental behaviors
    parallel: [],
    sequence: []
  }
}
Response Types:
EnterResponse:
{
  action?: Action,           // JCP action to execute
  redirect?: RedirectData,   // Redirect to another skill
  final?: boolean            // Is this the final response?
}
ExitResponse:
{
  transition?: string,       // Next transition to take
  result?: any,              // Result to pass to next node
  redirect?: RedirectData
}
Built-in Node Types:
- DefaultNode - Simple terminal node
  - Returns no action
  - Transitions to Done
- NoOpNode - No operation node
  - Returns no action
  - Can have custom transitions
- JCPNode - Returns a JCP action
  - Returns specified JCP behavior
  - Can be terminal or continue
- TrueFalseNode - Conditional branching
  - Evaluates condition
  - Transitions based on true/false
- SetLooperIDNode - Sets speaker ID
  - Updates perception.speaker in context
  - Useful for multi-turn conversations
Node Traversal:
forEachDescendent(handler) - BFS traversal of all descendant nodes
- Used for graph validation and analysis
Skill Request/Response Protocol
Skill Request Types
Location: packages/interfaces/src/skill/request.ts
MessageType:
LISTEN_LAUNCH - Launch skill from listen interaction
LISTEN_UPDATE - Update skill with action results
PROACTIVE_LAUNCH - Launch skill proactively
Request Structure:
{
  type: MessageType,
  msgID: "uuid",
  ts: 1234567890,
  data: {
    general: {
      accountID: string,
      robotID: string,
      lang: string,
      release: string
    },
    runtime: {
      character: { emotion, motivation },
      location: { city, state, country, lat, lng },
      loop: { users, jibo, owner, loopId },
      perception: { speaker, peoplePresent },
      dialog: { referent }
    },
    skill: {
      id: string,
      session?: {
        id: string,
        nodeID: number,
        data: any,
        trace: [{ nodeID, transition }]
      }
    },
    result?: any,   // Action results for UPDATE
    nlu?: NLUResult,
    asr?: ASRResult,
    memo?: any
  }
}
Skill Response Types
Location: packages/interfaces/src/skill/response.ts
ResponseType:
SKILL_ACTION - Returns action to execute
SKILL_REDIRECT - Redirects to another skill
ERROR - Error response
SKILL_ACTION Response:
{
  type: "SKILL_ACTION",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    action: JCPAction | null,
    analytics: AnalyticsData,
    final: boolean,
    fireAndForget: boolean
  }
}
SKILL_REDIRECT Response:
{
  type: "SKILL_REDIRECT",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    skillID: string,
    nlu?: NLUResult,
    asr?: ASRResult,
    memo?: any
  }
}
ERROR Response:
{
  type: "ERROR",
  msgID: "uuid",
  ts: 1234567890,
  data: {
    message: string,
    skill: { id: string }
  }
}
JCP Actions
Location: packages/interfaces/src/skill/action.ts
Purpose: Defines behaviors that the robot should execute.
ActionType:
JCP- Jibo Command Protocol action
JCPAction Structure:
{
  type: "JCP",
  config: {
    version: "1.0.0",
    jcp: SupportedBehaviors
  }
}
SupportedBehaviors:
SLIM - Single behavior execution
Sequence - Sequential behavior execution
Parallel - Parallel behavior execution
SetPresentPerson - Set focused person
ImpactEmotion - Modify Jibo's emotional state
Helper Function:
generateJCPAction(behavior): JCPAction
Wraps a behavior as a JCP action with version 1.0.0, matching the JCPAction structure above.
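A minimal sketch of such a helper is shown below, assuming the version string used in the JCPAction structures in this document ("1.0.0"). JCPActionSketch and generateJCPActionSketch are hypothetical names, and the behavior is left generic because the SupportedBehaviors types are not shown here.

```typescript
// Hypothetical generic form of the JCPAction structure.
interface JCPActionSketch<B> {
  type: "JCP";
  config: { version: string; jcp: B };
}

// Wraps any behavior object in the JCP action envelope.
function generateJCPActionSketch<B>(behavior: B): JCPActionSketch<B> {
  return { type: "JCP", config: { version: "1.0.0", jcp: behavior } };
}

// The behavior shape here is invented purely for illustration.
const helloAction = generateJCPActionSketch({ behavior: "SayText", text: "Hello!" });
```

Keeping the envelope in one helper means that a version bump only needs to be made in a single place.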
MIM (Motion Interaction Model) System
Location: packages/baseskill/src/graph/mims/
Purpose: Provides pre-built graph structures for playing MIM animations.
MIM Files:
.mim files contain animation definitions
Organized in directories:
- scripted-responses - Pre-scripted responses
- emotion-responses - Emotion-based responses
- core-responses - Fallback responses
MIM Factories:
ANFactory - Animation Node Factory
- Creates graph for playing a single MIM
- Supports prompt data injection
- Can be final or continue
MANFactory - Multiple Animation Node Factory
- Creates graph for playing multiple MIMs
- Supports random selection
- Can be final or continue
MIMFactory - General MIM Factory
- Creates graph for MIM playback
- Supports semi-specific responses
- Handles category-based selection
QNFactory - Question Node Factory
- Creates graph for asking questions
- Supports opt-in flows
- Handles user responses
OptInFactory - Opt-In Node Factory
- Creates graph for opt-in offers
- Tracks user acceptance/rejection
- Handles analytics
MIM Factory Options:
{
  mimDataProvider: (data) => string[],   // Function to get MIM paths
  promptDataProvider?: (data) => any,    // Function to get prompt data
  final: boolean                         // Is this the final action?
}
Example Usage (Chitchat Skill):
const doMIMOptions: MimFactoryOptions = {
  mimDataProvider: (data) => data.local.path,
  promptDataProvider: (data) => data.local.promptData,
  final: true
};
const doMIM = new ANFactory('Do MIM', doMIMOptions).createGraph();
Semi-Specific Responses:
- MIMs with _SS_ suffix are semi-specific
- Match specific categories (e.g., time, weather)
- CSV files define category members
- Enables context-aware responses
SkillService (SkillService.ts)
Service wrapper that hosts a skill as an HTTP service.
Purpose: Provides the service infrastructure for running a skill.
Constructor:
constructor(private skillV1: BaseSkill)
HTTP Handler:
- Registers skill at /v1/main endpoint
- No authentication required (handled by Hub)
Initialization:
async init(port: number)
→ Starts HTTP server
→ Calls skill.init()
Analytics
Location: packages/interfaces/src/skill/analytics.ts
Purpose: Track skill events for analysis.
AnalyticsData Structure:
{
  [skillName: string]: [
    {
      event: string,
      properties: any
    }
  ]
}
Built-in Events:
SKILL_ENTRY - Skill launched
SKILL_OFFER - Opt-in offer presented
Skill Entry Analytics:
{
  initial_intent: string,
  domain: string,
  was_hey_jibo_launch: boolean,
  user_initiated: boolean,
  last_skill: string
}
Tracking:
this.track(data, 'CustomEvent', { key: value });
Events are automatically included in SKILL_ACTION responses.
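One way the tracking call could accumulate events into the AnalyticsData structure is sketched below. trackSketch and DataSketch are hypothetical, and the skill name is passed explicitly here, whereas the real track() is a method on the skill and presumably uses its own name.

```typescript
// Shapes follow the AnalyticsData structure described in this document.
interface AnalyticsEvent { event: string; properties: Record<string, unknown> }
type AnalyticsData = { [skillName: string]: AnalyticsEvent[] };

// Hypothetical slice of the framework's Data object.
interface DataSketch { analytics: AnalyticsData }

// Appends an event under the skill's name, creating the list on first use.
function trackSketch(
  skillName: string,
  data: DataSketch,
  event: string,
  properties: Record<string, unknown> = {}
): void {
  let events = data.analytics[skillName];
  if (events === undefined) {
    events = [];
    data.analytics[skillName] = events;
  }
  events.push({ event, properties });
}
```

Since events are stored on the per-request data object, whatever builds the SKILL_ACTION response can copy data.analytics into it without extra bookkeeping.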
Server-to-Robot Communication Flow
Skill Response to Hub
When a skill returns a response, the Hub forwards it to the robot:
SKILL_ACTION Response:
- Skill returns SKILL_ACTION with JCP behavior
- Hub adds timing information
- Hub sends SKILL_ACTION to robot via WebSocket
- Robot executes JCP behavior
- Robot sends CMD_RESULT back to Hub
- Hub sends LISTEN_UPDATE to skill
- Skill processes result, returns next action
Final SKILL_ACTION:
- Skill returns SKILL_ACTION with final: true
- Hub sends to robot
- Robot executes (if action present)
- Transaction complete
SKILL_REDIRECT:
- Skill returns SKILL_REDIRECT
- Hub emits SKILL_REDIRECT notification to robot
- Hub launches new skill
- New skill proceeds normally
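The three flows above amount to a decision on the response type and the final flag. The sketch below expresses that decision as a pure function; HubStep, nextHubStep, and the step names are hypothetical and do not reflect the Hub's actual code, and ERROR responses are omitted because this section does not describe how the Hub handles them.

```typescript
// Trimmed response shapes based on the structures in this document.
type SkillResponseSketch =
  | { type: "SKILL_ACTION"; data: { action: object | null; final: boolean } }
  | { type: "SKILL_REDIRECT"; data: { skillID: string } };

// Hypothetical description of what the Hub does next.
type HubStep =
  | { step: "SEND_TO_ROBOT_THEN_AWAIT_CMD_RESULT" } // result returns as LISTEN_UPDATE
  | { step: "SEND_TO_ROBOT_THEN_COMPLETE" }
  | { step: "NOTIFY_ROBOT_THEN_LAUNCH_SKILL"; skillID: string };

function nextHubStep(res: SkillResponseSketch): HubStep {
  switch (res.type) {
    case "SKILL_ACTION":
      // A final action ends the transaction; otherwise the Hub waits for the robot's result.
      return res.data.final
        ? { step: "SEND_TO_ROBOT_THEN_COMPLETE" }
        : { step: "SEND_TO_ROBOT_THEN_AWAIT_CMD_RESULT" };
    case "SKILL_REDIRECT":
      return { step: "NOTIFY_ROBOT_THEN_LAUNCH_SKILL", skillID: res.data.skillID };
  }
}
```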
JCP Action Execution
Single Behavior (SLIM):
{
  type: "JCP",
  config: {
    version: "1.0.0",
    jcp: SayTextBehavior
  }
}
Robot executes single behavior immediately.
Sequence Behavior:
{
  type: "JCP",
  config: {
    version: "1.0.0",
    jcp: Sequence([
      LookAtBehavior,
      SayTextBehavior,
      GestureBehavior
    ])
  }
}
Robot executes behaviors in order.
Parallel Behavior:
{
  type: "JCP",
  config: {
    version: "1.0.0",
    jcp: Parallel([
      SetPresentPersonBehavior,
      SayTextBehavior
    ])
  }
}
Robot executes behaviors simultaneously.
Supplemental Behaviors
Skills can add behaviors that execute alongside the main action:
Parallel Supplemental:
this.addParallelBehavior(data, SetPresentPersonBehavior);
// Main action: SayTextBehavior
// Result: Parallel([SetPresentPersonBehavior, SayTextBehavior])
Sequence Supplemental:
this.addSequenceBehavior(data, LookAtBehavior);
// Main action: SayTextBehavior
// Result: Sequence([LookAtBehavior, SayTextBehavior])
Combined:
this.addSequenceBehavior(data, LookAtBehavior);
this.addParallelBehavior(data, SetPresentPersonBehavior);
// Result: Parallel([SetPresentPersonBehavior, Sequence([LookAtBehavior, SayTextBehavior])])
Example Skill Implementation
Chitchat Skill
Location: packages/chitchat-skill/src/Chitchat.ts
Purpose: Handles conversational interactions with the robot.
Graph Structure:
- IntentSplitNode - Splits based on intent type
- ProcessQueryNode - Processes user query, selects response
- DoMIM (ANFactory) - Plays selected MIM animation
- Complete (DefaultNode) - Terminates skill
Initialization:
- Loads MIM files from directories
- Builds semi-specific mappings
- Reads category CSV files
Response Selection:
- Scripted responses for common queries
- Emotion responses for emotional queries
- Semi-specific responses for context-aware queries
- Fallback responses for unknown queries
MIM Selection:
- Based on intent and entities
- Considers semi-specific categories
- Falls back to core responses
Skill Development Guide
Creating a Simple Skill
import { BaseSkill } from '@jibo/baseskill';
import { skill } from '@jibo/interfaces';
// PegasusRequest, SkillRequest, SkillResponse, generateJCPAction, SayTextBehavior,
// and getUUID must also be imported; their module paths are not given in this document.

export class MySkill extends BaseSkill {
  constructor() {
    super('my-skill');
  }

  protected async handle(req: PegasusRequest<SkillRequest>): Promise<SkillResponse> {
    const data = req.body.data;
    // Process request
    const action = generateJCPAction(SayTextBehavior("Hello!"));
    return {
      type: skill.response.ResponseType.SKILL_ACTION,
      data: {
        action: action,
        final: true,
        fireAndForget: true
      },
      ts: Date.now(),
      msgID: getUUID()
    };
  }
}
Creating a Graph Skill
import { GraphSkill, graph } from '@jibo/baseskill';
// generateTransitions and MyStartNode (a custom Node subclass) are assumed to be
// imported or defined elsewhere; this document does not show them.

enum Transition {
  Done = 'Done',
  Retry = 'Retry'
}

export class MyGraphSkill extends GraphSkill<Transition> {
  constructor() {
    super('my-graph-skill');
  }

  createGraph(): graph.Graph<Transition> {
    const g = new graph.Graph('My Skill', generateTransitions(Transition));
    const startNode = new MyStartNode('Start');
    const endNode = new graph.nodes.dn.DefaultNode('End');
    g.addNode(startNode, [[Transition.Done, endNode]]);
    g.addNode(endNode, [[graph.nodes.dn.Transition.Done, Transition.Done]]);
    g.finalize();
    return g;
  }
}
Creating a Custom Node
import { Node, Data, EnterResponse, ExitResponse } from '@jibo/baseskill';
// generateJCPAction and SayTextBehavior must also be imported; their module
// paths are not given in this document.

enum MyTransition {
  Success = 'Success',
  Failure = 'Failure'
}

class MyNode extends Node<MyTransition> {
  constructor() {
    super('MyNode', [MyTransition.Success, MyTransition.Failure]);
  }

  async enter(data: Data): Promise<EnterResponse> {
    // Perform logic
    const action = generateJCPAction(SayTextBehavior("Processing..."));
    return { action };
  }

  async exit(data: Data): Promise<ExitResponse> {
    // Process action results
    if (data.result.success) {
      return { transition: MyTransition.Success };
    } else {
      return { transition: MyTransition.Failure };
    }
  }
}
Key Design Principles
- State Machine - Graph-based state machine for complex flows
- Single Responsibility - Each node handles one piece of logic
- Reusability - Subgraphs and node types can be reused
- Testability - Nodes can be tested independently
- Visualization - GraphViz generation for debugging
- Analytics - Built-in event tracking
- Flexibility - Supports both simple and complex skills
- Supplemental Behaviors - Easy to add parallel/sequence actions