Initial commit: jibo-llm hotword-triggered agent
Hotword-triggered LLM conversation loop for Jibo with tool-calling agent loop, ESML expressive speech, web search/fetch, and per-conversation abort handling.
9
.env.example
Normal file
@@ -0,0 +1,9 @@
# Jibo robot IP address
JIBO_IP=192.168.1.217

# LLM API configuration (OpenAI-compatible chat completions endpoint)
# LLM_BASE_URL is the base URL *without* /chat/completions
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_TOKEN=sk-your-api-key-here
LLM_MODEL_ID=gpt-4o
BRAVE_API_KEY=brave-api-key
5
.gitignore
vendored
Normal file
@@ -0,0 +1,5 @@
node_modules/
.env
.env.local
*.log
.DS_Store
291
README.md
Normal file
@@ -0,0 +1,291 @@
# jibo-llm

> **Give Jibo a brain again.** A hotword-triggered, LLM-powered conversational agent that turns Jibo into an expressive, tool-using social robot — complete with speech, vision, web search, animations, and more.



---

## Overview

**jibo-llm** connects a Jibo robot to any OpenAI-compatible LLM (GPT-4o, Claude, local models via Ollama/LM Studio, etc.) through a real-time agent loop. When someone says **"Hey Jibo"**, the system:

1. **Listens** for the user's speech via Jibo's on-board microphone.
2. **Sends** the transcript to an LLM along with a rich system prompt and tool definitions.
3. **Executes** tool calls the LLM makes — speaking, animating, taking photos, searching the web, and more.
4. **Loops** until the conversation naturally ends or the user triggers a new hotword.

Conversations are fully interruptible: saying "Hey Jibo" mid-conversation aborts the current exchange and starts a fresh one via `AbortController`.
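The interruption pattern can be sketched as follows. This is a minimal illustration with hypothetical names, not the exact code in `index.js`: each conversation owns an `AbortController`, and a new hotword aborts the previous one before starting over.

```javascript
// Sketch of per-conversation abort handling (illustrative names).
// A new hotword cancels any in-flight conversation, then starts fresh.
let current = null;

async function onHotword(runConversation) {
  if (current) current.abort();          // interrupt the running conversation
  const controller = new AbortController();
  current = controller;
  try {
    await runConversation(controller.signal);
  } catch (err) {
    if (err.code !== 'CONVERSATION_ABORTED') throw err; // aborts are expected
  } finally {
    if (current === controller) current = null;
  }
}
```

The conversation function is expected to watch `signal` and reject with a `CONVERSATION_ABORTED` error when it fires, so an interrupted conversation unwinds cleanly instead of surfacing as a crash.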

---

## Architecture

```
┌──────────────┐   hotword    ┌──────────────┐   tool calls   ┌───────────────┐
│  Jibo Robot  │ ──────────▶  │   index.js   │ ◀───────────▶  │  LLM (OpenAI  │
│  (rom-ctrl)  │ ◀──────────  │  Agent Loop  │                │  compatible)  │
│              │  say/listen  │              │                └───────────────┘
│  • mic       │  photo/look  │   tools.js   │   web search   ┌───────────────┐
│  • speaker   │  display     │  (executor)  │ ─────────────▶ │  Brave Search │
│  • camera    │              │              │                └───────────────┘
│  • screen    │              │ esml-ref.js  │
│  • motors    │              │ (prompt ref) │
└──────────────┘              └──────────────┘
```

| File | Purpose |
|------|---------|
| `index.js` | Entry point — connects to Jibo, listens for the hotword, and runs the agent loop with the LLM. |
| `tools.js` | Defines all tool schemas (OpenAI function-calling format) and the `executeTool()` dispatcher. |
| `esml-reference.js` | ESML (Embodied Speech Markup Language) cheat sheet injected into the system prompt so the LLM knows how to animate Jibo expressively. |

---

## Features

- 🗣️ **Natural conversation** — multi-turn dialogue with speech recognition and TTS.
- 🎭 **Expressive animations** — the LLM uses ESML tags to trigger emotions, dances, emojis, and sound effects inline with speech.
- 📷 **Vision** — Jibo can take photos and the LLM receives the image for visual understanding.
- 🔍 **Web search** — real-time Brave Search integration for up-to-date answers.
- 🌐 **URL fetching** — reads web pages (with Cloudflare Markdown for Agents support) so Jibo can summarize articles.
- 🖥️ **Display control** — show text, images, or restore the default eye on Jibo's screen.
- 🤖 **Head movement** — point Jibo's head at specific angles (yaw / pitch).
- 🔊 **Volume control** — adjust speaker volume on the fly.
- ⚡ **Interruptible** — a new hotword instantly aborts a running conversation via `AbortController`.
- 🔄 **Retry logic** — automatic retry with exponential backoff for transient LLM errors (429, 5xx, network).
- 🧹 **Context management** — old photos are pruned from context to control token cost.

---

## Prerequisites

- **Node.js** ≥ 18 (for native `fetch` and `AbortController`)
- **A Jibo robot** with int-developer mode enabled
- **An OpenAI-compatible API endpoint** (OpenAI, Anthropic via proxy, Ollama, LM Studio, etc.)
- *(Optional)* **Brave Search API key** for the `web_search` tool

---

## Quick Start

### 1. Clone & install

```bash
git clone https://github.com/niceduckdev/jibo-llm.git
cd jibo-llm
npm install
```

### 2. Configure environment

```bash
cp .env.example .env
```

Edit `.env` with your values:

```env
# Jibo robot IP address on your local network
JIBO_IP=192.168.1.217

# LLM API configuration (any OpenAI-compatible endpoint)
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_TOKEN=sk-your-api-key-here
LLM_MODEL_ID=gpt-4o

# Optional: enables the web_search tool
BRAVE_API_KEY=your-brave-api-key
```

### 3. Run

```bash
npm start
# or: node index.js
```

You'll see:

```
[jibo-llm] Connecting to Jibo at 192.168.1.217…
[jibo-llm] Connected — session abc123
[jibo-llm] Ready — listening for "Hey Jibo"…
```

Say **"Hey Jibo"** and start talking!

---

## Configuration

All configuration is done via environment variables (loaded from `.env` by [dotenv](https://www.npmjs.com/package/dotenv)):

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `JIBO_IP` | No | `192.168.1.217` | Jibo's IP address on your LAN |
| `LLM_BASE_URL` | No | `https://api.openai.com/v1` | Base URL for the chat completions API |
| `LLM_API_TOKEN` | **Yes** | — | API key for the LLM provider |
| `LLM_MODEL_ID` | No | `gpt-4o` | Model identifier to use |
| `BRAVE_API_KEY` | No | — | Brave Search API key (enables the `web_search` tool) |

### Using alternative LLM providers

Since jibo-llm uses the OpenAI SDK, any provider with a compatible chat completions endpoint works:

```env
# Ollama (local)
LLM_BASE_URL=http://localhost:11434/v1
LLM_API_TOKEN=ollama
LLM_MODEL_ID=llama3

# LM Studio (local)
LLM_BASE_URL=http://localhost:1234/v1
LLM_API_TOKEN=lm-studio
LLM_MODEL_ID=local-model

# OpenRouter
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_API_TOKEN=sk-or-...
LLM_MODEL_ID=anthropic/claude-sonnet-4
```

---

## Available Tools

The LLM can call any of these tools during a conversation:

### Communication
| Tool | Description |
|------|-------------|
| `say` | Speak ESML-formatted text through Jibo's speaker. Queued and chained so multiple `say` calls play in order. |
| `listen` | Open the microphone and transcribe user speech. Waits for pending speech to finish first. |
| `end_conversation` | Gracefully end the conversation (no further listening). |

### Camera
| Tool | Description |
|------|-------------|
| `take_photo` | Capture a photo from Jibo's camera. The image is sent to the LLM as a base64 JPEG for visual understanding. |

### Display
| Tool | Description |
|------|-------------|
| `show_text` | Display word-wrapped text on Jibo's screen. |
| `show_image` | Display an image from a URL on Jibo's screen. |
| `show_eye` | Restore the default eye animation. |

### Movement
| Tool | Description |
|------|-------------|
| `look_at_angle` | Turn Jibo's head — `theta` (yaw, ±180°) and `psi` (pitch, ±30°). |

### Audio
| Tool | Description |
|------|-------------|
| `set_volume` | Set speaker volume from 0.0 to 1.0. |

### Web
| Tool | Description |
|------|-------------|
| `web_search` | Search the web via the Brave Search API. Supports result count and freshness filters. |
| `fetch_url` | Fetch and read a web page. Prefers markdown via Cloudflare content negotiation, falls back to HTML→text conversion. |
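The markdown-first strategy behind `fetch_url` can be sketched as below. These are hypothetical helper names for illustration; the actual logic lives in `tools.js` and is not shown in this diff. The idea: request Markdown via the `Accept` header, and if the server returns HTML anyway, fall back to a crude HTML→text conversion.

```javascript
// Illustrative sketch of the fetch_url fallback (not the shipped tools.js code).
// Crude HTML→text: drop scripts/styles, strip tags, collapse whitespace.
function htmlToText(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>|<style[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')   // strip remaining tags
    .replace(/\s+/g, ' ')
    .trim();
}

// If the server honored `Accept: text/markdown`, use the body as-is;
// otherwise convert the HTML to plain text.
function extractReadable(contentType, body) {
  if (/markdown/i.test(contentType)) return body;
  return htmlToText(body);
}
```

The `fetch` call itself would send `Accept: text/markdown` and pass the response `Content-Type` plus body into `extractReadable`.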

---

## ESML (Embodied Speech Markup Language)

ESML is how Jibo speaks expressively. The system prompt includes a full reference (`esml-reference.js`) that teaches the LLM to use these tags inside `say` calls:

```xml
<!-- Emotional reaction (most common pattern) -->
<anim cat='happy' nonBlocking='true' endNeutral='true'/> That's great news!

<!-- Voice sound (laugh, sigh, greeting) -->
<ssa cat='laughing' nonBlocking='true'/> That's hilarious!

<!-- Sound effect -->
<sfx cat='drumroll'/> And the answer is...

<!-- Dance (always needs a filter) -->
<anim cat='dance' filter='music, rom-silly'/> Watch this!

<!-- Emoji on screen -->
<anim cat='emoji' filter='!(hf), &(heart)' nonBlocking='true'/> I love that!

<!-- Dramatic pause -->
And then... <break size='1.0'/> nothing happened.
```

A `sanitizeForTTS()` function in `tools.js` provides defense-in-depth by stripping markdown, LaTeX, and invalid tags before they reach Jibo's TTS engine.
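The shipped sanitizer is not shown in this diff chunk; a minimal sketch of what such a function might look like follows. The regexes and the tag whitelist are illustrative assumptions, not the actual `tools.js` implementation.

```javascript
// Illustrative sketch only — the real sanitizeForTTS() lives in tools.js and
// may differ. Strips markdown emphasis, code spans, LaTeX math, and any
// XML-ish tag that is not a known ESML tag.
const ESML_TAGS = ['anim', 'es', 'ssa', 'sfx', 'break', 'style', 'pitch',
  'duration', 'say-as', 'phoneme'];               // assumed tag whitelist

function sanitizeForTTS(text) {
  return text
    .replace(/```[\s\S]*?```/g, ' ')              // code fences
    .replace(/`([^`]*)`/g, '$1')                  // inline code
    .replace(/\*\*([^*]+)\*\*|\*([^*]+)\*/g, (m, a, b) => a || b) // bold/italic
    .replace(/\$[^$]*\$|\\\([\s\S]*?\\\)/g, ' ')  // LaTeX math
    .replace(/<\/?([a-zA-Z-]+)([^>]*)>/g, (m, name) =>
      ESML_TAGS.includes(name) ? m : ' ')         // drop unknown tags, keep ESML
    .replace(/\s+/g, ' ')
    .trim();
}
```

Running the README's "bad" example through a sanitizer like this yields speakable plain text while leaving legitimate ESML tags untouched.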

---

## How the Agent Loop Works

```
User says "Hey Jibo" ──▶ hotword event fires
        │
        ▼
Play acknowledgment animation
        │
        ▼
Listen for initial speech (15s timeout)
        │
        ▼
Build message history [system prompt, user text]
        │
        ▼
┌─── Agent Loop (max 25 turns) ◀──────────┐
│                                         │
│  1. Prune old images from context       │
│  2. Call LLM                            │
│  3. If no tool calls → done             │
│  4. Sort tools: say → actions → listen  │
│  5. Execute each tool                   │
│  6. Push results to messages            │
│  7. If end_conversation → done          │
│  8. Loop ───────────────────────────────┘
        │
        ▼
Conversation complete
Resume hotword listening
```

Key behaviors:
- **Speech chaining**: Multiple `say` calls are queued via a promise chain so they play sequentially without overlap.
- **Tool ordering**: `say` executes first, then actions (photo, search, etc.), then `listen`/`end_conversation` last.
- **Graceful limits**: At turn 24 of 25, a system message nudges the LLM to wrap up.
- **Image pruning**: Only the 2 most recent photos are kept in context to manage token usage.
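The speech-chaining behavior reduces to a small promise queue. This is an illustrative sketch with a hypothetical factory name; in `index.js` the chain lives on a per-conversation `ctx` object rather than a closure.

```javascript
// Sketch of promise-chain speech queueing (hypothetical helper).
// Each say() appends to the chain, so utterances play strictly in order
// and a failed utterance doesn't silence the ones queued after it.
function makeSpeechQueue(speak) {
  let chain = Promise.resolve();
  return function say(text) {
    chain = chain.then(() => speak(text), () => speak(text)); // continue past errors
    return chain;
  };
}
```

Here `speak` stands in for the underlying robot TTS call; calling `say('a'); say('b')` guarantees `b` starts only after `a` has finished, even if the two calls are issued back-to-back in the same turn.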

---

## Project Structure

```
jibo-llm/
├── .env.example        # Template for environment variables
├── .env                # Your local config (git-ignored)
├── index.js            # Entry point: connection, hotword handling, agent loop
├── tools.js            # Tool schemas + executeTool() dispatcher
├── esml-reference.js   # ESML documentation injected into the system prompt
├── package.json        # Dependencies and scripts
└── node_modules/       # Installed dependencies
```

---

## Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| [rom-control](https://github.com/niceduckdev/rom-control) | ^2.0.1 | Jibo robot control client (speech, camera, display, motors) |
| [openai](https://www.npmjs.com/package/openai) | ^4.73.0 | OpenAI-compatible chat completions SDK |
| [dotenv](https://www.npmjs.com/package/dotenv) | ^16.4.5 | Load `.env` configuration |

---

## License

MIT
228
esml-reference.js
Normal file
@@ -0,0 +1,228 @@
/**
 * ESML (Embodied Speech Markup Language) reference for the LLM system prompt.
 *
 * Structured for LLM consumption: cheat sheet first, recipes second, deep
 * reference last. Front-loaded examples bias the model toward correct output.
 */

module.exports = `
# ESML — How Jibo Speaks Expressively

Every \`say\` call's \`text\` is ESML: plain text plus a small set of XML-style
tags that trigger animations, sounds, and voice modulation. **Plain text alone
works fine** — Jibo's auto-tagger adds basic animations. Use tags to make him
expressive on purpose.

---

## ⚡ QUICK-START — copy these patterns

These cover ~95% of what you actually need. Prefer them over inventing tags.

### Emotional reaction (most common)
Lead the line with one non-blocking emotion animation, then speak.
\`\`\`
<anim cat='happy' nonBlocking='true' endNeutral='true'/> Yay, that worked!
<anim cat='surprised' nonBlocking='true' endNeutral='true'/> Whoa, really?
<anim cat='confused' nonBlocking='true' endNeutral='true'/> Hmm, I'm not sure.
<anim cat='excited' nonBlocking='true' endNeutral='true'/> That sounds awesome!
<anim cat='sad' nonBlocking='true' endNeutral='true'/> Aww, I'm sorry to hear that.
<anim cat='proud' nonBlocking='true' endNeutral='true'/> I did it!
<anim cat='curious' nonBlocking='true' endNeutral='true'/> Oh? Tell me more.
\`\`\`

### Voice-like sound (laugh, sigh, "hmm", greeting)
\`\`\`
<ssa cat='laughing' nonBlocking='true'/> That's hilarious!
<ssa cat='thinking'/> Let me think about that...
<ssa cat='hello' nonBlocking='true'/> Hi there!
<ssa cat='goodbye' nonBlocking='true'/> Talk to you later!
<ssa cat='surprised' nonBlocking='true'/> Oh wow!
\`\`\`

### Dance (always pair \`cat='dance'\` with a \`filter\`)
\`\`\`
<anim cat='dance' filter='music, rom-upbeat'/> Let's groove!
<anim cat='dance' filter='music, rom-silly'/> Watch this one!
<anim cat='dance' filter='music, rom-twerk'/>
<anim cat='dance' filter='!(music), &(rom-upbeat)'/> Dancing without music.
\`\`\`

### Sound effect
\`\`\`
<sfx cat='drumroll'/> And the winner is... you!
<sfx cat='sparkles'/> Ta-da!
<sfx cat='whoosh'/> Off we go!
\`\`\`

### Emoji on screen + speech
Always use \`filter='!(hf), &(<emoji-name>)'\` and non-blocking.
\`\`\`
<anim cat='emoji' filter='!(hf), &(heart)' nonBlocking='true'/> I love that!
<anim cat='emoji' filter='!(hf), &(pizza)' nonBlocking='true'/> Pizza time!
<anim cat='emoji' filter='!(hf), &(party)' nonBlocking='true'/> Let's celebrate!
\`\`\`

### Pause / pacing
\`\`\`
And then... <break size='1.0'/> nothing happened.
\`\`\`

### Speaking style
\`\`\`
<style set='enthusiastic'> That's amazing! </style>
<style set='confused'> Wait, what? </style>
<style set='confident'> I've got this. </style>
\`\`\`

---

## ✅ DO / ❌ DON'T

✅ DO start most emotional lines with \`<anim cat='X' nonBlocking='true' endNeutral='true'/>\`.
✅ DO use \`cat='...'\` selectors — they pick a random valid animation for you.
✅ DO use \`<ssa>\` for voice-like sounds (laughs, sighs) and \`<sfx>\` for noises (drumroll, whoosh).
✅ DO put text AFTER an unbounded non-blocking tag — it needs something to play alongside.
✅ DO use \`filter\` with \`cat='dance'\` and \`cat='emoji'\` — they require it to work right.

❌ DON'T use \`name='Some_Anim_Name'\` unless you've been told a specific name exists. Use \`cat\` instead.
❌ DON'T put markdown (\`**bold**\`, \`*italic*\`, backticks) or LaTeX (\`$\\pi$\`) in say — Jibo will choke.
❌ DON'T close tags you didn't open (no stray \`</es>\`, \`</anim>\`).
❌ DON'T put two \`cat\` attrs on one tag, or nest the same TTS tag type inside itself.
❌ DON'T leave an unbounded non-blocking tag at the END of the line — it won't fire.
❌ DON'T use bounded mode (\`<ssa cat='happy'>text</ssa>\`) on \`<ssa>\` or \`<sfx>\` — they have fixed durations.

---

## 🎬 WORKED EXAMPLES

User: "Tell me a joke."
Good:
\`\`\`
<anim cat='excited' nonBlocking='true' endNeutral='true'/> Okay, here's one!
<break size='0.4'/> Why don't scientists trust atoms?
<break size='0.6'/> Because they make up everything!
<ssa cat='laughing' nonBlocking='true'/> Get it?
\`\`\`

User: "I had a really bad day."
Good:
\`\`\`
<anim cat='sad' nonBlocking='true' endNeutral='true'/> Oh no, I'm so sorry.
<break size='0.3'/> Want to tell me what happened?
\`\`\`

User: "Show me a dance."
Good:
\`\`\`
<anim cat='excited' nonBlocking='true' endNeutral='true'/> You got it!
<anim cat='dance' filter='music, rom-silly'/>
\`\`\`

User: "What's pi?"
Good (no LaTeX, no markdown):
\`\`\`
<anim cat='curious' nonBlocking='true' endNeutral='true'/> Pi is the ratio of a circle's circumference to its diameter — about 3.14159, and the digits go on forever!
\`\`\`
Bad (would break the TTS):
\`\`\`
Pi (\$\\pi\$) is *irrational* — its digits go on **forever**! </es>
\`\`\`

---

## 🧩 ANIMATION CATEGORIES (use with \`cat='...'\`)

Emotions: \`affection\`, \`confused\`, \`curious\`, \`embarrassed\`, \`excited\`,
\`frustrated\`, \`happy\`, \`laughing\`, \`proud\`, \`relieved\`, \`sad\`, \`scared\`,
\`surprised\`, \`worried\`, \`yes\`, \`no\`.

Special: \`dance\` (needs filter), \`emoji\` (needs filter).

## 🔊 SSA CATEGORIES (voice-like sounds, use with \`<ssa cat='...'/>\`)

\`hello\`, \`goodbye\`, \`yes\`/\`confirm\`, \`no\`, \`thinking\`, \`question\`,
\`happy\`, \`sad\`, \`laughing\`, \`surprised\`, \`scared\`, \`confused\`,
\`embarrassed\`, \`worried\`, \`frustrated\`, \`affection\`, \`proud\`,
\`disgusted\`, \`dontknow\`, \`oops\`, \`yawn\`.

## 💥 SFX CATEGORIES (sound effects, use with \`<sfx cat='...'/>\`)

\`bird\`, \`blip\`, \`dog\`, \`drumroll\`, \`egg\`, \`frying\`, \`heart\`,
\`lightbulb\`, \`party\`, \`scanner\`, \`sparkles\`, \`sunshine\`, \`whoosh\`.

## 💃 DANCE FILTERS (use with \`cat='dance'\`)

With music: \`music, rom-upbeat\` · \`music, rom-ballroom\` · \`music, rom-silly\` ·
\`music, rom-slowdance\` · \`music, rom-eletronic\` · \`music, rom-twerk\`.
Silent: \`!(music), &(rom-upbeat)\`.

## 😀 EMOJI NAMES (use with \`cat='emoji' filter='!(hf), &(NAME)'\`)

Sports: airplane, basketball, bicycle, disco-spin, football, soccer, trophy, video-game.
Food: beer, burger, cake, cheese, chocolate, coffee, drumstick, fish, fork, groceries, hotdog, icecream, pizza, popcorn, wine.
Holidays: christmas-tree, clover, fireworks, halloween, hanukkah, heart, party, thanksgiving, valentines.
Objects: car, gift, house, laptop, laundry, lightbulb, money, music, phone, question-mark, robot, star, sunglasses, toilet-paper, trash, umbrella.
Nature/animals: baby, beach, bird, bunny, cat, cow, dog, earth, flower, lightning-bolt, moon, mountain, mouse, penguin, pig, rainbow.

---

## 📚 DEEP REFERENCE (only when the cheat sheet isn't enough)

### Tag types

| Tag | Purpose |
|-----|---------|
| \`<anim>\` | Animation, excludes \`ssa-only\`/\`sfx-only\` (general gestures/poses) |
| \`<es>\` | Animation, no filtering — use only with a known \`name=\` |
| \`<ssa>\` | Voice-like audio (laughs, sighs, hellos) |
| \`<sfx>\` | Sound effects |
| \`<break size='N'/>\` | Pause for N seconds (e.g. \`size='0.6'\`) |
| \`<style set='...'/>\` | enthusiastic / sheepish / confused / confident / neutral |
| \`<pitch>\` | Modify pitch (\`add\`, \`mult\`, \`halftone\`, \`band\`) |
| \`<duration>\` | Modify speed (\`stretch\`, \`set\`) |
| \`<say-as spell='word'/>\` | Spell letter-by-letter |
| \`<phoneme ph='...'/>\` | Exact phonetic pronunciation |

### Animation tag attributes

- \`cat='X'\` — random animation from category (PREFERRED).
- \`name='X'\` — exact AnimDB name (only if you know it exists).
- \`filter='...'\` — narrow by meta-terms; required for \`dance\` and \`emoji\`.
  - \`a, b\` (or \`&(a,b)\`) — must include all
  - \`?a, ?b\` — at least one of
  - \`!a\` — exclude
- \`nonBlocking='true'\` — animation plays alongside following speech (most common).
- \`loop=N\` — \`0\` fits the loop count to bounded text; \`>=1\` plays N times.
- \`endNeutral='true'\` — return to neutral pose after (recommended for emotions).
- \`layers='body,screen,audio'\` — restrict which MetaLayers are used.

### Three playback modes

- **Blocking** — \`<es name='X'/>\` with no inner text and no \`nonBlocking\`.
  Speech pauses while it plays.
- **Bounded non-blocking** — \`<anim cat='happy'>text inside</anim>\`. Animation
  is time-stretched to match the wrapped speech. Don't use with \`<ssa>\`/\`<sfx>\`.
- **Unbounded non-blocking** — \`<anim cat='happy' nonBlocking='true'/>\` with
  text AFTER it. Plays at native length while speech continues. **The text to
  the right is required**, otherwise the tag never fires.

### MetaLayers

Two animations may run at once only if they occupy different layers: \`body\`,
\`screen\` (eye/overlay/pixi/background), \`audio\`.

---

## 🛡️ HARD RULES

1. Plain text is always valid. When in doubt, just speak plainly.
2. Prefer \`cat='...'\` over \`name='...'\` — \`name\` requires an exact AnimDB id.
3. Unbounded non-blocking tags MUST have text to their right.
4. \`cat='dance'\` and \`cat='emoji'\` require a \`filter\` attribute.
5. \`<ssa>\` and \`<sfx>\` are fixed-duration — never wrap them around text.
6. One \`cat\` per tag. Don't nest the same TTS tag type inside itself.
7. NEVER emit markdown (\`*\`, \`**\`, \`_\`, backticks, code fences) or LaTeX
   (\`$...$\`, \`\\(...\\)\`) inside \`say\` text. The TTS engine will hang.
8. NEVER emit closing tags for things you didn't open (\`</es>\`, etc.).
`;
426
index.js
Normal file
@@ -0,0 +1,426 @@
|
||||
require('dotenv').config();
|
||||
const { Client, AttentionMode } = require('rom-control');
|
||||
const OpenAI = require('openai');
|
||||
const { TOOL_SCHEMAS, executeTool, wrapForScreen } = require('./tools');
|
||||
const ESML_REFERENCE = require('./esml-reference');
|
||||
|
||||
// ── Config ─────────────────────────────────────────────────────────────────────
|
||||
const JIBO_IP = process.env.JIBO_IP || '192.168.1.217';
|
||||
const LLM_BASE_URL = process.env.LLM_BASE_URL || 'https://api.openai.com/v1';
|
||||
const LLM_API_TOKEN = process.env.LLM_API_TOKEN;
|
||||
const LLM_MODEL_ID = process.env.LLM_MODEL_ID || 'gpt-4o';
|
||||
|
||||
if (!LLM_API_TOKEN) {
|
||||
console.error('ERROR: LLM_API_TOKEN is not set. Copy .env.example to .env and fill it in.');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
const openai = new OpenAI({
|
||||
apiKey: LLM_API_TOKEN,
|
||||
baseURL: LLM_BASE_URL,
|
||||
});
|
||||
|
||||
// ── System prompt ──────────────────────────────────────────────────────────────
|
||||
const SYSTEM_PROMPT = [
|
||||
'You are Jibo, a friendly, warm, expressive social robot with a physical body.',
|
||||
'You have a camera, a screen, a speaker, and a motorized head.',
|
||||
'',
|
||||
'═══ HOW TO TALK (READ THIS FIRST) ═══',
|
||||
'Every "say" call\'s `text` is ESML — plain words plus expressive tags.',
|
||||
'Almost every spoken line should LEAD with one expressive tag, then the words.',
|
||||
'You are a robot with a body, not a chatbot — show emotion through animation.',
|
||||
'',
|
||||
'Default template for any normal reply:',
|
||||
' <anim cat=\'EMOTION\' nonBlocking=\'true\' endNeutral=\'true\'/> The actual words.',
|
||||
' …where EMOTION is one of: happy, excited, curious, surprised, confused,',
|
||||
' proud, sad, affection, laughing, worried, scared, frustrated, embarrassed,',
|
||||
' yes, no.',
|
||||
'',
|
||||
'Other go-to patterns (pick the one that fits):',
|
||||
' • Voice sound first: <ssa cat=\'thinking\'/> Hmm, let me think…',
|
||||
' • Greet/farewell: <ssa cat=\'hello\' nonBlocking=\'true\'/> Hi there!',
|
||||
' • Celebrate w/ emoji: <anim cat=\'emoji\' filter=\'!(hf), &(party)\' nonBlocking=\'true\'/> Yay!',
|
||||
' • Dance request: say a quick line, then a separate say with',
|
||||
' <anim cat=\'dance\' filter=\'music, rom-silly\'/>',
|
||||
' • Sound effect: <sfx cat=\'drumroll\'/> And the answer is…',
|
||||
' • Drama beat: A pause… <break size=\'0.6\'/> like that.',
|
||||
'',
|
||||
'HARD RULES for `say` text:',
|
||||
' 1. NO markdown anywhere: no *italics*, **bold**, _underscores_, backticks, code fences.',
|
||||
' 2. NO LaTeX: no $...$, no \\(...\\), no \\frac{}, no math markup. Spell numbers/symbols out.',
|
||||
' 3. NO closing tags you did not open (no stray </es>, </anim>).',
|
||||
' 4. Use cat=\'...\' (random valid animation) over name=\'...\' unless you know the exact name.',
|
||||
' 5. Unbounded non-blocking tags MUST have text to their right or they will not fire.',
|
||||
' 6. cat=\'dance\' and cat=\'emoji\' REQUIRE a filter attribute.',
|
||||
' 7. <ssa> and <sfx> have fixed durations — never wrap text inside them.',
|
||||
' 8. Keep each `say` call under 500 characters; split long replies into multiple `say` calls.',
|
||||
'',
|
||||
'═══ INTERACTION MODEL ═══',
|
||||
'• "say" — speak (ESML). You can call it multiple times in one turn; they\'ll be',
|
||||
' spoken in order. Other tools (search, fetch, look) run in parallel with speech.',
|
||||
'• "listen" — open the mic for the user\'s reply. Always call this after speaking',
|
||||
' unless the conversation has clearly ended.',
|
||||
'• "end_conversation" — call this (NOT listen) after a farewell to end gracefully.',
|
||||
'',
|
||||
'═══ OTHER TOOLS ═══',
|
||||
'• "take_photo" — see what\'s in front of you (image returned to you).',
|
||||
'• "show_text" — put short text on the screen (auto-wrapped).',
|
||||
'• "show_image" — display an image URL on the screen.',
|
||||
'• "show_eye" — restore the default eye animation on screen.',
|
||||
'• "look_at_angle" — turn the head: theta=yaw ±180°, psi=pitch ±30°.',
|
||||
'• "set_volume" — 0.0 to 1.0.',
|
||||
'• "web_search" — Brave search; use whenever you\'re unsure of a fact or need fresh info.',
|
||||
'• "fetch_url" — read a specific page (often follows web_search).',
|
||||
'',
|
||||
'═══ STYLE ═══',
|
||||
'• Be personable, concise, expressive — a few sentences, not an essay.',
|
||||
'• Animate every emotional line; vary your reactions so they feel alive.',
|
||||
'• If a tool errors, acknowledge it briefly and adapt.',
|
||||
'• If you searched the web, briefly tell the user what you found rather than dumping links.',
|
||||
].join('\n') + '\n\n' + ESML_REFERENCE;
|
||||
|
||||
const MAX_AGENT_TURNS = 25; // safety limit
|
||||
const MAX_IMAGES_IN_CONTEXT = 2; // prune older photo messages to control cost
|
||||
const LLM_MAX_RETRIES = 2;
|
||||
|
||||
// ── Abort helper ───────────────────────────────────────────────────────────────
|
||||
|
||||
/** Throw if the signal is already aborted. */
|
||||
function throwIfAborted(signal) {
|
||||
if (signal?.aborted) {
|
||||
const err = new Error('Conversation aborted');
|
||||
err.code = 'CONVERSATION_ABORTED';
|
||||
throw err;
|
||||
}
|
||||
}
|
||||
|
||||
/** Return a promise that rejects when the signal fires. */
|
||||
function onAbort(signal) {
|
||||
if (!signal) return new Promise(() => { });
|
||||
return new Promise((_, reject) => {
|
||||
const handler = () => {
|
||||
const err = new Error('Conversation aborted');
|
||||
err.code = 'CONVERSATION_ABORTED';
|
||||
reject(err);
|
||||
};
|
||||
if (signal.aborted) return handler();
|
||||
signal.addEventListener('abort', handler, { once: true });
|
||||
});
|
||||
}
|
||||
|
||||
/** Sleep that rejects on abort. */
|
||||
function sleep(ms, signal) {
|
||||
return new Promise((resolve, reject) => {
|
||||
const t = setTimeout(resolve, ms);
|
||||
signal?.addEventListener(
|
||||
'abort',
|
||||
() => {
|
||||
clearTimeout(t);
|
||||
const err = new Error('Conversation aborted');
|
||||
err.code = 'CONVERSATION_ABORTED';
|
||||
reject(err);
|
||||
},
|
||||
{ once: true },
|
||||
);
|
||||
});
|
||||
}
|
||||
|
||||
/** True for HTTP 429 / 5xx / network-class errors that benefit from retry. */
|
||||
function isTransientLLMError(err) {
|
||||
if (!err) return false;
|
||||
if (err.code === 'CONVERSATION_ABORTED') return false;
|
||||
const status = err.status ?? err.response?.status;
|
||||
if (status === 429) return true;
|
||||
if (typeof status === 'number' && status >= 500) return true;
|
||||
// network-class
|
||||
return ['ECONNRESET', 'ETIMEDOUT', 'ENOTFOUND', 'EAI_AGAIN'].includes(err.code);
|
||||
}
|
||||
|
||||
/** Drop image_url blocks from old user messages, keeping only the most recent N. */
|
||||
function pruneOldImages(messages, keep) {
|
||||
const imageMsgIndices = [];
|
||||
for (let i = 0; i < messages.length; i++) {
|
||||
const m = messages[i];
|
||||
if (m.role === 'user' && Array.isArray(m.content) &&
|
||||
m.content.some((c) => c?.type === 'image_url')) {
|
||||
imageMsgIndices.push(i);
|
||||
}
|
||||
}
|
||||
const toStrip = imageMsgIndices.slice(0, Math.max(0, imageMsgIndices.length - keep));
|
||||
for (const i of toStrip) {
|
||||
const textParts = messages[i].content
|
||||
.filter((c) => c?.type === 'text')
|
||||
.map((c) => c.text);
|
||||
messages[i] = {
|
||||
role: 'user',
|
||||
content: (textParts.join(' ') || '[earlier photo omitted to save context]'),
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
/** Call the LLM with retry on transient errors. */
|
||||
async function callLLM(messages, signal) {
|
||||
let lastErr;
|
||||
for (let attempt = 0; attempt <= LLM_MAX_RETRIES; attempt++) {
|
||||
throwIfAborted(signal);
|
||||
try {
|
||||
return await openai.chat.completions.create(
|
||||
{
|
||||
model: LLM_MODEL_ID,
|
||||
messages,
|
||||
tools: TOOL_SCHEMAS,
|
||||
temperature: 0.8,
|
||||
},
|
||||
{ signal },
|
||||
);
|
||||
} catch (err) {
|
||||
lastErr = err;
|
||||
if (!isTransientLLMError(err) || attempt === LLM_MAX_RETRIES) throw err;
|
||||
const backoff = 500 * 2 ** attempt;
|
||||
console.warn(`[agent] LLM transient error (${err.status || err.code}); retrying in ${backoff}ms…`);
|
||||
await sleep(backoff, signal);
|
||||
}
|
||||
}
|
||||
throw lastErr;
|
||||
}
|
||||
|
||||
// ── Agent loop ─────────────────────────────────────────────────────────────────

/**
 * Run the tool-calling agent loop until the LLM stops calling tools.
 * Aborts immediately when `signal` fires.
 *
 * @param {import('rom-control').Client} client
 * @param {Array} messages Chat history (mutated in place)
 * @param {AbortSignal} signal Cancellation signal
 * @param {string} [initialHeard] Transcript of the utterance that started the conversation
 */
async function agentLoop(client, messages, signal, initialHeard) {
  let wrapUpInjected = false;
  const ctx = { speechChain: Promise.resolve(), lastHeard: initialHeard || '' };

  for (let turn = 0; turn < MAX_AGENT_TURNS; turn++) {
    throwIfAborted(signal);
    pruneOldImages(messages, MAX_IMAGES_IN_CONTEXT);
    console.log(`[agent] turn ${turn + 1} — calling LLM…`);

    let response;
    try {
      const heard = (ctx.lastHeard || '').trim();
      const raw = heard
        ? `Heard: "${heard}"\n\nProcessing...`
        : 'Processing...';
      client.display.showText(wrapForScreen(raw, 40, 10));
    } catch (_) { }
    try {
      response = await callLLM(messages, signal);
    } finally {
      try { client.display.showEye(); } catch (_) { }
    }
    const assistantMsg = response.choices[0].message;
    messages.push(assistantMsg);

    // Surface any inner-monologue text the model emitted alongside tool calls.
    if (assistantMsg.content && typeof assistantMsg.content === 'string') {
      console.log(`[agent] assistant: ${assistantMsg.content.slice(0, 200)}`);
    }

    const toolCalls = assistantMsg.tool_calls;

    // ── No tool calls → conversation turn complete ────────────────────────
    if (!toolCalls || toolCalls.length === 0) {
      console.log('[agent] loop complete (no tool calls).');
      await ctx.speechChain.catch(() => { });
      return;
    }

    // ── Execute tool calls sequentially ──────────────────────────────────
    // Order: say → other actions → listen/end_conversation last.
    const sorted = [...toolCalls].sort((a, b) => {
      const priority = (tc) => {
        const n = tc.function.name;
        if (n === 'say') return 0;
        if (n === 'listen' || n === 'end_conversation') return 2;
        return 1;
      };
      return priority(a) - priority(b);
    });
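    // Example: a turn with calls [listen, say, show_text] executes as
    // [say, show_text, listen] — speech first, listening last.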

    let endRequested = false;

    for (const tc of sorted) {
      throwIfAborted(signal);

      let args;
      let parseError = null;
      try {
        args = tc.function.arguments ? JSON.parse(tc.function.arguments) : {};
      } catch (e) {
        parseError = e.message;
        args = {};
      }

      let result;
      if (parseError) {
        console.error(`  [tool:${tc.function.name}] bad JSON args:`, parseError);
        result = {
          content: `Error: tool arguments were not valid JSON (${parseError}). ` +
            `Please retry with well-formed arguments.`,
        };
      } else {
        try {
          result = await executeTool(client, tc.function.name, args, signal, ctx);
        } catch (err) {
          if (err.code === 'CONVERSATION_ABORTED') throw err;
          console.error(`  [tool:${tc.function.name}] error:`, err.message);
          result = { content: `Error: ${err.message}` };
        }
      }

      messages.push({
        role: 'tool',
        tool_call_id: tc.id,
        content: result.content,
      });

      // Photo: emit as a follow-up user message (tool messages can't carry images).
      if (result.image) {
        messages.push({
          role: 'user',
          content: [
            { type: 'text', text: "Photo from Jibo's camera:" },
            {
              type: 'image_url',
              image_url: { url: `data:image/jpeg;base64,${result.image}` },
            },
          ],
        });
      }

      if (result.endConversation) endRequested = true;
    }

    if (endRequested) {
      console.log('[agent] end_conversation requested — exiting loop.');
      await ctx.speechChain.catch(() => { });
      return;
    }

    // Approaching the safety limit: nudge the model to wrap up gracefully
    // on its next turn instead of getting cut off mid-thought.
    if (!wrapUpInjected && turn === MAX_AGENT_TURNS - 2) {
      messages.push({
        role: 'system',
        content:
          'You are about to hit the turn limit. On your next turn, give a brief ' +
          'farewell via "say" and call "end_conversation". Do not call "listen".',
      });
      wrapUpInjected = true;
    }
  }

  console.warn('[agent] hit MAX_AGENT_TURNS — forcing exit.');
  await ctx.speechChain.catch(() => { });
  try {
    await client.behavior.say("Let's pick this up another time. Bye!");
  } catch (_) { }
}
// ── Main ───────────────────────────────────────────────────────────────────────

async function main() {
  const client = new Client({ host: JIBO_IP, autoSubscribe: false });

  client.once('ready', () => {
    console.log(`[jibo-llm] Connected — session ${client.sessionID}`);
  });

  client.on('error', (err) => {
    console.error('[jibo-llm] Client error:', err.message);
  });

  // ── Connect ────────────────────────────────────────────────────────────────
  console.log(`[jibo-llm] Connecting to Jibo at ${JIBO_IP}…`);
  await client.connect();
  await client.behavior.setAttention(AttentionMode.Engaged);

  // Start wakeword listener
  client.audio.watchWakeword();
  console.log('[jibo-llm] Ready — listening for "Hey Jibo"…');

  // ── Hotword → agent conversation ───────────────────────────────────────────
  /** @type {AbortController|null} */
  let activeController = null;

  client.on('hotword', async (event) => {
    // ── Cancel any running conversation ──────────────────────────────────
    if (activeController) {
      console.log('[hotword] Aborting previous conversation…');
      activeController.abort();
      activeController = null;
    }

    const controller = new AbortController();
    activeController = controller;
    const { signal } = controller;

    console.log(`\n[hotword] "${event.utterance}" (score ${event.score})`);

    try {
      // Acknowledge
      throwIfAborted(signal);
      await Promise.race([
        client.behavior.playAnimCat('excited', { nonBlocking: true }),
        onAbort(signal),
      ]);

      // Listen for the user's initial speech
      throwIfAborted(signal);
      let userText;
      client.display.showText('Listening...');
      try {
        const speech = await Promise.race([
          client.audio.awaitSpeech({ mode: 'local', time: 15000 }),
          onAbort(signal),
        ]);
        userText = speech.content;
        console.log(`[jibo-llm] User said: "${userText}"`);
      } catch (err) {
        if (err.code === 'CONVERSATION_ABORTED') throw err;
        if (err.code === 'SPEECH_TIMEOUT') {
          throwIfAborted(signal);
          await client.behavior.say("I didn't hear anything. Talk to me anytime!");
          return;
        }
        throw err;
      } finally {
        client.display.showEye();
      }

      // Build initial message history and run the agent
      const messages = [
        { role: 'system', content: SYSTEM_PROMPT },
        { role: 'user', content: userText },
      ];

      await agentLoop(client, messages, signal, userText);
    } catch (err) {
      if (err.code === 'CONVERSATION_ABORTED') {
        console.log('[jibo-llm] Conversation was interrupted by new hotword.');
        return;
      }
      console.error('[jibo-llm] Agent error:', err.message);
      try { await client.behavior.say("Sorry, something went wrong."); } catch (_) { }
    } finally {
      // Only clear if we're still the active conversation
      if (activeController === controller) {
        activeController = null;
        console.log('[jibo-llm] Conversation ended. Listening for "Hey Jibo"…\n');
      }
    }
  });
}

main().catch((err) => {
  console.error('[jibo-llm] Fatal:', err);
  process.exit(1);
});
497
package-lock.json
generated
Normal file
@@ -0,0 +1,497 @@
{
  "name": "jibo-llm",
  "version": "1.0.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "jibo-llm",
      "version": "1.0.0",
      "dependencies": {
        "dotenv": "^16.4.5",
        "openai": "^4.73.0",
        "rom-control": "^2.0.1"
      }
    },
    "node_modules/@types/node": {
      "version": "18.19.130",
      "resolved": "https://registry.npmjs.org/@types/node/-/node-18.19.130.tgz",
      "integrity": "sha512-GRaXQx6jGfL8sKfaIDD6OupbIHBr9jv7Jnaml9tB7l4v068PAOXqfcujMMo5PhbIs6ggR1XODELqahT2R8v0fg==",
      "license": "MIT",
      "dependencies": {
        "undici-types": "~5.26.4"
      }
    },
    "node_modules/@types/node-fetch": {
      "version": "2.6.13",
      "resolved": "https://registry.npmjs.org/@types/node-fetch/-/node-fetch-2.6.13.tgz",
      "integrity": "sha512-QGpRVpzSaUs30JBSGPjOg4Uveu384erbHBoT1zeONvyCfwQxIkUshLAOqN/k9EjGviPRmWTTe6aH2qySWKTVSw==",
      "license": "MIT",
      "dependencies": {
        "@types/node": "*",
        "form-data": "^4.0.4"
      }
    },
    "node_modules/abort-controller": {
      "version": "3.0.0",
      "resolved": "https://registry.npmjs.org/abort-controller/-/abort-controller-3.0.0.tgz",
      "integrity": "sha512-h8lQ8tacZYnR3vNQTgibj+tODHI5/+l06Au2Pcriv/Gmet0eaj4TwWH41sO9wnHDiQsEj19q0drzdWdeAHtweg==",
      "license": "MIT",
      "dependencies": {
        "event-target-shim": "^5.0.0"
      },
      "engines": {
        "node": ">=6.5"
      }
    },
    "node_modules/agentkeepalive": {
      "version": "4.6.0",
      "resolved": "https://registry.npmjs.org/agentkeepalive/-/agentkeepalive-4.6.0.tgz",
      "integrity": "sha512-kja8j7PjmncONqaTsB8fQ+wE2mSU2DJ9D4XKoJ5PFWIdRMa6SLSN1ff4mOr4jCbfRSsxR4keIiySJU0N9T5hIQ==",
      "license": "MIT",
      "dependencies": {
        "humanize-ms": "^1.2.1"
      },
      "engines": {
        "node": ">= 8.0.0"
      }
    },
    "node_modules/asynckit": {
      "version": "0.4.0",
      "resolved": "https://registry.npmjs.org/asynckit/-/asynckit-0.4.0.tgz",
      "integrity": "sha512-Oei9OH4tRh0YqU3GxhX79dM/mwVgvbZJaSNaRk+bshkj0S5cfHcgYakreBjrHwatXKbz+IoIdYLxrKim2MjW0Q==",
      "license": "MIT"
    },
    "node_modules/call-bind-apply-helpers": {
      "version": "1.0.2",
      "resolved": "https://registry.npmjs.org/call-bind-apply-helpers/-/call-bind-apply-helpers-1.0.2.tgz",
      "integrity": "sha512-Sp1ablJ0ivDkSzjcaJdxEunN5/XvksFJ2sMBFfq6x0ryhQV/2b/KwFe21cMpmHtPOSij8K99/wSfoEuTObmuMQ==",
      "license": "MIT",
      "dependencies": {
        "es-errors": "^1.3.0",
        "function-bind": "^1.1.2"
      },
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/combined-stream": {
      "version": "1.0.8",
      "resolved": "https://registry.npmjs.org/combined-stream/-/combined-stream-1.0.8.tgz",
      "integrity": "sha512-FQN4MRfuJeHf7cBbBMJFXhKSDq+2kAArBlmRBvcvFE5BB1HZKXtSFASDhdlz9zOYwxh8lDdnvmMOe/+5cdoEdg==",
      "license": "MIT",
      "dependencies": {
        "delayed-stream": "~1.0.0"
      },
      "engines": {
        "node": ">= 0.8"
      }
    },
    "node_modules/delayed-stream": {
      "version": "1.0.0",
      "resolved": "https://registry.npmjs.org/delayed-stream/-/delayed-stream-1.0.0.tgz",
      "integrity": "sha512-ZySD7Nf91aLB0RxL4KGrKHBXl7Eds1DAmEdcoVawXnLD7SDhpNgtuII2aAkg7a7QS41jxPSZ17p4VdGnMHk3MQ==",
      "license": "MIT",
      "engines": {
        "node": ">=0.4.0"
      }
    },
    "node_modules/dotenv": {
      "version": "16.6.1",
      "resolved": "https://registry.npmjs.org/dotenv/-/dotenv-16.6.1.tgz",
      "integrity": "sha512-uBq4egWHTcTt33a72vpSG0z3HnPuIl6NqYcTrKEg2azoEyl2hpW0zqlxysq2pK9HlDIHyHyakeYaYnSAwd8bow==",
      "license": "BSD-2-Clause",
      "engines": {
        "node": ">=12"
      },
      "funding": {
        "url": "https://dotenvx.com"
      }
    },
    "node_modules/dunder-proto": {
      "version": "1.0.1",
      "resolved": "https://registry.npmjs.org/dunder-proto/-/dunder-proto-1.0.1.tgz",
      "integrity": "sha512-KIN/nDJBQRcXw0MLVhZE9iQHmG68qAVIBg9CqmUYjmQIhgij9U5MFvrqkUL5FbtyyzZuOeOt0zdeRe4UY7ct+A==",
      "license": "MIT",
      "dependencies": {
        "call-bind-apply-helpers": "^1.0.1",
        "es-errors": "^1.3.0",
        "gopd": "^1.2.0"
      },
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/es-define-property": {
      "version": "1.0.1",
      "resolved": "https://registry.npmjs.org/es-define-property/-/es-define-property-1.0.1.tgz",
      "integrity": "sha512-e3nRfgfUZ4rNGL232gUgX06QNyyez04KdjFrF+LTRoOXmrOgFKDg4BCdsjW8EnT69eqdYGmRpJwiPVYNrCaW3g==",
      "license": "MIT",
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/es-errors": {
      "version": "1.3.0",
      "resolved": "https://registry.npmjs.org/es-errors/-/es-errors-1.3.0.tgz",
      "integrity": "sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw==",
      "license": "MIT",
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/es-object-atoms": {
      "version": "1.1.1",
      "resolved": "https://registry.npmjs.org/es-object-atoms/-/es-object-atoms-1.1.1.tgz",
      "integrity": "sha512-FGgH2h8zKNim9ljj7dankFPcICIK9Cp5bm+c2gQSYePhpaG5+esrLODihIorn+Pe6FGJzWhXQotPv73jTaldXA==",
      "license": "MIT",
      "dependencies": {
        "es-errors": "^1.3.0"
      },
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/es-set-tostringtag": {
      "version": "2.1.0",
      "resolved": "https://registry.npmjs.org/es-set-tostringtag/-/es-set-tostringtag-2.1.0.tgz",
      "integrity": "sha512-j6vWzfrGVfyXxge+O0x5sh6cvxAog0a/4Rdd2K36zCMV5eJ+/+tOAngRO8cODMNWbVRdVlmGZQL2YS3yR8bIUA==",
      "license": "MIT",
      "dependencies": {
        "es-errors": "^1.3.0",
        "get-intrinsic": "^1.2.6",
        "has-tostringtag": "^1.0.2",
        "hasown": "^2.0.2"
      },
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/event-target-shim": {
      "version": "5.0.1",
      "resolved": "https://registry.npmjs.org/event-target-shim/-/event-target-shim-5.0.1.tgz",
      "integrity": "sha512-i/2XbnSz/uxRCU6+NdVJgKWDTM427+MqYbkQzD321DuCQJUqOuJKIA0IM2+W2xtYHdKOmZ4dR6fExsd4SXL+WQ==",
      "license": "MIT",
      "engines": {
        "node": ">=6"
      }
    },
    "node_modules/form-data": {
      "version": "4.0.5",
      "resolved": "https://registry.npmjs.org/form-data/-/form-data-4.0.5.tgz",
      "integrity": "sha512-8RipRLol37bNs2bhoV67fiTEvdTrbMUYcFTiy3+wuuOnUog2QBHCZWXDRijWQfAkhBj2Uf5UnVaiWwA5vdd82w==",
      "license": "MIT",
      "dependencies": {
        "asynckit": "^0.4.0",
        "combined-stream": "^1.0.8",
        "es-set-tostringtag": "^2.1.0",
        "hasown": "^2.0.2",
        "mime-types": "^2.1.12"
      },
      "engines": {
        "node": ">= 6"
      }
    },
    "node_modules/form-data-encoder": {
      "version": "1.7.2",
      "resolved": "https://registry.npmjs.org/form-data-encoder/-/form-data-encoder-1.7.2.tgz",
      "integrity": "sha512-qfqtYan3rxrnCk1VYaA4H+Ms9xdpPqvLZa6xmMgFvhO32x7/3J/ExcTd6qpxM0vH2GdMI+poehyBZvqfMTto8A==",
      "license": "MIT"
    },
    "node_modules/formdata-node": {
      "version": "4.4.1",
      "resolved": "https://registry.npmjs.org/formdata-node/-/formdata-node-4.4.1.tgz",
      "integrity": "sha512-0iirZp3uVDjVGt9p49aTaqjk84TrglENEDuqfdlZQ1roC9CWlPk6Avf8EEnZNcAqPonwkG35x4n3ww/1THYAeQ==",
      "license": "MIT",
      "dependencies": {
        "node-domexception": "1.0.0",
        "web-streams-polyfill": "4.0.0-beta.3"
      },
      "engines": {
        "node": ">= 12.20"
      }
    },
    "node_modules/function-bind": {
      "version": "1.1.2",
      "resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.2.tgz",
      "integrity": "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==",
      "license": "MIT",
      "funding": {
        "url": "https://github.com/sponsors/ljharb"
      }
    },
    "node_modules/get-intrinsic": {
      "version": "1.3.0",
      "resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.3.0.tgz",
      "integrity": "sha512-9fSjSaos/fRIVIp+xSJlE6lfwhES7LNtKaCBIamHsjr2na1BiABJPo0mOjjz8GJDURarmCPGqaiVg5mfjb98CQ==",
      "license": "MIT",
      "dependencies": {
        "call-bind-apply-helpers": "^1.0.2",
        "es-define-property": "^1.0.1",
        "es-errors": "^1.3.0",
        "es-object-atoms": "^1.1.1",
        "function-bind": "^1.1.2",
        "get-proto": "^1.0.1",
        "gopd": "^1.2.0",
        "has-symbols": "^1.1.0",
        "hasown": "^2.0.2",
        "math-intrinsics": "^1.1.0"
      },
      "engines": {
        "node": ">= 0.4"
      },
      "funding": {
        "url": "https://github.com/sponsors/ljharb"
      }
    },
    "node_modules/get-proto": {
      "version": "1.0.1",
      "resolved": "https://registry.npmjs.org/get-proto/-/get-proto-1.0.1.tgz",
      "integrity": "sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g==",
      "license": "MIT",
      "dependencies": {
        "dunder-proto": "^1.0.1",
        "es-object-atoms": "^1.0.0"
      },
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/gopd": {
      "version": "1.2.0",
      "resolved": "https://registry.npmjs.org/gopd/-/gopd-1.2.0.tgz",
      "integrity": "sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg==",
      "license": "MIT",
      "engines": {
        "node": ">= 0.4"
      },
      "funding": {
        "url": "https://github.com/sponsors/ljharb"
      }
    },
    "node_modules/has-symbols": {
      "version": "1.1.0",
      "resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.1.0.tgz",
      "integrity": "sha512-1cDNdwJ2Jaohmb3sg4OmKaMBwuC48sYni5HUw2DvsC8LjGTLK9h+eb1X6RyuOHe4hT0ULCW68iomhjUoKUqlPQ==",
      "license": "MIT",
      "engines": {
        "node": ">= 0.4"
      },
      "funding": {
        "url": "https://github.com/sponsors/ljharb"
      }
    },
    "node_modules/has-tostringtag": {
      "version": "1.0.2",
      "resolved": "https://registry.npmjs.org/has-tostringtag/-/has-tostringtag-1.0.2.tgz",
      "integrity": "sha512-NqADB8VjPFLM2V0VvHUewwwsw0ZWBaIdgo+ieHtK3hasLz4qeCRjYcqfB6AQrBggRKppKF8L52/VqdVsO47Dlw==",
      "license": "MIT",
      "dependencies": {
        "has-symbols": "^1.0.3"
      },
      "engines": {
        "node": ">= 0.4"
      },
      "funding": {
        "url": "https://github.com/sponsors/ljharb"
      }
    },
    "node_modules/hasown": {
      "version": "2.0.3",
      "resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.3.tgz",
      "integrity": "sha512-ej4AhfhfL2Q2zpMmLo7U1Uv9+PyhIZpgQLGT1F9miIGmiCJIoCgSmczFdrc97mWT4kVY72KA+WnnhJ5pghSvSg==",
      "license": "MIT",
      "dependencies": {
        "function-bind": "^1.1.2"
      },
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/humanize-ms": {
      "version": "1.2.1",
      "resolved": "https://registry.npmjs.org/humanize-ms/-/humanize-ms-1.2.1.tgz",
      "integrity": "sha512-Fl70vYtsAFb/C06PTS9dZBo7ihau+Tu/DNCk/OyHhea07S+aeMWpFFkUaXRa8fI+ScZbEI8dfSxwY7gxZ9SAVQ==",
      "license": "MIT",
      "dependencies": {
        "ms": "^2.0.0"
      }
    },
    "node_modules/math-intrinsics": {
      "version": "1.1.0",
      "resolved": "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz",
      "integrity": "sha512-/IXtbwEk5HTPyEwyKX6hGkYXxM9nbj64B+ilVJnC/R6B0pH5G4V3b0pVbL7DBj4tkhBAppbQUlf6F6Xl9LHu1g==",
      "license": "MIT",
      "engines": {
        "node": ">= 0.4"
      }
    },
    "node_modules/mime-db": {
      "version": "1.52.0",
      "resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.52.0.tgz",
      "integrity": "sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg==",
      "license": "MIT",
      "engines": {
        "node": ">= 0.6"
      }
    },
    "node_modules/mime-types": {
      "version": "2.1.35",
      "resolved": "https://registry.npmjs.org/mime-types/-/mime-types-2.1.35.tgz",
      "integrity": "sha512-ZDY+bPm5zTTF+YpCrAU9nK0UgICYPT0QtT1NZWFv4s++TNkcgVaT0g6+4R2uI4MjQjzysHB1zxuWL50hzaeXiw==",
      "license": "MIT",
      "dependencies": {
        "mime-db": "1.52.0"
      },
      "engines": {
        "node": ">= 0.6"
      }
    },
    "node_modules/ms": {
      "version": "2.1.3",
      "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
      "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==",
      "license": "MIT"
    },
    "node_modules/node-domexception": {
      "version": "1.0.0",
      "resolved": "https://registry.npmjs.org/node-domexception/-/node-domexception-1.0.0.tgz",
      "integrity": "sha512-/jKZoMpw0F8GRwl4/eLROPA3cfcXtLApP0QzLmUT/HuPCZWyB7IY9ZrMeKw2O/nFIqPQB3PVM9aYm0F312AXDQ==",
      "deprecated": "Use your platform's native DOMException instead",
      "funding": [
        {
          "type": "github",
          "url": "https://github.com/sponsors/jimmywarting"
        },
        {
          "type": "github",
          "url": "https://paypal.me/jimmywarting"
        }
      ],
      "license": "MIT",
      "engines": {
        "node": ">=10.5.0"
      }
    },
    "node_modules/node-fetch": {
      "version": "2.7.0",
      "resolved": "https://registry.npmjs.org/node-fetch/-/node-fetch-2.7.0.tgz",
      "integrity": "sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==",
      "license": "MIT",
      "dependencies": {
        "whatwg-url": "^5.0.0"
      },
      "engines": {
        "node": "4.x || >=6.0.0"
      },
      "peerDependencies": {
        "encoding": "^0.1.0"
      },
      "peerDependenciesMeta": {
        "encoding": {
          "optional": true
        }
      }
    },
    "node_modules/openai": {
      "version": "4.104.0",
      "resolved": "https://registry.npmjs.org/openai/-/openai-4.104.0.tgz",
      "integrity": "sha512-p99EFNsA/yX6UhVO93f5kJsDRLAg+CTA2RBqdHK4RtK8u5IJw32Hyb2dTGKbnnFmnuoBv5r7Z2CURI9sGZpSuA==",
      "license": "Apache-2.0",
      "dependencies": {
        "@types/node": "^18.11.18",
        "@types/node-fetch": "^2.6.4",
        "abort-controller": "^3.0.0",
        "agentkeepalive": "^4.2.1",
        "form-data-encoder": "1.7.2",
        "formdata-node": "^4.3.2",
        "node-fetch": "^2.6.7"
      },
      "bin": {
        "openai": "bin/cli"
      },
      "peerDependencies": {
        "ws": "^8.18.0",
        "zod": "^3.23.8"
      },
      "peerDependenciesMeta": {
        "ws": {
          "optional": true
        },
        "zod": {
          "optional": true
        }
      }
    },
    "node_modules/rom-control": {
      "version": "2.0.1",
      "resolved": "https://registry.npmjs.org/rom-control/-/rom-control-2.0.1.tgz",
      "integrity": "sha512-1Sek28UGWbsdOPiUbTxzqRFMCKDnv912vgsOd2OhdgM+wKvSCZdAZnLZgNjfeindBmC161Bu9uGCPvx9y6y/LA==",
      "license": "MIT",
      "dependencies": {
        "ws": "^8.14.2"
      },
      "engines": {
        "node": ">=16"
      }
    },
    "node_modules/tr46": {
      "version": "0.0.3",
      "resolved": "https://registry.npmjs.org/tr46/-/tr46-0.0.3.tgz",
      "integrity": "sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw==",
      "license": "MIT"
    },
    "node_modules/undici-types": {
      "version": "5.26.5",
      "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-5.26.5.tgz",
      "integrity": "sha512-JlCMO+ehdEIKqlFxk6IfVoAUVmgz7cU7zD/h9XZ0qzeosSHmUJVOzSQvvYSYWXkFXC+IfLKSIffhv0sVZup6pA==",
      "license": "MIT"
    },
    "node_modules/web-streams-polyfill": {
      "version": "4.0.0-beta.3",
      "resolved": "https://registry.npmjs.org/web-streams-polyfill/-/web-streams-polyfill-4.0.0-beta.3.tgz",
      "integrity": "sha512-QW95TCTaHmsYfHDybGMwO5IJIM93I/6vTRk+daHTWFPhwh+C8Cg7j7XyKrwrj8Ib6vYXe0ocYNrmzY4xAAN6ug==",
      "license": "MIT",
      "engines": {
        "node": ">= 14"
      }
    },
    "node_modules/webidl-conversions": {
      "version": "3.0.1",
      "resolved": "https://registry.npmjs.org/webidl-conversions/-/webidl-conversions-3.0.1.tgz",
      "integrity": "sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ==",
      "license": "BSD-2-Clause"
    },
    "node_modules/whatwg-url": {
      "version": "5.0.0",
      "resolved": "https://registry.npmjs.org/whatwg-url/-/whatwg-url-5.0.0.tgz",
      "integrity": "sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw==",
      "license": "MIT",
      "dependencies": {
        "tr46": "~0.0.3",
        "webidl-conversions": "^3.0.0"
      }
    },
    "node_modules/ws": {
      "version": "8.20.0",
      "resolved": "https://registry.npmjs.org/ws/-/ws-8.20.0.tgz",
      "integrity": "sha512-sAt8BhgNbzCtgGbt2OxmpuryO63ZoDk/sqaB/znQm94T4fCEsy/yV+7CdC1kJhOU9lboAEU7R3kquuycDoibVA==",
      "license": "MIT",
      "engines": {
        "node": ">=10.0.0"
      },
      "peerDependencies": {
        "bufferutil": "^4.0.1",
        "utf-8-validate": ">=5.0.2"
      },
      "peerDependenciesMeta": {
        "bufferutil": {
          "optional": true
        },
        "utf-8-validate": {
          "optional": true
        }
      }
    }
  }
}
14
package.json
Normal file
@@ -0,0 +1,14 @@
{
  "name": "jibo-llm",
  "version": "1.0.0",
  "description": "Hotword-triggered LLM conversation loop for Jibo",
  "main": "index.js",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "dotenv": "^16.4.5",
    "openai": "^4.73.0",
    "rom-control": "^2.0.1"
  }
}
569
tools.js
Normal file
@@ -0,0 +1,569 @@
|
||||
/**
|
||||
* Tool definitions and executor for the Jibo LLM agent.
|
||||
*
|
||||
* Each tool maps to a rom-control capability the LLM can invoke.
|
||||
*/
|
||||
|
||||
// ── OpenAI function-tool schemas ───────────────────────────────────────────────
|
||||
|
||||
const TOOL_SCHEMAS = [
|
||||
{
|
||||
type: 'function',
|
||||
function: {
|
||||
name: 'say',
|
||||
description:
|
||||
"Speak text aloud through Jibo's speaker. Plain text plus valid ESML tags only " +
|
||||
'(e.g. <anim cat="happy" nonBlocking="true"/>, <break size="0.3"/>). ' +
|
||||
'NEVER include markdown (no *italics*, **bold**, backticks), LaTeX ($...$), ' +
|
||||
'unmatched/closing tags like </es>, or other symbols Jibo cannot pronounce. ' +
|
||||
'Malformed input can hang the TTS engine. Keep each call under 200 chars.',
|
||||
parameters: {
|
||||
type: 'object',
|
||||
properties: {
|
||||
text: { type: 'string', description: 'Text (or ESML) to speak.' },
|
||||
},
|
||||
required: ['text'],
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
type: 'function',
|
||||
function: {
|
||||
name: 'listen',
|
||||
description:
|
||||
"Listen for the user's speech and return a transcript. " +
|
||||
'Call this after speaking if you want to continue the conversation.',
|
||||
parameters: {
|
||||
type: 'object',
|
||||
properties: {
|
||||
timeout: {
|
||||
type: 'number',
|
||||
description: 'Max seconds to wait. Default 15.',
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
type: 'function',
|
||||
function: {
|
||||
name: 'take_photo',
|
||||
description:
|
||||
"Take a photo with Jibo's camera. The image is returned so you can see what's in front of you.",
|
||||
parameters: {
|
||||
type: 'object',
|
||||
properties: {
|
||||
resolution: {
|
||||
type: 'string',
|
||||
enum: ['medium', 'low'],
|
||||
description: 'Default: medium.',
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
type: 'function',
|
||||
function: {
|
||||
name: 'show_text',
|
||||
description: "Display text on Jibo's screen.",
|
||||
parameters: {
|
||||
type: 'object',
|
||||
properties: {
|
||||
text: { type: 'string', description: 'Text to show.' },
|
||||
},
|
||||
required: ['text'],
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
type: 'function',
|
||||
function: {
|
||||
name: 'show_image',
|
||||
description: "Display an image on Jibo's screen from a URL.",
|
||||
parameters: {
|
||||
type: 'object',
|
||||
properties: {
|
||||
url: { type: 'string', description: 'Image URL.' },
|
||||
},
|
||||
required: ['url'],
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
type: 'function',
|
||||
function: {
|
||||
name: 'show_eye',
|
||||
description: "Reset Jibo's screen to the default eye animation.",
|
||||
parameters: { type: 'object', properties: {} },
|
||||
},
|
||||
},
|
||||
{
|
||||
type: 'function',
|
||||
function: {
|
||||
name: 'look_at_angle',
|
||||
description: "Turn Jibo's head. theta = yaw (±180°, positive right), psi = pitch (±30°, positive up).",
|
||||
parameters: {
|
||||
type: 'object',
|
||||
properties: {
|
||||
theta: { type: 'number', description: 'Yaw degrees.' },
|
||||
psi: { type: 'number', description: 'Pitch degrees.' },
|
||||
},
|
||||
required: ['theta', 'psi'],
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
type: 'function',
|
||||
function: {
|
||||
name: 'set_volume',
|
||||
description: "Set Jibo's speaker volume (0.0 – 1.0).",
|
||||
parameters: {
|
||||
type: 'object',
|
||||
        properties: {
          level: { type: 'number', description: 'Volume 0.0 to 1.0.' },
        },
        required: ['level'],
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'web_search',
      description:
        'Search the web via Brave Search. Use for current events, facts you are unsure of, ' +
        'or anything that may have changed since training. Returns titles, URLs, and snippets.',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'The search query.' },
          count: {
            type: 'number',
            description: 'How many results to return (1–10). Default 5.',
          },
          freshness: {
            type: 'string',
            enum: ['pd', 'pw', 'pm', 'py'],
            description:
              'Optional recency filter: pd=past day, pw=past week, pm=past month, py=past year.',
          },
        },
        required: ['query'],
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'fetch_url',
      description:
        'Fetch the contents of a web page by URL. Prefers markdown via content ' +
        'negotiation (Cloudflare Markdown for Agents) and falls back to HTML→text. ' +
        'Use after web_search to read a result, or to traverse linked pages.',
      parameters: {
        type: 'object',
        properties: {
          url: { type: 'string', description: 'Absolute http(s) URL to fetch.' },
          max_chars: {
            type: 'number',
            description: 'Truncate the body to this many characters. Default 4000.',
          },
        },
        required: ['url'],
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'end_conversation',
      description:
        'Call this when the conversation has reached a natural end and you do NOT want to ' +
        'listen for another reply. Pair it with a final "say" in the same turn for a farewell.',
      parameters: { type: 'object', properties: {} },
    },
  },
];
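// Note (usage sketch, not part of this module): TOOL_SCHEMAS is shaped for the
// `tools` array of an OpenAI-compatible /chat/completions request, e.g.
//   { model, messages, tools: TOOL_SCHEMAS, tool_choice: 'auto' }
// so any endpoint that speaks that dialect can drive the executor below.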

// ── Resolution map ─────────────────────────────────────────────────────────────

const RES_MAP = { high: 'highRes', medium: 'medRes', low: 'lowRes' };

// ── Screen text helpers ────────────────────────────────────────────────────────

/**
 * Word-wrap text for Jibo's small screen. Breaks oversized words, respects
 * existing newlines, and truncates with an ellipsis past `maxLines`.
 */
function wrapForScreen(text, width = 40, maxLines = 10) {
  const out = [];
  for (const para of String(text).split('\n')) {
    if (para === '') { out.push(''); continue; }
    let line = '';
    for (const word of para.split(/\s+/).filter(Boolean)) {
      if (word.length > width) {
        // Hard-break oversized words into width-sized chunks.
        if (line) { out.push(line); line = ''; }
        for (let i = 0; i < word.length; i += width) {
          const chunk = word.slice(i, i + width);
          if (chunk.length === width) out.push(chunk);
          else line = chunk;
        }
        continue;
      }
      const candidate = line ? `${line} ${word}` : word;
      if (candidate.length > width) {
        out.push(line);
        line = word;
      } else {
        line = candidate;
      }
    }
    if (line) out.push(line);
  }
  if (out.length > maxLines) {
    return out.slice(0, maxLines - 1).concat('…').join('\n');
  }
  return out.join('\n');
}
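// Example (illustrative): wrapForScreen('hello world test', 5)
//   → 'hello\nworld\ntest'
// and an oversized word is hard-broken: wrapForScreen('abcdefghij', 4)
//   → 'abcd\nefgh\nij'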

/**
 * Strip markup the Jibo TTS engine chokes on (markdown, LaTeX, unmatched
 * closing tags). Preserves valid ESML self-closing tags like <anim .../> and
 * <break .../>. Defense-in-depth against models that ignore the instructions.
 */
function sanitizeForTTS(text) {
  const ESML_TAGS = /^(anim|break|prosody|emph|phoneme|phrase|style|voice)\b/i;
  return text
    // Remove LaTeX inline math: $...$ and $$...$$
    .replace(/\${1,2}[^$]{0,200}\${1,2}/g, '')
    // Strip code fences and inline backticks
    .replace(/```[\s\S]*?```/g, '')
    .replace(/`+/g, '')
    // Strip markdown emphasis markers but keep the words
    .replace(/(\*\*|__)(.*?)\1/g, '$2')
    .replace(/(\*|_)(?=\S)(.+?)(?<=\S)\1/g, '$2')
    // Drop any tag that isn't a known ESML tag (e.g. </es>, <br>, etc.)
    .replace(/<\/?([a-zA-Z][^\s>/]*)\b[^>]*\/?>/g, (m, name) =>
      ESML_TAGS.test(name) ? m : '')
    // Collapse extra whitespace
    .replace(/[ \t]+/g, ' ')
    .trim();
}
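// Example (illustrative): sanitizeForTTS('**Hello** <anim name="happy"/> <br> $x^2$')
//   → 'Hello <anim name="happy"/>'
// (markdown emphasis and LaTeX stripped, the ESML tag kept, <br> dropped)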

// ── Abort helpers ──────────────────────────────────────────────────────────────

function throwIfAborted(signal) {
  if (signal?.aborted) {
    const err = new Error('Conversation aborted');
    err.code = 'CONVERSATION_ABORTED';
    throw err;
  }
}

function onAbort(signal) {
  if (!signal) return new Promise(() => { }); // never resolves
  return new Promise((_, reject) => {
    const handler = () => {
      const err = new Error('Conversation aborted');
      err.code = 'CONVERSATION_ABORTED';
      reject(err);
    };
    if (signal.aborted) return handler();
    signal.addEventListener('abort', handler, { once: true });
  });
}

// ── Tool executor ──────────────────────────────────────────────────────────────

/**
 * Execute a single tool call against the Jibo client.
 *
 * Returns { content, image?, endConversation? }.
 * - content — text string for the tool-result message
 * - image — optional base64 JPEG (only for take_photo)
 * - endConversation — true only for end_conversation
 *
 * @param {import('rom-control').Client} client
 * @param {string} name Tool function name
 * @param {object} args Parsed arguments
 * @param {AbortSignal} [signal] Cancellation signal
 * @param {object} [ctx] Per-conversation state (speech chain, last transcript)
 * @returns {Promise<{ content: string, image?: string, endConversation?: boolean }>}
 */
async function executeTool(client, name, args, signal, ctx) {
  throwIfAborted(signal);
  ctx = ctx || {};
  if (!ctx.speechChain) ctx.speechChain = Promise.resolve();
  switch (name) {
    // ── Communication ──────────────────────────────────────────────────────
    case 'say': {
      const text = sanitizeForTTS(String(args.text || ''));
      console.log(` [tool:say] "${text}" (queued)`);
      // Estimate ~80ms per char + 5s base, capped at 60s. Anything longer
      // is almost certainly Jibo's TTS hung on bad ESML/markup; we'd rather
      // log a warning and unblock the conversation than deadlock listen.
      const estimateMs = Math.min(60000, 5000 + text.length * 80);

      ctx.speechChain = ctx.speechChain
        .then(() => {
          const started = Date.now();
          console.log(` [tool:say] speaking… (timeout ${estimateMs}ms)`);
          let timer;
          const timeout = new Promise((resolve) => {
            timer = setTimeout(() => {
              console.warn(` [tool:say] timed out after ${estimateMs}ms — continuing.`);
              resolve();
            }, estimateMs);
          });
          return Promise.race([
            client.behavior.say(text, { signal }),
            onAbort(signal),
            timeout,
          ]).finally(() => {
            clearTimeout(timer);
            console.log(` [tool:say] done in ${Date.now() - started}ms`);
          });
        })
        .catch((err) => {
          if (err.code === 'CONVERSATION_ABORTED') return;
          console.error(' [tool:say] error:', err.message);
        });
      return { content: 'Speech queued — Jibo will speak it shortly. Continue with other tools; listen will wait for it.' };
    }

    case 'listen': {
      const ms = (args.timeout || 15) * 1000;
      // Make sure pending speech finishes before we open the mic, otherwise
      // Jibo will hear his own voice.
      console.log(' [tool:listen] awaiting pending speech…');
      await Promise.race([ctx.speechChain, onAbort(signal)]);
      throwIfAborted(signal);
      console.log(` [tool:listen] waiting ${ms}ms…`);
      client.display.showText('Listening...');
      try {
        const speech = await Promise.race([
          client.audio.awaitSpeech({ mode: 'local', time: ms }),
          onAbort(signal),
        ]);
        console.log(` [tool:listen] heard: "${speech.content}"`);
        ctx.lastHeard = speech.content;
        return { content: `User said: "${speech.content}"` };
      } catch (err) {
        if (err.code === 'CONVERSATION_ABORTED') throw err;
        if (err.code === 'SPEECH_TIMEOUT') {
          console.log(' [tool:listen] timed out');
          return { content: 'No speech detected — user did not respond.' };
        }
        throw err;
      } finally {
        client.display.showEye();
      }
    }

    // ── Camera ─────────────────────────────────────────────────────────────
    case 'take_photo': {
      const res = RES_MAP[args.resolution] || 'medRes';
      console.log(` [tool:take_photo] ${res}…`);
      const photo = await Promise.race([
        client.camera.takePhoto({ resolution: res, timeout: 30000 }),
        onAbort(signal),
      ]);
      const buf = await photo.fetchBuffer();
      console.log(` [tool:take_photo] ${buf.length} bytes captured`);
      return {
        content: "Photo captured from Jibo's camera.",
        image: buf.toString('base64'),
      };
    }

    // ── Display ────────────────────────────────────────────────────────────
    case 'show_text': {
      console.log(` [tool:show_text] "${args.text}"`);
      client.display.showText(wrapForScreen(args.text, 40, 10));
      return { content: 'Text displayed on screen.' };
    }

    case 'show_image': {
      console.log(` [tool:show_image] ${args.url}`);
      client.display.showImage(args.url);
      return { content: 'Image displayed on screen.' };
    }

    case 'show_eye': {
      console.log(' [tool:show_eye]');
      client.display.showEye();
      return { content: 'Eye animation restored on screen.' };
    }

    case 'look_at_angle': {
      console.log(` [tool:look_at_angle] θ=${args.theta}° ψ=${args.psi}°`);
      await client.behavior.lookAtAngle(args.theta, args.psi);
      return { content: `Now looking at θ=${args.theta}°, ψ=${args.psi}°.` };
    }

    case 'set_volume': {
      console.log(` [tool:set_volume] ${args.level}`);
      await client.audio.setVolume(args.level);
      return { content: `Volume set to ${args.level}.` };
    }

    // ── Web search ─────────────────────────────────────────────────────────
    case 'web_search': {
      const apiKey = process.env.BRAVE_API_KEY;
      if (!apiKey) {
        return {
          content:
            'web_search is unavailable: BRAVE_API_KEY environment variable is not set.',
        };
      }
      const query = String(args.query || '').trim();
      if (!query) {
        return { content: 'web_search error: query is required.' };
      }
      const count = Math.max(1, Math.min(10, Number(args.count) || 5));
      const params = new URLSearchParams({
        q: query,
        count: String(count),
        extra_snippets: 'true',
        safesearch: 'moderate',
      });
      if (args.freshness) params.set('freshness', String(args.freshness));

      console.log(` [tool:web_search] "${query}" (count=${count})`);
      const url = `https://api.search.brave.com/res/v1/web/search?${params.toString()}`;
      const ac = new AbortController();
      const onAbortHandler = () => ac.abort();
      signal?.addEventListener('abort', onAbortHandler, { once: true });
      try {
        const res = await fetch(url, {
          headers: {
            Accept: 'application/json',
            'Accept-Encoding': 'gzip',
            'X-Subscription-Token': apiKey,
          },
          signal: ac.signal,
        });
        if (!res.ok) {
          const body = await res.text().catch(() => '');
          return {
            content: `web_search error: ${res.status} ${res.statusText}. ${body.slice(0, 200)}`,
          };
        }
        const data = await res.json();
        const results = data?.web?.results || [];
        if (results.length === 0) {
          return { content: `No web results found for "${query}".` };
        }
        const lines = results.slice(0, count).map((r, i) => {
          const title = r.title || '(untitled)';
          const u = r.url || '';
          const desc = (r.description || '').replace(/\s+/g, ' ').trim();
          const extras = Array.isArray(r.extra_snippets)
            ? r.extra_snippets.slice(0, 2).map((s) => s.replace(/\s+/g, ' ').trim())
            : [];
          const tail = extras.length ? `\n   • ${extras.join('\n   • ')}` : '';
          return `${i + 1}. ${title}\n   ${u}\n   ${desc}${tail}`;
        });
        return {
          content: `Web results for "${query}":\n\n${lines.join('\n\n')}`,
        };
      } catch (err) {
        if (err.name === 'AbortError') throw Object.assign(new Error('Conversation aborted'), { code: 'CONVERSATION_ABORTED' });
        return { content: `web_search error: ${err.message}` };
      } finally {
        signal?.removeEventListener('abort', onAbortHandler);
      }
    }

    case 'fetch_url': {
      const target = String(args.url || '').trim();
      if (!/^https?:\/\//i.test(target)) {
        return { content: 'fetch_url error: url must be an absolute http(s) URL.' };
      }
      const maxChars = Math.max(200, Math.min(20000, Number(args.max_chars) || 4000));
      console.log(` [tool:fetch_url] ${target}`);

      const ac = new AbortController();
      const onAbortHandler = () => ac.abort();
      signal?.addEventListener('abort', onAbortHandler, { once: true });
      const timeoutId = setTimeout(() => ac.abort(), 20000);
      try {
        const res = await fetch(target, {
          headers: {
            // Prefer markdown (Cloudflare Markdown for Agents); accept HTML/text fallback.
            Accept: 'text/markdown, text/plain;q=0.9, text/html;q=0.8, */*;q=0.1',
            'Accept-Encoding': 'gzip',
            'User-Agent': 'jibo-llm/1.0 (+agent)',
          },
          redirect: 'follow',
          signal: ac.signal,
        });
        if (!res.ok) {
          return {
            content: `fetch_url error: ${res.status} ${res.statusText} from ${target}`,
          };
        }
        const ctype = (res.headers.get('content-type') || '').toLowerCase();
        if (!/^(text\/|application\/(json|xml|xhtml))/.test(ctype) && ctype) {
          return {
            content: `fetch_url: refusing non-text content (${ctype}) from ${target}`,
          };
        }
        let body = await res.text();
        const isMarkdown = ctype.includes('markdown');
        const isHtml = ctype.includes('html') || /<html[\s>]/i.test(body.slice(0, 500));

        if (!isMarkdown && isHtml) {
          // Lightweight HTML→text: strip scripts/styles/tags, decode common
          // entities, collapse whitespace.
          body = body
            .replace(/<script[\s\S]*?<\/script>/gi, ' ')
            .replace(/<style[\s\S]*?<\/style>/gi, ' ')
            .replace(/<noscript[\s\S]*?<\/noscript>/gi, ' ')
            .replace(/<!--[\s\S]*?-->/g, ' ')
            .replace(/<\/(p|div|li|h[1-6]|br|tr)>/gi, '\n')
            .replace(/<[^>]+>/g, ' ')
            .replace(/&nbsp;/g, ' ')
            .replace(/&amp;/g, '&')
            .replace(/&lt;/g, '<')
            .replace(/&gt;/g, '>')
            .replace(/&quot;/g, '"')
            .replace(/&#39;/g, "'")
            .replace(/[ \t]+/g, ' ')
            .replace(/\n{3,}/g, '\n\n')
            .trim();
        }

        const truncated = body.length > maxChars;
        const out = truncated ? body.slice(0, maxChars) + '\n…[truncated]' : body;
        const finalUrl = res.url || target;
        const fmt = isMarkdown ? 'markdown' : isHtml ? 'html→text' : 'text';
        return {
          content: `Fetched ${finalUrl} (${fmt}, ${body.length} chars${truncated ? `, truncated to ${maxChars}` : ''}):\n\n${out}`,
        };
      } catch (err) {
        if (err.name === 'AbortError') {
          if (signal?.aborted) {
            throw Object.assign(new Error('Conversation aborted'), { code: 'CONVERSATION_ABORTED' });
          }
          return { content: `fetch_url error: timeout fetching ${target}` };
        }
        return { content: `fetch_url error: ${err.message}` };
      } finally {
        clearTimeout(timeoutId);
        signal?.removeEventListener('abort', onAbortHandler);
      }
    }

    case 'end_conversation': {
      console.log(' [tool:end_conversation] awaiting pending speech…');
      await Promise.race([ctx.speechChain, onAbort(signal)]);
      return { content: 'Conversation ended.', endConversation: true };
    }

    default:
      return { content: `Unknown tool "${name}".` };
  }
}

module.exports = { TOOL_SCHEMAS, executeTool, wrapForScreen };