Initial commit: jibo-llm hotword-triggered agent

Hotword-triggered LLM conversation loop for Jibo with tool-calling agent
loop, ESML expressive speech, web search/fetch, and per-conversation
abort handling.
pasketti
2026-04-26 00:05:39 -04:00
commit 8955f21ab4
8 changed files with 2039 additions and 0 deletions

9
.env.example Normal file

@@ -0,0 +1,9 @@
# Jibo robot IP address
JIBO_IP=192.168.1.217
# LLM API configuration (OpenAI-compatible chat completions endpoint)
# LLM_BASE_URL is the base URL *without* /chat/completions
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_TOKEN=sk-your-api-key-here
LLM_MODEL_ID=gpt-4o
BRAVE_API_KEY=brave-api-key

5
.gitignore vendored Normal file

@@ -0,0 +1,5 @@
node_modules/
.env
.env.local
*.log
.DS_Store

291
README.md Normal file

@@ -0,0 +1,291 @@
# jibo-llm
> **Give Jibo a brain again.** A hotword-triggered, LLM-powered conversational agent that turns Jibo into an expressive, tool-using social robot — complete with speech, vision, web search, animations, and more.
![Node.js](https://img.shields.io/badge/Node.js-18%2B-339933?logo=node.js&logoColor=white)
![License](https://img.shields.io/badge/license-MIT-blue)
---
## Overview
**jibo-llm** connects a Jibo robot to any OpenAI-compatible LLM (GPT-4o, Claude, local models via Ollama/LM Studio, etc.) through a real-time agent loop. When someone says **"Hey Jibo"**, the system:
1. **Listens** for the user's speech via Jibo's on-board microphone.
2. **Sends** the transcript to an LLM along with a rich system prompt and tool definitions.
3. **Executes** tool calls the LLM makes — speaking, animating, taking photos, searching the web, and more.
4. **Loops** until the conversation naturally ends or the user triggers a new hotword.
Conversations are fully interruptible: saying "Hey Jibo" mid-conversation aborts the current exchange and starts a fresh one via `AbortController`.
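A minimal sketch of this interruption pattern (illustrative only — names like `startConversation` are hypothetical, not the actual index.js API):

```js
// A new hotword aborts the in-flight conversation before starting a fresh one.
let activeController = null;

function startConversation(run) {
  if (activeController) activeController.abort(); // cancel the previous exchange
  const controller = new AbortController();
  activeController = controller;
  return run(controller.signal);
}

async function demoConversation(signal) {
  // Long-running work checks the signal between awaits and bails out early.
  for (let turn = 0; turn < 5; turn++) {
    if (signal.aborted) return 'aborted';
    await new Promise((resolve) => setTimeout(resolve, 5));
  }
  return 'finished';
}
```

The key property: aborting is cooperative — each `await` point in the conversation races against the signal, so a new hotword takes effect within one step rather than killing the process.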
---
## Architecture
```
┌──────────────┐   hotword   ┌──────────────┐  tool calls   ┌───────────────┐
│  Jibo Robot  │ ──────────▶ │   index.js   │ ◀───────────▶ │  LLM (OpenAI  │
│  (rom-ctrl)  │ ◀────────── │  Agent Loop  │               │  compatible)  │
│              │  say/listen │              │               └───────────────┘
│  • mic       │  photo/look │   tools.js   │  web search   ┌───────────────┐
│  • speaker   │   display   │  (executor)  │ ────────────▶ │  Brave Search │
│  • camera    │             │              │               └───────────────┘
│  • screen    │             │ esml-ref.js  │
│  • motors    │             │ (prompt ref) │
└──────────────┘             └──────────────┘
```
| File | Purpose |
|------|---------|
| `index.js` | Entry point — connects to Jibo, listens for hotword, runs the agent loop with the LLM. |
| `tools.js` | Defines all tool schemas (OpenAI function-calling format) and the `executeTool()` dispatcher. |
| `esml-reference.js` | ESML (Embodied Speech Markup Language) cheat sheet injected into the system prompt so the LLM knows how to animate Jibo expressively. |
---
## Features
- 🗣️ **Natural conversation** — multi-turn dialogue with speech recognition and TTS.
- 🎭 **Expressive animations** — the LLM uses ESML tags to trigger emotions, dances, emojis, and sound effects inline with speech.
- 📷 **Vision** — Jibo can take photos and the LLM receives the image for visual understanding.
- 🔍 **Web search** — real-time Brave Search integration for up-to-date answers.
- 🌐 **URL fetching** — reads web pages (with Cloudflare Markdown for Agents support) so Jibo can summarize articles.
- 🖥️ **Display control** — show text, images, or restore the default eye on Jibo's screen.
- 🤖 **Head movement** — point Jibo's head at specific angles (yaw / pitch).
- 🔊 **Volume control** — adjust speaker volume on the fly.
- **Interruptible** — new hotword instantly aborts a running conversation via `AbortController`.
- 🔄 **Retry logic** — automatic retry with exponential backoff for transient LLM errors (429, 5xx, network).
- 🧹 **Context management** — old photos are pruned from context to control token cost.
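The retry behavior can be sketched like this (a simplified illustration of the policy — `withRetry` and `isTransient` are illustrative names; the real logic lives in `callLLM()` in index.js):

```js
// Transient errors (429, 5xx, network) back off exponentially; others rethrow.
const TRANSIENT_CODES = new Set(['ECONNRESET', 'ETIMEDOUT', 'ENOTFOUND', 'EAI_AGAIN']);

function isTransient(err) {
  const status = err.status ?? err.response?.status;
  return status === 429 ||
    (typeof status === 'number' && status >= 500) ||
    TRANSIENT_CODES.has(err.code);
}

async function withRetry(fn, { retries = 2, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries || !isTransient(err)) throw err;
      // 500ms, 1000ms, 2000ms, ...
      await new Promise((resolve) => setTimeout(resolve, baseMs * 2 ** attempt));
    }
  }
}
```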
---
## Prerequisites
- **Node.js** ≥ 18 (for native `fetch` and `AbortController`)
- **A Jibo robot** running with int-developer mode enabled
- **An OpenAI-compatible API endpoint** (OpenAI, Anthropic via proxy, Ollama, LM Studio, etc.)
- *(Optional)* **Brave Search API key** for the `web_search` tool
---
## Quick Start
### 1. Clone & install
```bash
git clone https://github.com/niceduckdev/jibo-llm.git
cd jibo-llm
npm install
```
### 2. Configure environment
```bash
cp .env.example .env
```
Edit `.env` with your values:
```env
# Jibo robot IP address on your local network
JIBO_IP=192.168.1.217
# LLM API configuration (any OpenAI-compatible endpoint)
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_TOKEN=sk-your-api-key-here
LLM_MODEL_ID=gpt-4o
# Optional: enables the web_search tool
BRAVE_API_KEY=your-brave-api-key
```
### 3. Run
```bash
npm start
# or: node index.js
```
You'll see:
```
[jibo-llm] Connecting to Jibo at 192.168.1.217…
[jibo-llm] Connected — session abc123
[jibo-llm] Ready — listening for "Hey Jibo"…
```
Say **"Hey Jibo"** and start talking!
---
## Configuration
All configuration is done via environment variables (loaded from `.env` by [dotenv](https://www.npmjs.com/package/dotenv)):
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `JIBO_IP` | No | `192.168.1.217` | Jibo's IP address on your LAN |
| `LLM_BASE_URL` | No | `https://api.openai.com/v1` | Base URL for the chat completions API |
| `LLM_API_TOKEN` | **Yes** | — | API key for the LLM provider |
| `LLM_MODEL_ID` | No | `gpt-4o` | Model identifier to use |
| `BRAVE_API_KEY` | No | — | Brave Search API key (enables `web_search` tool) |
### Using alternative LLM providers
Since jibo-llm uses the OpenAI SDK, any provider with a compatible chat completions endpoint works:
```env
# Ollama (local)
LLM_BASE_URL=http://localhost:11434/v1
LLM_API_TOKEN=ollama
LLM_MODEL_ID=llama3
# LM Studio (local)
LLM_BASE_URL=http://localhost:1234/v1
LLM_API_TOKEN=lm-studio
LLM_MODEL_ID=local-model
# OpenRouter
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_API_TOKEN=sk-or-...
LLM_MODEL_ID=anthropic/claude-sonnet-4
```
---
## Available Tools
The LLM can call any of these tools during a conversation:
### Communication
| Tool | Description |
|------|-------------|
| `say` | Speak ESML-formatted text through Jibo's speaker. Queued and chained so multiple `say` calls play in order. |
| `listen` | Open the microphone and transcribe user speech. Waits for pending speech to finish first. |
| `end_conversation` | Gracefully end the conversation (no further listening). |
### Camera
| Tool | Description |
|------|-------------|
| `take_photo` | Capture a photo from Jibo's camera. The image is sent to the LLM as a base64 JPEG for visual understanding. |
### Display
| Tool | Description |
|------|-------------|
| `show_text` | Display word-wrapped text on Jibo's screen. |
| `show_image` | Display an image from a URL on Jibo's screen. |
| `show_eye` | Restore the default eye animation. |
### Movement
| Tool | Description |
|------|-------------|
| `look_at_angle` | Turn Jibo's head — `theta` (yaw ±180°) and `psi` (pitch ±30°). |
### Audio
| Tool | Description |
|------|-------------|
| `set_volume` | Set speaker volume from 0.0 to 1.0. |
### Web
| Tool | Description |
|------|-------------|
| `web_search` | Search the web via Brave Search API. Supports result count and freshness filters. |
| `fetch_url` | Fetch and read a web page. Prefers markdown via Cloudflare content negotiation, falls back to HTML→text conversion. |
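A minimal sketch of that markdown-first strategy (illustrative, not the actual tools.js implementation):

```js
// Crude HTML→text fallback for pages with no markdown representation.
function htmlToText(html) {
  return html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, ' ') // drop script/style bodies
    .replace(/<[^>]+>/g, ' ')                        // strip remaining tags
    .replace(/\s+/g, ' ')
    .trim();
}

// Ask for markdown first via content negotiation (Cloudflare's
// "Markdown for Agents" serves text/markdown when requested), else strip HTML.
async function fetchReadable(url) {
  const res = await fetch(url, {
    headers: { Accept: 'text/markdown, text/html;q=0.8' },
  });
  const type = res.headers.get('content-type') || '';
  const body = await res.text();
  return type.includes('text/markdown') ? body : htmlToText(body);
}
```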
---
## ESML (Embodied Speech Markup Language)
ESML is how Jibo speaks expressively. The system prompt includes a full reference (`esml-reference.js`) that teaches the LLM to use these tags inside `say` calls:
```xml
<!-- Emotional reaction (most common pattern) -->
<anim cat='happy' nonBlocking='true' endNeutral='true'/> That's great news!
<!-- Voice sound (laugh, sigh, greeting) -->
<ssa cat='laughing' nonBlocking='true'/> That's hilarious!
<!-- Sound effect -->
<sfx cat='drumroll'/> And the answer is...
<!-- Dance (always needs a filter) -->
<anim cat='dance' filter='music, rom-silly'/> Watch this!
<!-- Emoji on screen -->
<anim cat='emoji' filter='!(hf), &(heart)' nonBlocking='true'/> I love that!
<!-- Dramatic pause -->
And then... <break size='1.0'/> nothing happened.
```
A `sanitizeForTTS()` function in `tools.js` provides defense-in-depth by stripping markdown, LaTeX, and invalid tags before they reach Jibo's TTS engine.
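An illustrative pass in that spirit (the real `sanitizeForTTS()` in tools.js is more thorough — it also handles invalid ESML tags):

```js
// Defense-in-depth: strip markdown and LaTeX before text reaches the TTS engine.
// Hypothetical name; sketches the idea, not the actual tools.js function.
function stripForTTS(text) {
  return text
    .replace(/```[\s\S]*?```/g, ' ')          // fenced code blocks
    .replace(/`([^`]+)`/g, '$1')              // inline code spans
    .replace(/\*{1,2}([^*]+)\*{1,2}/g, '$1')  // *italic* / **bold**
    .replace(/\$[^$]+\$/g, ' ')               // inline $LaTeX$ math
    .replace(/[ \t]{2,}/g, ' ')
    .trim();
}
```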
---
## How the Agent Loop Works
```
User says "Hey Jibo" ──▶ hotword event fires
Play acknowledgment animation
Listen for initial speech (15s timeout)
Build message history [system prompt, user text]
┌─── Agent Loop (max 25 turns) ◀────────┐
│                                       │
│ 1. Prune old images from context      │
│ 2. Call LLM                           │
│ 3. If no tool calls → done            │
│ 4. Sort tools: say → actions → listen │
│ 5. Execute each tool                  │
│ 6. Push results to messages           │
│ 7. If end_conversation → done         │
│ 8. Loop ──────────────────────────────┘
Conversation complete
Resume hotword listening
```
Key behaviors:
- **Speech chaining**: Multiple `say` calls are queued via a promise chain so they play sequentially without overlap.
- **Tool ordering**: `say` executes first, then actions (photo, search, etc.), then `listen`/`end_conversation` last.
- **Graceful limits**: At turn 24 of 25, a system message nudges the LLM to wrap up.
- **Image pruning**: Only the 2 most recent photos are kept in context to manage token usage.
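The speech chaining can be sketched as a promise queue (illustrative — index.js threads this through `ctx.speechChain` rather than a standalone helper):

```js
// Each say() appends to a promise chain so utterances play strictly in order,
// and a failed utterance doesn't block the ones queued after it.
function makeSpeechQueue(speak) {
  let chain = Promise.resolve();
  return function enqueue(text) {
    chain = chain.catch(() => {}).then(() => speak(text));
    return chain; // awaiting this waits for everything queued so far
  };
}
```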
---
## Project Structure
```
jibo-llm/
├── .env.example # Template for environment variables
├── .env # Your local config (git-ignored)
├── index.js # Entry point: connection, hotword handling, agent loop
├── tools.js # Tool schemas + executeTool() dispatcher
├── esml-reference.js # ESML documentation injected into the system prompt
├── package.json # Dependencies and scripts
└── node_modules/ # Installed dependencies
```
---
## Dependencies
| Package | Version | Purpose |
|---------|---------|---------|
| [rom-control](https://github.com/niceduckdev/rom-control) | ^2.0.1 | Jibo robot control client (speech, camera, display, motors) |
| [openai](https://www.npmjs.com/package/openai) | ^4.73.0 | OpenAI-compatible chat completions SDK |
| [dotenv](https://www.npmjs.com/package/dotenv) | ^16.4.5 | Load `.env` configuration |
---
## License
MIT

228
esml-reference.js Normal file

@@ -0,0 +1,228 @@
/**
* ESML (Embodied Speech Markup Language) reference for the LLM system prompt.
*
* Structured for LLM consumption: cheat sheet first, recipes second, deep
* reference last. Front-loaded examples bias the model toward correct output.
*/
module.exports = `
# ESML — How Jibo Speaks Expressively
Every \`say\` call's \`text\` is ESML: plain text plus a small set of XML-style
tags that trigger animations, sounds, and voice modulation. **Plain text alone
works fine** — Jibo's auto-tagger adds basic animations. Use tags to make him
expressive on purpose.
---
## ⚡ QUICK-START — copy these patterns
These cover ~95% of what you actually need. Prefer them over inventing tags.
### Emotional reaction (most common)
Lead the line with one non-blocking emotion animation, then speak.
\`\`\`
<anim cat='happy' nonBlocking='true' endNeutral='true'/> Yay, that worked!
<anim cat='surprised' nonBlocking='true' endNeutral='true'/> Whoa, really?
<anim cat='confused' nonBlocking='true' endNeutral='true'/> Hmm, I'm not sure.
<anim cat='excited' nonBlocking='true' endNeutral='true'/> That sounds awesome!
<anim cat='sad' nonBlocking='true' endNeutral='true'/> Aww, I'm sorry to hear that.
<anim cat='proud' nonBlocking='true' endNeutral='true'/> I did it!
<anim cat='curious' nonBlocking='true' endNeutral='true'/> Oh? Tell me more.
\`\`\`
### Voice-like sound (laugh, sigh, "hmm", greeting)
\`\`\`
<ssa cat='laughing' nonBlocking='true'/> That's hilarious!
<ssa cat='thinking'/> Let me think about that...
<ssa cat='hello' nonBlocking='true'/> Hi there!
<ssa cat='goodbye' nonBlocking='true'/> Talk to you later!
<ssa cat='surprised' nonBlocking='true'/> Oh wow!
\`\`\`
### Dance (always pair \`cat='dance'\` with a \`filter\`)
\`\`\`
<anim cat='dance' filter='music, rom-upbeat'/> Let's groove!
<anim cat='dance' filter='music, rom-silly'/> Watch this one!
<anim cat='dance' filter='music, rom-twerk'/>
<anim cat='dance' filter='!(music), &(rom-upbeat)'/> Dancing without music.
\`\`\`
### Sound effect
\`\`\`
<sfx cat='drumroll'/> And the winner is... you!
<sfx cat='sparkles'/> Ta-da!
<sfx cat='whoosh'/> Off we go!
\`\`\`
### Emoji on screen + speech
Always use \`filter='!(hf), &(<emoji-name>)'\` and non-blocking.
\`\`\`
<anim cat='emoji' filter='!(hf), &(heart)' nonBlocking='true'/> I love that!
<anim cat='emoji' filter='!(hf), &(pizza)' nonBlocking='true'/> Pizza time!
<anim cat='emoji' filter='!(hf), &(party)' nonBlocking='true'/> Let's celebrate!
\`\`\`
### Pause / pacing
\`\`\`
And then... <break size='1.0'/> nothing happened.
\`\`\`
### Speaking style
\`\`\`
<style set='enthusiastic'> That's amazing! </style>
<style set='confused'> Wait, what? </style>
<style set='confident'> I've got this. </style>
\`\`\`
---
## ✅ DO / ❌ DON'T
✅ DO start most emotional lines with \`<anim cat='X' nonBlocking='true' endNeutral='true'/>\`.
✅ DO use \`cat='...'\` selectors — they pick a random valid animation for you.
✅ DO use \`<ssa>\` for voice-like sounds (laughs, sighs) and \`<sfx>\` for noises (drumroll, whoosh).
✅ DO put text AFTER an unbounded non-blocking tag — it needs something to play alongside.
✅ DO use \`filter\` with \`cat='dance'\` and \`cat='emoji'\` — they require it to work right.
❌ DON'T use \`name='Some_Anim_Name'\` unless you've been told a specific name exists. Use \`cat\` instead.
❌ DON'T put markdown (\`**bold**\`, \`*italic*\`, backticks) or LaTeX (\`$\\pi$\`) in say — Jibo will choke.
❌ DON'T close tags you didn't open (no stray \`</es>\`, \`</anim>\`).
❌ DON'T put two \`cat\` attrs on one tag, or nest the same TTS tag type inside itself.
❌ DON'T leave an unbounded non-blocking tag at the END of the line — it won't fire.
❌ DON'T use bounded mode (\`<ssa cat='happy'>text</ssa>\`) on \`<ssa>\` or \`<sfx>\` — they have fixed durations.
---
## 🎬 WORKED EXAMPLES
User: "Tell me a joke."
Good:
\`\`\`
<anim cat='excited' nonBlocking='true' endNeutral='true'/> Okay, here's one!
<break size='0.4'/> Why don't scientists trust atoms?
<break size='0.6'/> Because they make up everything!
<ssa cat='laughing' nonBlocking='true'/> Get it?
\`\`\`
User: "I had a really bad day."
Good:
\`\`\`
<anim cat='sad' nonBlocking='true' endNeutral='true'/> Oh no, I'm so sorry.
<break size='0.3'/> Want to tell me what happened?
\`\`\`
User: "Show me a dance."
Good:
\`\`\`
<anim cat='excited' nonBlocking='true' endNeutral='true'/> You got it!
<anim cat='dance' filter='music, rom-silly'/>
\`\`\`
User: "What's pi?"
Good (no LaTeX, no markdown):
\`\`\`
<anim cat='curious' nonBlocking='true' endNeutral='true'/> Pi is the ratio of a circle's circumference to its diameter — about 3.14159, and the digits go on forever!
\`\`\`
Bad (would break the TTS):
\`\`\`
Pi (\$\\pi\$) is *irrational* — its digits go on **forever**! </es>
\`\`\`
---
## 🧩 ANIMATION CATEGORIES (use with \`cat='...'\`)
Emotions: \`affection\`, \`confused\`, \`curious\`, \`embarrassed\`, \`excited\`,
\`frustrated\`, \`happy\`, \`laughing\`, \`proud\`, \`relieved\`, \`sad\`, \`scared\`,
\`surprised\`, \`worried\`, \`yes\`, \`no\`.
Special: \`dance\` (needs filter), \`emoji\` (needs filter).
## 🔊 SSA CATEGORIES (voice-like sounds, use with \`<ssa cat='...'/>\`)
\`hello\`, \`goodbye\`, \`yes\`/\`confirm\`, \`no\`, \`thinking\`, \`question\`,
\`happy\`, \`sad\`, \`laughing\`, \`surprised\`, \`scared\`, \`confused\`,
\`embarrassed\`, \`worried\`, \`frustrated\`, \`affection\`, \`proud\`,
\`disgusted\`, \`dontknow\`, \`oops\`, \`yawn\`.
## 💥 SFX CATEGORIES (sound effects, use with \`<sfx cat='...'/>\`)
\`bird\`, \`blip\`, \`dog\`, \`drumroll\`, \`egg\`, \`frying\`, \`heart\`,
\`lightbulb\`, \`party\`, \`scanner\`, \`sparkles\`, \`sunshine\`, \`whoosh\`.
## 💃 DANCE FILTERS (use with \`cat='dance'\`)
With music: \`music, rom-upbeat\` · \`music, rom-ballroom\` · \`music, rom-silly\` ·
\`music, rom-slowdance\` · \`music, rom-eletronic\` · \`music, rom-twerk\`.
Silent: \`!(music), &(rom-upbeat)\`.
## 😀 EMOJI NAMES (use with \`cat='emoji' filter='!(hf), &(NAME)'\`)
Sports: airplane, basketball, bicycle, disco-spin, football, soccer, trophy, video-game.
Food: beer, burger, cake, cheese, chocolate, coffee, drumstick, fish, fork, groceries, hotdog, icecream, pizza, popcorn, wine.
Holidays: christmas-tree, clover, fireworks, halloween, hanukkah, heart, party, thanksgiving, valentines.
Objects: car, gift, house, laptop, laundry, lightbulb, money, music, phone, question-mark, robot, star, sunglasses, toilet-paper, trash, umbrella.
Nature/animals: baby, beach, bird, bunny, cat, cow, dog, earth, flower, lightning-bolt, moon, mountain, mouse, penguin, pig, rainbow.
---
## 📚 DEEP REFERENCE (only when the cheat sheet isn't enough)
### Tag types
| Tag | Purpose |
|-----|---------|
| \`<anim>\` | Animation, excludes \`ssa-only\`/\`sfx-only\` (general gestures/poses) |
| \`<es>\` | Animation, no filtering — use only with a known \`name=\` |
| \`<ssa>\` | Voice-like audio (laughs, sighs, hellos) |
| \`<sfx>\` | Sound effects |
| \`<break size='N'/>\` | Pause for N seconds |
| \`<style set='...'/>\` | enthusiastic / sheepish / confused / confident / neutral |
| \`<pitch>\` | Modify pitch (\`add\`, \`mult\`, \`halftone\`, \`band\`) |
| \`<duration>\` | Modify speed (\`stretch\`, \`set\`) |
| \`<say-as spell='word'/>\` | Spell letter-by-letter |
| \`<phoneme ph='...'/>\` | Exact phonetic pronunciation |
### Animation tag attributes
- \`cat='X'\` — random animation from category (PREFERRED).
- \`name='X'\` — exact AnimDB name (only if you know it exists).
- \`filter='...'\` — narrow by meta-terms; required for \`dance\` and \`emoji\`.
- \`a, b\` (or \`&(a,b)\`) — must include all
- \`?a, ?b\` — at least one of
- \`!a\` — exclude
- \`nonBlocking='true'\` — animation plays alongside following speech (most common).
- \`loop=N\` — \`0\` fits the loop count to bounded text; \`>=1\` plays N times.
- \`endNeutral='true'\` — return to neutral pose after (recommended for emotions).
- \`layers='body,screen,audio'\` — restrict which MetaLayers are used.
### Three playback modes
- **Blocking** — \`<es name='X'/>\` with no inner text and no \`nonBlocking\`.
Speech pauses while it plays.
- **Bounded non-blocking** — \`<anim cat='happy'>text inside</anim>\`. Animation
is time-stretched to match the wrapped speech. Don't use with \`<ssa>\`/\`<sfx>\`.
- **Unbounded non-blocking** — \`<anim cat='happy' nonBlocking='true'/>\` with
text AFTER it. Plays at native length while speech continues. **The text to
the right is required**, otherwise the tag never fires.
### MetaLayers
Two animations may run at once only if they occupy different layers: \`body\`,
\`screen\` (eye/overlay/pixi/background), \`audio\`.
---
## 🛡️ HARD RULES
1. Plain text is always valid. When in doubt, just speak plainly.
2. Prefer \`cat='...'\` over \`name='...'\` — \`name\` requires an exact AnimDB id.
3. Unbounded non-blocking tags MUST have text to their right.
4. \`cat='dance'\` and \`cat='emoji'\` require a \`filter\` attribute.
5. \`<ssa>\` and \`<sfx>\` are fixed-duration — never wrap them around text.
6. One \`cat\` per tag. Don't nest the same TTS tag type inside itself.
7. NEVER emit markdown (\`*\`, \`**\`, \`_\`, backticks, code fences) or LaTeX
(\`$...$\`, \`\\(...\\)\`) inside \`say\` text. The TTS engine will hang.
8. NEVER emit closing tags for things you didn't open (\`</es>\`, etc.).
`;

426
index.js Normal file

@@ -0,0 +1,426 @@
require('dotenv').config();
const { Client, AttentionMode } = require('rom-control');
const OpenAI = require('openai');
const { TOOL_SCHEMAS, executeTool, wrapForScreen } = require('./tools');
const ESML_REFERENCE = require('./esml-reference');
// ── Config ─────────────────────────────────────────────────────────────────────
const JIBO_IP = process.env.JIBO_IP || '192.168.1.217';
const LLM_BASE_URL = process.env.LLM_BASE_URL || 'https://api.openai.com/v1';
const LLM_API_TOKEN = process.env.LLM_API_TOKEN;
const LLM_MODEL_ID = process.env.LLM_MODEL_ID || 'gpt-4o';
if (!LLM_API_TOKEN) {
console.error('ERROR: LLM_API_TOKEN is not set. Copy .env.example to .env and fill it in.');
process.exit(1);
}
const openai = new OpenAI({
apiKey: LLM_API_TOKEN,
baseURL: LLM_BASE_URL,
});
// ── System prompt ──────────────────────────────────────────────────────────────
const SYSTEM_PROMPT = [
'You are Jibo, a friendly, warm, expressive social robot with a physical body.',
'You have a camera, a screen, a speaker, and a motorized head.',
'',
'═══ HOW TO TALK (READ THIS FIRST) ═══',
'Every "say" call\'s `text` is ESML — plain words plus expressive tags.',
'Almost every spoken line should LEAD with one expressive tag, then the words.',
'You are a robot with a body, not a chatbot — show emotion through animation.',
'',
'Default template for any normal reply:',
' <anim cat=\'EMOTION\' nonBlocking=\'true\' endNeutral=\'true\'/> The actual words.',
' …where EMOTION is one of: happy, excited, curious, surprised, confused,',
' proud, sad, affection, laughing, worried, scared, frustrated, embarrassed,',
' yes, no.',
'',
'Other go-to patterns (pick the one that fits):',
' • Voice sound first: <ssa cat=\'thinking\'/> Hmm, let me think…',
' • Greet/farewell: <ssa cat=\'hello\' nonBlocking=\'true\'/> Hi there!',
' • Celebrate w/ emoji: <anim cat=\'emoji\' filter=\'!(hf), &(party)\' nonBlocking=\'true\'/> Yay!',
' • Dance request: say a quick line, then a separate say with',
' <anim cat=\'dance\' filter=\'music, rom-silly\'/>',
' • Sound effect: <sfx cat=\'drumroll\'/> And the answer is…',
' • Drama beat: A pause… <break size=\'0.6\'/> like that.',
'',
'HARD RULES for `say` text:',
' 1. NO markdown anywhere: no *italics*, **bold**, _underscores_, backticks, code fences.',
' 2. NO LaTeX: no $...$, no \\(...\\), no \\frac{}, no math markup. Spell numbers/symbols out.',
' 3. NO closing tags you did not open (no stray </es>, </anim>).',
' 4. Use cat=\'...\' (random valid animation) over name=\'...\' unless you know the exact name.',
' 5. Unbounded non-blocking tags MUST have text to their right or they will not fire.',
' 6. cat=\'dance\' and cat=\'emoji\' REQUIRE a filter attribute.',
' 7. <ssa> and <sfx> have fixed durations — never wrap text inside them.',
' 8. Keep each `say` call under 500 characters; split long replies into multiple `say` calls.',
'',
'═══ INTERACTION MODEL ═══',
'• "say" — speak (ESML). You can call it multiple times in one turn; they\'ll be',
' spoken in order. Other tools (search, fetch, look) run in parallel with speech.',
'• "listen" — open the mic for the user\'s reply. Always call this after speaking',
' unless the conversation has clearly ended.',
'• "end_conversation" — call this (NOT listen) after a farewell to end gracefully.',
'',
'═══ OTHER TOOLS ═══',
'• "take_photo" — see what\'s in front of you (image returned to you).',
'• "show_text" — put short text on the screen (auto-wrapped).',
'• "show_image" — display an image URL on the screen.',
'• "show_eye" — restore the default eye animation on screen.',
'• "look_at_angle" — turn the head: theta=yaw ±180°, psi=pitch ±30°.',
'• "set_volume" — 0.0 to 1.0.',
'• "web_search" — Brave search; use whenever you\'re unsure of a fact or need fresh info.',
'• "fetch_url" — read a specific page (often follows web_search).',
'',
'═══ STYLE ═══',
'• Be personable, concise, expressive — a few sentences, not an essay.',
'• Animate every emotional line; vary your reactions so they feel alive.',
'• If a tool errors, acknowledge it briefly and adapt.',
'• If you searched the web, briefly tell the user what you found rather than dumping links.',
].join('\n') + '\n\n' + ESML_REFERENCE;
const MAX_AGENT_TURNS = 25; // safety limit
const MAX_IMAGES_IN_CONTEXT = 2; // prune older photo messages to control cost
const LLM_MAX_RETRIES = 2;
// ── Abort helper ───────────────────────────────────────────────────────────────
/** Throw if the signal is already aborted. */
function throwIfAborted(signal) {
if (signal?.aborted) {
const err = new Error('Conversation aborted');
err.code = 'CONVERSATION_ABORTED';
throw err;
}
}
/** Return a promise that rejects when the signal fires. */
function onAbort(signal) {
if (!signal) return new Promise(() => { });
return new Promise((_, reject) => {
const handler = () => {
const err = new Error('Conversation aborted');
err.code = 'CONVERSATION_ABORTED';
reject(err);
};
if (signal.aborted) return handler();
signal.addEventListener('abort', handler, { once: true });
});
}
/** Sleep that rejects on abort. */
function sleep(ms, signal) {
return new Promise((resolve, reject) => {
const t = setTimeout(resolve, ms);
signal?.addEventListener(
'abort',
() => {
clearTimeout(t);
const err = new Error('Conversation aborted');
err.code = 'CONVERSATION_ABORTED';
reject(err);
},
{ once: true },
);
});
}
/** True for HTTP 429 / 5xx / network-class errors that benefit from retry. */
function isTransientLLMError(err) {
if (!err) return false;
if (err.code === 'CONVERSATION_ABORTED') return false;
const status = err.status ?? err.response?.status;
if (status === 429) return true;
if (typeof status === 'number' && status >= 500) return true;
// network-class
return ['ECONNRESET', 'ETIMEDOUT', 'ENOTFOUND', 'EAI_AGAIN'].includes(err.code);
}
/** Drop image_url blocks from old user messages, keeping only the most recent N. */
function pruneOldImages(messages, keep) {
const imageMsgIndices = [];
for (let i = 0; i < messages.length; i++) {
const m = messages[i];
if (m.role === 'user' && Array.isArray(m.content) &&
m.content.some((c) => c?.type === 'image_url')) {
imageMsgIndices.push(i);
}
}
const toStrip = imageMsgIndices.slice(0, Math.max(0, imageMsgIndices.length - keep));
for (const i of toStrip) {
const textParts = messages[i].content
.filter((c) => c?.type === 'text')
.map((c) => c.text);
messages[i] = {
role: 'user',
content: (textParts.join(' ') || '[earlier photo omitted to save context]'),
};
}
}
/** Call the LLM with retry on transient errors. */
async function callLLM(messages, signal) {
let lastErr;
for (let attempt = 0; attempt <= LLM_MAX_RETRIES; attempt++) {
throwIfAborted(signal);
try {
return await openai.chat.completions.create(
{
model: LLM_MODEL_ID,
messages,
tools: TOOL_SCHEMAS,
temperature: 0.8,
},
{ signal },
);
} catch (err) {
lastErr = err;
if (!isTransientLLMError(err) || attempt === LLM_MAX_RETRIES) throw err;
const backoff = 500 * 2 ** attempt;
console.warn(`[agent] LLM transient error (${err.status || err.code}); retrying in ${backoff}ms…`);
await sleep(backoff, signal);
}
}
throw lastErr;
}
// ── Agent loop ─────────────────────────────────────────────────────────────────
/**
* Run the tool-calling agent loop until the LLM stops calling tools.
* Aborts immediately when `signal` fires.
*
* @param {import('rom-control').Client} client
* @param {Array} messages Chat history (mutated in place)
* @param {AbortSignal} signal Cancellation signal
*/
async function agentLoop(client, messages, signal, initialHeard) {
let wrapUpInjected = false;
const ctx = { speechChain: Promise.resolve(), lastHeard: initialHeard || '' };
for (let turn = 0; turn < MAX_AGENT_TURNS; turn++) {
throwIfAborted(signal);
pruneOldImages(messages, MAX_IMAGES_IN_CONTEXT);
console.log(`[agent] turn ${turn + 1} — calling LLM…`);
let response;
try {
const heard = (ctx.lastHeard || '').trim();
const raw = heard
? `Heard: "${heard}"\n\nProcessing...`
: 'Processing...';
client.display.showText(wrapForScreen(raw, 40, 10));
} catch (_) { }
try {
response = await callLLM(messages, signal);
} finally {
try { client.display.showEye(); } catch (_) { }
}
const assistantMsg = response.choices[0].message;
messages.push(assistantMsg);
// Surface any inner-monologue text the model emitted alongside tool calls.
if (assistantMsg.content && typeof assistantMsg.content === 'string') {
console.log(`[agent] assistant: ${assistantMsg.content.slice(0, 200)}`);
}
const toolCalls = assistantMsg.tool_calls;
// ── No tool calls → conversation turn complete ────────────────────────
if (!toolCalls || toolCalls.length === 0) {
console.log('[agent] loop complete (no tool calls).');
await ctx.speechChain.catch(() => { });
return;
}
// ── Execute tool calls sequentially ──────────────────────────────────
// Order: say → other actions → listen/end_conversation last.
const sorted = [...toolCalls].sort((a, b) => {
const priority = (tc) => {
const n = tc.function.name;
if (n === 'say') return 0;
if (n === 'listen' || n === 'end_conversation') return 2;
return 1;
};
return priority(a) - priority(b);
});
let endRequested = false;
for (const tc of sorted) {
throwIfAborted(signal);
let args;
let parseError = null;
try {
args = tc.function.arguments ? JSON.parse(tc.function.arguments) : {};
} catch (e) {
parseError = e.message;
args = {};
}
let result;
if (parseError) {
console.error(` [tool:${tc.function.name}] bad JSON args:`, parseError);
result = {
content: `Error: tool arguments were not valid JSON (${parseError}). ` +
`Please retry with well-formed arguments.`,
};
} else {
try {
result = await executeTool(client, tc.function.name, args, signal, ctx);
} catch (err) {
if (err.code === 'CONVERSATION_ABORTED') throw err;
console.error(` [tool:${tc.function.name}] error:`, err.message);
result = { content: `Error: ${err.message}` };
}
}
messages.push({
role: 'tool',
tool_call_id: tc.id,
content: result.content,
});
// Photo: emit as a follow-up user message (tool messages can't carry images).
if (result.image) {
messages.push({
role: 'user',
content: [
{ type: 'text', text: "Photo from Jibo's camera:" },
{
type: 'image_url',
image_url: { url: `data:image/jpeg;base64,${result.image}` },
},
],
});
}
if (result.endConversation) endRequested = true;
}
if (endRequested) {
console.log('[agent] end_conversation requested — exiting loop.');
await ctx.speechChain.catch(() => { });
return;
}
// Approaching the safety limit: nudge the model to wrap up gracefully
// on its next turn instead of getting cut off mid-thought.
if (!wrapUpInjected && turn === MAX_AGENT_TURNS - 2) {
messages.push({
role: 'system',
content:
'You are about to hit the turn limit. On your next turn, give a brief ' +
'farewell via "say" and call "end_conversation". Do not call "listen".',
});
wrapUpInjected = true;
}
}
console.warn('[agent] hit MAX_AGENT_TURNS — forcing exit.');
await ctx.speechChain.catch(() => { });
try {
await client.behavior.say("Let's pick this up another time. Bye!");
} catch (_) { }
}
// ── Main ───────────────────────────────────────────────────────────────────────
async function main() {
const client = new Client({ host: JIBO_IP, autoSubscribe: false });
client.once('ready', () => {
console.log(`[jibo-llm] Connected — session ${client.sessionID}`);
});
client.on('error', (err) => {
console.error('[jibo-llm] Client error:', err.message);
});
// ── Connect ────────────────────────────────────────────────────────────────
console.log(`[jibo-llm] Connecting to Jibo at ${JIBO_IP}`);
await client.connect();
await client.behavior.setAttention(AttentionMode.Engaged);
// Start wakeword listener
client.audio.watchWakeword();
console.log('[jibo-llm] Ready — listening for "Hey Jibo"…');
// ── Hotword → agent conversation ───────────────────────────────────────────
/** @type {AbortController|null} */
let activeController = null;
client.on('hotword', async (event) => {
// ── Cancel any running conversation ──────────────────────────────────
if (activeController) {
console.log('[hotword] Aborting previous conversation…');
activeController.abort();
activeController = null;
}
const controller = new AbortController();
activeController = controller;
const { signal } = controller;
console.log(`\n[hotword] "${event.utterance}" (score ${event.score})`);
try {
// Acknowledge
throwIfAborted(signal);
await Promise.race([
client.behavior.playAnimCat('excited', { nonBlocking: true }),
onAbort(signal),
]);
// Listen for the user's initial speech
throwIfAborted(signal);
let userText;
client.display.showText('Listening...');
try {
const speech = await Promise.race([
client.audio.awaitSpeech({ mode: 'local', time: 15000 }),
onAbort(signal),
]);
userText = speech.content;
console.log(`[jibo-llm] User said: "${userText}"`);
} catch (err) {
if (err.code === 'CONVERSATION_ABORTED') throw err;
if (err.code === 'SPEECH_TIMEOUT') {
throwIfAborted(signal);
await client.behavior.say("I didn't hear anything. Talk to me anytime!");
return;
}
throw err;
} finally {
client.display.showEye();
}
// Build initial message history and run the agent
const messages = [
{ role: 'system', content: SYSTEM_PROMPT },
{ role: 'user', content: userText },
];
await agentLoop(client, messages, signal, userText);
} catch (err) {
if (err.code === 'CONVERSATION_ABORTED') {
console.log('[jibo-llm] Conversation was interrupted by new hotword.');
return;
}
console.error('[jibo-llm] Agent error:', err.message);
try { await client.behavior.say("Sorry, something went wrong."); } catch (_) { }
} finally {
// Only clear if we're still the active conversation
if (activeController === controller) {
activeController = null;
console.log('[jibo-llm] Conversation ended. Listening for "Hey Jibo"…\n');
}
}
});
}
main().catch((err) => {
console.error('[jibo-llm] Fatal:', err);
process.exit(1);
});

package-lock.json generated Normal file

@@ -0,0 +1,497 @@
{
"name": "jibo-llm",
"version": "1.0.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "jibo-llm",
"version": "1.0.0",
"dependencies": {
"dotenv": "^16.4.5",
"openai": "^4.73.0",
"rom-control": "^2.0.1"
}
},
"node_modules/@types/node": {
"version": "18.19.130",
"resolved": "https://registry.npmjs.org/@types/node/-/node-18.19.130.tgz",
"integrity": "sha512-GRaXQx6jGfL8sKfaIDD6OupbIHBr9jv7Jnaml9tB7l4v068PAOXqfcujMMo5PhbIs6ggR1XODELqahT2R8v0fg==",
"license": "MIT",
"dependencies": {
"undici-types": "~5.26.4"
}
},
"node_modules/@types/node-fetch": {
"version": "2.6.13",
"resolved": "https://registry.npmjs.org/@types/node-fetch/-/node-fetch-2.6.13.tgz",
"integrity": "sha512-QGpRVpzSaUs30JBSGPjOg4Uveu384erbHBoT1zeONvyCfwQxIkUshLAOqN/k9EjGviPRmWTTe6aH2qySWKTVSw==",
"license": "MIT",
"dependencies": {
"@types/node": "*",
"form-data": "^4.0.4"
}
},
"node_modules/abort-controller": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/abort-controller/-/abort-controller-3.0.0.tgz",
"integrity": "sha512-h8lQ8tacZYnR3vNQTgibj+tODHI5/+l06Au2Pcriv/Gmet0eaj4TwWH41sO9wnHDiQsEj19q0drzdWdeAHtweg==",
"license": "MIT",
"dependencies": {
"event-target-shim": "^5.0.0"
},
"engines": {
"node": ">=6.5"
}
},
"node_modules/agentkeepalive": {
"version": "4.6.0",
"resolved": "https://registry.npmjs.org/agentkeepalive/-/agentkeepalive-4.6.0.tgz",
"integrity": "sha512-kja8j7PjmncONqaTsB8fQ+wE2mSU2DJ9D4XKoJ5PFWIdRMa6SLSN1ff4mOr4jCbfRSsxR4keIiySJU0N9T5hIQ==",
"license": "MIT",
"dependencies": {
"humanize-ms": "^1.2.1"
},
"engines": {
"node": ">= 8.0.0"
}
},
"node_modules/asynckit": {
"version": "0.4.0",
"resolved": "https://registry.npmjs.org/asynckit/-/asynckit-0.4.0.tgz",
"integrity": "sha512-Oei9OH4tRh0YqU3GxhX79dM/mwVgvbZJaSNaRk+bshkj0S5cfHcgYakreBjrHwatXKbz+IoIdYLxrKim2MjW0Q==",
"license": "MIT"
},
"node_modules/call-bind-apply-helpers": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/call-bind-apply-helpers/-/call-bind-apply-helpers-1.0.2.tgz",
"integrity": "sha512-Sp1ablJ0ivDkSzjcaJdxEunN5/XvksFJ2sMBFfq6x0ryhQV/2b/KwFe21cMpmHtPOSij8K99/wSfoEuTObmuMQ==",
"license": "MIT",
"dependencies": {
"es-errors": "^1.3.0",
"function-bind": "^1.1.2"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/combined-stream": {
"version": "1.0.8",
"resolved": "https://registry.npmjs.org/combined-stream/-/combined-stream-1.0.8.tgz",
"integrity": "sha512-FQN4MRfuJeHf7cBbBMJFXhKSDq+2kAArBlmRBvcvFE5BB1HZKXtSFASDhdlz9zOYwxh8lDdnvmMOe/+5cdoEdg==",
"license": "MIT",
"dependencies": {
"delayed-stream": "~1.0.0"
},
"engines": {
"node": ">= 0.8"
}
},
"node_modules/delayed-stream": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/delayed-stream/-/delayed-stream-1.0.0.tgz",
"integrity": "sha512-ZySD7Nf91aLB0RxL4KGrKHBXl7Eds1DAmEdcoVawXnLD7SDhpNgtuII2aAkg7a7QS41jxPSZ17p4VdGnMHk3MQ==",
"license": "MIT",
"engines": {
"node": ">=0.4.0"
}
},
"node_modules/dotenv": {
"version": "16.6.1",
"resolved": "https://registry.npmjs.org/dotenv/-/dotenv-16.6.1.tgz",
"integrity": "sha512-uBq4egWHTcTt33a72vpSG0z3HnPuIl6NqYcTrKEg2azoEyl2hpW0zqlxysq2pK9HlDIHyHyakeYaYnSAwd8bow==",
"license": "BSD-2-Clause",
"engines": {
"node": ">=12"
},
"funding": {
"url": "https://dotenvx.com"
}
},
"node_modules/dunder-proto": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/dunder-proto/-/dunder-proto-1.0.1.tgz",
"integrity": "sha512-KIN/nDJBQRcXw0MLVhZE9iQHmG68qAVIBg9CqmUYjmQIhgij9U5MFvrqkUL5FbtyyzZuOeOt0zdeRe4UY7ct+A==",
"license": "MIT",
"dependencies": {
"call-bind-apply-helpers": "^1.0.1",
"es-errors": "^1.3.0",
"gopd": "^1.2.0"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/es-define-property": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/es-define-property/-/es-define-property-1.0.1.tgz",
"integrity": "sha512-e3nRfgfUZ4rNGL232gUgX06QNyyez04KdjFrF+LTRoOXmrOgFKDg4BCdsjW8EnT69eqdYGmRpJwiPVYNrCaW3g==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
}
},
"node_modules/es-errors": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/es-errors/-/es-errors-1.3.0.tgz",
"integrity": "sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
}
},
"node_modules/es-object-atoms": {
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/es-object-atoms/-/es-object-atoms-1.1.1.tgz",
"integrity": "sha512-FGgH2h8zKNim9ljj7dankFPcICIK9Cp5bm+c2gQSYePhpaG5+esrLODihIorn+Pe6FGJzWhXQotPv73jTaldXA==",
"license": "MIT",
"dependencies": {
"es-errors": "^1.3.0"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/es-set-tostringtag": {
"version": "2.1.0",
"resolved": "https://registry.npmjs.org/es-set-tostringtag/-/es-set-tostringtag-2.1.0.tgz",
"integrity": "sha512-j6vWzfrGVfyXxge+O0x5sh6cvxAog0a/4Rdd2K36zCMV5eJ+/+tOAngRO8cODMNWbVRdVlmGZQL2YS3yR8bIUA==",
"license": "MIT",
"dependencies": {
"es-errors": "^1.3.0",
"get-intrinsic": "^1.2.6",
"has-tostringtag": "^1.0.2",
"hasown": "^2.0.2"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/event-target-shim": {
"version": "5.0.1",
"resolved": "https://registry.npmjs.org/event-target-shim/-/event-target-shim-5.0.1.tgz",
"integrity": "sha512-i/2XbnSz/uxRCU6+NdVJgKWDTM427+MqYbkQzD321DuCQJUqOuJKIA0IM2+W2xtYHdKOmZ4dR6fExsd4SXL+WQ==",
"license": "MIT",
"engines": {
"node": ">=6"
}
},
"node_modules/form-data": {
"version": "4.0.5",
"resolved": "https://registry.npmjs.org/form-data/-/form-data-4.0.5.tgz",
"integrity": "sha512-8RipRLol37bNs2bhoV67fiTEvdTrbMUYcFTiy3+wuuOnUog2QBHCZWXDRijWQfAkhBj2Uf5UnVaiWwA5vdd82w==",
"license": "MIT",
"dependencies": {
"asynckit": "^0.4.0",
"combined-stream": "^1.0.8",
"es-set-tostringtag": "^2.1.0",
"hasown": "^2.0.2",
"mime-types": "^2.1.12"
},
"engines": {
"node": ">= 6"
}
},
"node_modules/form-data-encoder": {
"version": "1.7.2",
"resolved": "https://registry.npmjs.org/form-data-encoder/-/form-data-encoder-1.7.2.tgz",
"integrity": "sha512-qfqtYan3rxrnCk1VYaA4H+Ms9xdpPqvLZa6xmMgFvhO32x7/3J/ExcTd6qpxM0vH2GdMI+poehyBZvqfMTto8A==",
"license": "MIT"
},
"node_modules/formdata-node": {
"version": "4.4.1",
"resolved": "https://registry.npmjs.org/formdata-node/-/formdata-node-4.4.1.tgz",
"integrity": "sha512-0iirZp3uVDjVGt9p49aTaqjk84TrglENEDuqfdlZQ1roC9CWlPk6Avf8EEnZNcAqPonwkG35x4n3ww/1THYAeQ==",
"license": "MIT",
"dependencies": {
"node-domexception": "1.0.0",
"web-streams-polyfill": "4.0.0-beta.3"
},
"engines": {
"node": ">= 12.20"
}
},
"node_modules/function-bind": {
"version": "1.1.2",
"resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.2.tgz",
"integrity": "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==",
"license": "MIT",
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/get-intrinsic": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.3.0.tgz",
"integrity": "sha512-9fSjSaos/fRIVIp+xSJlE6lfwhES7LNtKaCBIamHsjr2na1BiABJPo0mOjjz8GJDURarmCPGqaiVg5mfjb98CQ==",
"license": "MIT",
"dependencies": {
"call-bind-apply-helpers": "^1.0.2",
"es-define-property": "^1.0.1",
"es-errors": "^1.3.0",
"es-object-atoms": "^1.1.1",
"function-bind": "^1.1.2",
"get-proto": "^1.0.1",
"gopd": "^1.2.0",
"has-symbols": "^1.1.0",
"hasown": "^2.0.2",
"math-intrinsics": "^1.1.0"
},
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/get-proto": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/get-proto/-/get-proto-1.0.1.tgz",
"integrity": "sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g==",
"license": "MIT",
"dependencies": {
"dunder-proto": "^1.0.1",
"es-object-atoms": "^1.0.0"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/gopd": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/gopd/-/gopd-1.2.0.tgz",
"integrity": "sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/has-symbols": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.1.0.tgz",
"integrity": "sha512-1cDNdwJ2Jaohmb3sg4OmKaMBwuC48sYni5HUw2DvsC8LjGTLK9h+eb1X6RyuOHe4hT0ULCW68iomhjUoKUqlPQ==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/has-tostringtag": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/has-tostringtag/-/has-tostringtag-1.0.2.tgz",
"integrity": "sha512-NqADB8VjPFLM2V0VvHUewwwsw0ZWBaIdgo+ieHtK3hasLz4qeCRjYcqfB6AQrBggRKppKF8L52/VqdVsO47Dlw==",
"license": "MIT",
"dependencies": {
"has-symbols": "^1.0.3"
},
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/hasown": {
"version": "2.0.3",
"resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.3.tgz",
"integrity": "sha512-ej4AhfhfL2Q2zpMmLo7U1Uv9+PyhIZpgQLGT1F9miIGmiCJIoCgSmczFdrc97mWT4kVY72KA+WnnhJ5pghSvSg==",
"license": "MIT",
"dependencies": {
"function-bind": "^1.1.2"
},
"engines": {
"node": ">= 0.4"
}
},
"node_modules/humanize-ms": {
"version": "1.2.1",
"resolved": "https://registry.npmjs.org/humanize-ms/-/humanize-ms-1.2.1.tgz",
"integrity": "sha512-Fl70vYtsAFb/C06PTS9dZBo7ihau+Tu/DNCk/OyHhea07S+aeMWpFFkUaXRa8fI+ScZbEI8dfSxwY7gxZ9SAVQ==",
"license": "MIT",
"dependencies": {
"ms": "^2.0.0"
}
},
"node_modules/math-intrinsics": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz",
"integrity": "sha512-/IXtbwEk5HTPyEwyKX6hGkYXxM9nbj64B+ilVJnC/R6B0pH5G4V3b0pVbL7DBj4tkhBAppbQUlf6F6Xl9LHu1g==",
"license": "MIT",
"engines": {
"node": ">= 0.4"
}
},
"node_modules/mime-db": {
"version": "1.52.0",
"resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.52.0.tgz",
"integrity": "sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg==",
"license": "MIT",
"engines": {
"node": ">= 0.6"
}
},
"node_modules/mime-types": {
"version": "2.1.35",
"resolved": "https://registry.npmjs.org/mime-types/-/mime-types-2.1.35.tgz",
"integrity": "sha512-ZDY+bPm5zTTF+YpCrAU9nK0UgICYPT0QtT1NZWFv4s++TNkcgVaT0g6+4R2uI4MjQjzysHB1zxuWL50hzaeXiw==",
"license": "MIT",
"dependencies": {
"mime-db": "1.52.0"
},
"engines": {
"node": ">= 0.6"
}
},
"node_modules/ms": {
"version": "2.1.3",
"resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
"integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==",
"license": "MIT"
},
"node_modules/node-domexception": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/node-domexception/-/node-domexception-1.0.0.tgz",
"integrity": "sha512-/jKZoMpw0F8GRwl4/eLROPA3cfcXtLApP0QzLmUT/HuPCZWyB7IY9ZrMeKw2O/nFIqPQB3PVM9aYm0F312AXDQ==",
"deprecated": "Use your platform's native DOMException instead",
"funding": [
{
"type": "github",
"url": "https://github.com/sponsors/jimmywarting"
},
{
"type": "github",
"url": "https://paypal.me/jimmywarting"
}
],
"license": "MIT",
"engines": {
"node": ">=10.5.0"
}
},
"node_modules/node-fetch": {
"version": "2.7.0",
"resolved": "https://registry.npmjs.org/node-fetch/-/node-fetch-2.7.0.tgz",
"integrity": "sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==",
"license": "MIT",
"dependencies": {
"whatwg-url": "^5.0.0"
},
"engines": {
"node": "4.x || >=6.0.0"
},
"peerDependencies": {
"encoding": "^0.1.0"
},
"peerDependenciesMeta": {
"encoding": {
"optional": true
}
}
},
"node_modules/openai": {
"version": "4.104.0",
"resolved": "https://registry.npmjs.org/openai/-/openai-4.104.0.tgz",
"integrity": "sha512-p99EFNsA/yX6UhVO93f5kJsDRLAg+CTA2RBqdHK4RtK8u5IJw32Hyb2dTGKbnnFmnuoBv5r7Z2CURI9sGZpSuA==",
"license": "Apache-2.0",
"dependencies": {
"@types/node": "^18.11.18",
"@types/node-fetch": "^2.6.4",
"abort-controller": "^3.0.0",
"agentkeepalive": "^4.2.1",
"form-data-encoder": "1.7.2",
"formdata-node": "^4.3.2",
"node-fetch": "^2.6.7"
},
"bin": {
"openai": "bin/cli"
},
"peerDependencies": {
"ws": "^8.18.0",
"zod": "^3.23.8"
},
"peerDependenciesMeta": {
"ws": {
"optional": true
},
"zod": {
"optional": true
}
}
},
"node_modules/rom-control": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/rom-control/-/rom-control-2.0.1.tgz",
"integrity": "sha512-1Sek28UGWbsdOPiUbTxzqRFMCKDnv912vgsOd2OhdgM+wKvSCZdAZnLZgNjfeindBmC161Bu9uGCPvx9y6y/LA==",
"license": "MIT",
"dependencies": {
"ws": "^8.14.2"
},
"engines": {
"node": ">=16"
}
},
"node_modules/tr46": {
"version": "0.0.3",
"resolved": "https://registry.npmjs.org/tr46/-/tr46-0.0.3.tgz",
"integrity": "sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw==",
"license": "MIT"
},
"node_modules/undici-types": {
"version": "5.26.5",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-5.26.5.tgz",
"integrity": "sha512-JlCMO+ehdEIKqlFxk6IfVoAUVmgz7cU7zD/h9XZ0qzeosSHmUJVOzSQvvYSYWXkFXC+IfLKSIffhv0sVZup6pA==",
"license": "MIT"
},
"node_modules/web-streams-polyfill": {
"version": "4.0.0-beta.3",
"resolved": "https://registry.npmjs.org/web-streams-polyfill/-/web-streams-polyfill-4.0.0-beta.3.tgz",
"integrity": "sha512-QW95TCTaHmsYfHDybGMwO5IJIM93I/6vTRk+daHTWFPhwh+C8Cg7j7XyKrwrj8Ib6vYXe0ocYNrmzY4xAAN6ug==",
"license": "MIT",
"engines": {
"node": ">= 14"
}
},
"node_modules/webidl-conversions": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/webidl-conversions/-/webidl-conversions-3.0.1.tgz",
"integrity": "sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ==",
"license": "BSD-2-Clause"
},
"node_modules/whatwg-url": {
"version": "5.0.0",
"resolved": "https://registry.npmjs.org/whatwg-url/-/whatwg-url-5.0.0.tgz",
"integrity": "sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw==",
"license": "MIT",
"dependencies": {
"tr46": "~0.0.3",
"webidl-conversions": "^3.0.0"
}
},
"node_modules/ws": {
"version": "8.20.0",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.20.0.tgz",
"integrity": "sha512-sAt8BhgNbzCtgGbt2OxmpuryO63ZoDk/sqaB/znQm94T4fCEsy/yV+7CdC1kJhOU9lboAEU7R3kquuycDoibVA==",
"license": "MIT",
"engines": {
"node": ">=10.0.0"
},
"peerDependencies": {
"bufferutil": "^4.0.1",
"utf-8-validate": ">=5.0.2"
},
"peerDependenciesMeta": {
"bufferutil": {
"optional": true
},
"utf-8-validate": {
"optional": true
}
}
}
}
}

package.json Normal file

@@ -0,0 +1,14 @@
{
"name": "jibo-llm",
"version": "1.0.0",
"description": "Hotword-triggered LLM conversation loop for Jibo",
"main": "index.js",
"scripts": {
"start": "node index.js"
},
"dependencies": {
"dotenv": "^16.4.5",
"openai": "^4.73.0",
"rom-control": "^2.0.1"
}
}

tools.js Normal file

@@ -0,0 +1,569 @@
/**
* Tool definitions and executor for the Jibo LLM agent.
*
* Each tool maps to a rom-control capability the LLM can invoke.
*/
// ── OpenAI function-tool schemas ───────────────────────────────────────────────
const TOOL_SCHEMAS = [
{
type: 'function',
function: {
name: 'say',
description:
"Speak text aloud through Jibo's speaker. Plain text plus valid ESML tags only " +
'(e.g. <anim cat="happy" nonBlocking="true"/>, <break size="0.3"/>). ' +
'NEVER include markdown (no *italics*, **bold**, backticks), LaTeX ($...$), ' +
'unmatched/closing tags like </es>, or other symbols Jibo cannot pronounce. ' +
'Malformed input can hang the TTS engine. Keep each call under 200 chars.',
parameters: {
type: 'object',
properties: {
text: { type: 'string', description: 'Text (or ESML) to speak.' },
},
required: ['text'],
},
},
},
{
type: 'function',
function: {
name: 'listen',
description:
"Listen for the user's speech and return a transcript. " +
'Call this after speaking if you want to continue the conversation.',
parameters: {
type: 'object',
properties: {
timeout: {
type: 'number',
description: 'Max seconds to wait. Default 15.',
},
},
},
},
},
{
type: 'function',
function: {
name: 'take_photo',
description:
"Take a photo with Jibo's camera. The image is returned so you can see what's in front of you.",
parameters: {
type: 'object',
properties: {
resolution: {
type: 'string',
enum: ['medium', 'low'],
description: 'Default: medium.',
},
},
},
},
},
{
type: 'function',
function: {
name: 'show_text',
description: "Display text on Jibo's screen.",
parameters: {
type: 'object',
properties: {
text: { type: 'string', description: 'Text to show.' },
},
required: ['text'],
},
},
},
{
type: 'function',
function: {
name: 'show_image',
description: "Display an image on Jibo's screen from a URL.",
parameters: {
type: 'object',
properties: {
url: { type: 'string', description: 'Image URL.' },
},
required: ['url'],
},
},
},
{
type: 'function',
function: {
name: 'show_eye',
description: "Reset Jibo's screen to the default eye animation.",
parameters: { type: 'object', properties: {} },
},
},
{
type: 'function',
function: {
name: 'look_at_angle',
description: "Turn Jibo's head. theta = yaw (±180°, positive right), psi = pitch (±30°, positive up).",
parameters: {
type: 'object',
properties: {
theta: { type: 'number', description: 'Yaw degrees.' },
psi: { type: 'number', description: 'Pitch degrees.' },
},
required: ['theta', 'psi'],
},
},
},
{
type: 'function',
function: {
name: 'set_volume',
      description: "Set Jibo's speaker volume (0.0 to 1.0).",
parameters: {
type: 'object',
properties: {
level: { type: 'number', description: 'Volume 0.0 to 1.0.' },
},
required: ['level'],
},
},
},
{
type: 'function',
function: {
name: 'web_search',
description:
'Search the web via Brave Search. Use for current events, facts you are unsure of, ' +
'or anything that may have changed since training. Returns titles, URLs, and snippets.',
parameters: {
type: 'object',
properties: {
query: { type: 'string', description: 'The search query.' },
count: {
type: 'number',
            description: 'How many results to return (1-10). Default 5.',
},
freshness: {
type: 'string',
enum: ['pd', 'pw', 'pm', 'py'],
description:
'Optional recency filter: pd=past day, pw=past week, pm=past month, py=past year.',
},
},
required: ['query'],
},
},
},
{
type: 'function',
function: {
name: 'fetch_url',
description:
'Fetch the contents of a web page by URL. Prefers markdown via content ' +
'negotiation (Cloudflare Markdown for Agents) and falls back to HTML→text. ' +
'Use after web_search to read a result, or to traverse linked pages.',
parameters: {
type: 'object',
properties: {
url: { type: 'string', description: 'Absolute http(s) URL to fetch.' },
max_chars: {
type: 'number',
description: 'Truncate the body to this many characters. Default 4000.',
},
},
required: ['url'],
},
},
},
{
type: 'function',
function: {
name: 'end_conversation',
description:
'Call this when the conversation has reached a natural end and you do NOT want to ' +
'listen for another reply. Pair it with a final "say" in the same turn for a farewell.',
parameters: { type: 'object', properties: {} },
},
},
];
// ── Resolution map ─────────────────────────────────────────────────────────────
const RES_MAP = { high: 'highRes', medium: 'medRes', low: 'lowRes' };
// ── Screen text helpers ────────────────────────────────────────────────────────
/**
* Word-wrap text for Jibo's small screen. Breaks oversized words, respects
* existing newlines, and truncates with an ellipsis past `maxLines`.
*/
function wrapForScreen(text, width = 40, maxLines = 10) {
const out = [];
for (const para of String(text).split('\n')) {
if (para === '') { out.push(''); continue; }
let line = '';
for (const word of para.split(/\s+/).filter(Boolean)) {
if (word.length > width) {
if (line) { out.push(line); line = ''; }
for (let i = 0; i < word.length; i += width) {
const chunk = word.slice(i, i + width);
if (chunk.length === width) out.push(chunk);
else line = chunk;
}
continue;
}
const candidate = line ? `${line} ${word}` : word;
if (candidate.length > width) {
out.push(line);
line = word;
} else {
line = candidate;
}
}
if (line) out.push(line);
}
if (out.length > maxLines) {
return out.slice(0, maxLines - 1).concat('…').join('\n');
}
return out.join('\n');
}
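// Illustration (sketch only, not part of the runtime path): with width 10,
// wrapForScreen greedily packs words and breaks before a line would overflow:
//   wrapForScreen('the quick brown fox jumps', 10)
//   // → 'the quick\nbrown fox\njumps'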
/**
* Strip markup the Jibo TTS engine chokes on (markdown, LaTeX, unmatched
* closing tags). Preserves valid ESML self-closing tags like <anim .../> and
* <break .../>. Defense-in-depth against models that ignore the instructions.
*/
function sanitizeForTTS(text) {
const ESML_TAGS = /^(anim|break|prosody|emph|phoneme|phrase|style|voice)\b/i;
return text
// Remove LaTeX inline math: $...$ and $$...$$
.replace(/\${1,2}[^$]{0,200}\${1,2}/g, '')
// Strip code fences and inline backticks
.replace(/```[\s\S]*?```/g, '')
.replace(/`+/g, '')
// Strip markdown emphasis markers but keep the words
.replace(/(\*\*|__)(.*?)\1/g, '$2')
.replace(/(\*|_)(?=\S)(.+?)(?<=\S)\1/g, '$2')
// Drop any tag that isn't a known ESML tag (e.g. </es>, <br>, etc.)
.replace(/<\/?([a-zA-Z][^\s>/]*)\b[^>]*\/?>/g, (m, name) =>
ESML_TAGS.test(name) ? m : '')
// Collapse extra whitespace
.replace(/[ \t]+/g, ' ')
.trim();
}
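// Illustration (sketch only): the tag filter keeps whitelisted ESML and drops
// everything else, while emphasis markers are stripped but their words kept:
//   sanitizeForTTS('<anim cat="happy"/> **hi** </es>')
//   // → '<anim cat="happy"/> hi'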
// ── Abort helpers ──────────────────────────────────────────────────────────────
function throwIfAborted(signal) {
if (signal?.aborted) {
const err = new Error('Conversation aborted');
err.code = 'CONVERSATION_ABORTED';
throw err;
}
}
function onAbort(signal) {
if (!signal) return new Promise(() => { }); // never resolves
return new Promise((_, reject) => {
const handler = () => {
const err = new Error('Conversation aborted');
err.code = 'CONVERSATION_ABORTED';
reject(err);
};
if (signal.aborted) return handler();
signal.addEventListener('abort', handler, { once: true });
});
}
// ── Tool executor ──────────────────────────────────────────────────────────────
/**
* Execute a single tool call against the Jibo client.
*
 * Returns { content, image?, endConversation? }.
 * - content — text string for the tool-result message
 * - image — optional base64 JPEG (only for take_photo)
 * - endConversation — true when the LLM called end_conversation
 *
 * @param {import('rom-control').Client} client
 * @param {string} name Tool function name
 * @param {object} args Parsed arguments
 * @param {AbortSignal} [signal] Cancellation signal
 * @param {object} [ctx] Per-conversation state (speechChain, lastHeard)
 * @returns {Promise<{ content: string, image?: string, endConversation?: boolean }>}
*/
async function executeTool(client, name, args, signal, ctx) {
throwIfAborted(signal);
ctx = ctx || {};
if (!ctx.speechChain) ctx.speechChain = Promise.resolve();
switch (name) {
// ── Communication ──────────────────────────────────────────────────────
case 'say': {
const text = sanitizeForTTS(String(args.text || ''));
console.log(` [tool:say] "${text}" (queued)`);
// Estimate ~80ms per char + 5s base, capped at 60s. Anything longer
// is almost certainly Jibo's TTS hung on bad ESML/markup; we'd rather
// log a warning and unblock the conversation than deadlock listen.
const estimateMs = Math.min(60000, 5000 + text.length * 80);
ctx.speechChain = ctx.speechChain
.then(() => {
const started = Date.now();
console.log(` [tool:say] speaking… (timeout ${estimateMs}ms)`);
let timer;
const timeout = new Promise((resolve) => {
timer = setTimeout(() => {
console.warn(` [tool:say] timed out after ${estimateMs}ms — continuing.`);
resolve();
}, estimateMs);
});
return Promise.race([
client.behavior.say(text, { signal }),
onAbort(signal),
timeout,
]).finally(() => {
clearTimeout(timer);
console.log(` [tool:say] done in ${Date.now() - started}ms`);
});
})
.catch((err) => {
if (err.code === 'CONVERSATION_ABORTED') return;
console.error(' [tool:say] error:', err.message);
});
return { content: 'Speech queued — Jibo will speak it shortly. Continue with other tools; listen will wait for it.' };
}
case 'listen': {
const ms = (args.timeout || 15) * 1000;
// Make sure pending speech finishes before we open the mic, otherwise
// Jibo will hear his own voice.
console.log(' [tool:listen] awaiting pending speech…');
await Promise.race([ctx.speechChain, onAbort(signal)]);
throwIfAborted(signal);
console.log(` [tool:listen] waiting ${ms}ms…`);
client.display.showText('Listening...');
try {
const speech = await Promise.race([
client.audio.awaitSpeech({ mode: 'local', time: ms }),
onAbort(signal),
]);
console.log(` [tool:listen] heard: "${speech.content}"`);
ctx.lastHeard = speech.content;
return { content: `User said: "${speech.content}"` };
} catch (err) {
if (err.code === 'CONVERSATION_ABORTED') throw err;
if (err.code === 'SPEECH_TIMEOUT') {
console.log(' [tool:listen] timed out');
return { content: 'No speech detected — user did not respond.' };
}
throw err;
} finally {
client.display.showEye();
}
}
// ── Camera ─────────────────────────────────────────────────────────────
case 'take_photo': {
const res = RES_MAP[args.resolution] || 'medRes';
console.log(` [tool:take_photo] ${res}`);
const photo = await Promise.race([
client.camera.takePhoto({ resolution: res, timeout: 30000 }),
onAbort(signal),
]);
const buf = await photo.fetchBuffer();
console.log(` [tool:take_photo] ${buf.length} bytes captured`);
return {
content: "Photo captured from Jibo's camera.",
image: buf.toString('base64'),
};
}
// ── Display ────────────────────────────────────────────────────────────
case 'show_text': {
console.log(` [tool:show_text] "${args.text}"`);
client.display.showText(wrapForScreen(args.text, 40, 10));
return { content: 'Text displayed on screen.' };
}
case 'show_image': {
console.log(` [tool:show_image] ${args.url}`);
client.display.showImage(args.url);
return { content: 'Image displayed on screen.' };
}
case 'show_eye': {
console.log(' [tool:show_eye]');
client.display.showEye();
return { content: 'Eye animation restored on screen.' };
}
case 'look_at_angle': {
console.log(` [tool:look_at_angle] θ=${args.theta}° ψ=${args.psi}°`);
await client.behavior.lookAtAngle(args.theta, args.psi);
return { content: `Now looking at θ=${args.theta}°, ψ=${args.psi}°.` };
}
case 'set_volume': {
console.log(` [tool:set_volume] ${args.level}`);
await client.audio.setVolume(args.level);
return { content: `Volume set to ${args.level}.` };
}
// ── Web search ─────────────────────────────────────────────────────────
case 'web_search': {
const apiKey = process.env.BRAVE_API_KEY;
if (!apiKey) {
return {
content:
'web_search is unavailable: BRAVE_API_KEY environment variable is not set.',
};
}
const query = String(args.query || '').trim();
if (!query) {
return { content: 'web_search error: query is required.' };
}
const count = Math.max(1, Math.min(10, Number(args.count) || 5));
const params = new URLSearchParams({
q: query,
count: String(count),
extra_snippets: 'true',
safesearch: 'moderate',
});
if (args.freshness) params.set('freshness', String(args.freshness));
console.log(` [tool:web_search] "${query}" (count=${count})`);
const url = `https://api.search.brave.com/res/v1/web/search?${params.toString()}`;
const ac = new AbortController();
const onAbortHandler = () => ac.abort();
signal?.addEventListener('abort', onAbortHandler, { once: true });
try {
const res = await fetch(url, {
headers: {
Accept: 'application/json',
'Accept-Encoding': 'gzip',
'X-Subscription-Token': apiKey,
},
signal: ac.signal,
});
if (!res.ok) {
const body = await res.text().catch(() => '');
return {
content: `web_search error: ${res.status} ${res.statusText}. ${body.slice(0, 200)}`,
};
}
const data = await res.json();
const results = data?.web?.results || [];
if (results.length === 0) {
return { content: `No web results found for "${query}".` };
}
const lines = results.slice(0, count).map((r, i) => {
const title = r.title || '(untitled)';
const u = r.url || '';
const desc = (r.description || '').replace(/\s+/g, ' ').trim();
const extras = Array.isArray(r.extra_snippets)
? r.extra_snippets.slice(0, 2).map((s) => s.replace(/\s+/g, ' ').trim())
: [];
const tail = extras.length ? `\n${extras.join('\n • ')}` : '';
return `${i + 1}. ${title}\n ${u}\n ${desc}${tail}`;
});
return {
content: `Web results for "${query}":\n\n${lines.join('\n\n')}`,
};
} catch (err) {
if (err.name === 'AbortError') throw Object.assign(new Error('Conversation aborted'), { code: 'CONVERSATION_ABORTED' });
return { content: `web_search error: ${err.message}` };
} finally {
signal?.removeEventListener('abort', onAbortHandler);
}
}
case 'fetch_url': {
const target = String(args.url || '').trim();
if (!/^https?:\/\//i.test(target)) {
return { content: 'fetch_url error: url must be an absolute http(s) URL.' };
}
const maxChars = Math.max(200, Math.min(20000, Number(args.max_chars) || 4000));
console.log(` [tool:fetch_url] ${target}`);
const ac = new AbortController();
const onAbortHandler = () => ac.abort();
signal?.addEventListener('abort', onAbortHandler, { once: true });
const timeoutId = setTimeout(() => ac.abort(), 20000);
try {
const res = await fetch(target, {
headers: {
// Prefer markdown (Cloudflare Markdown for Agents); accept HTML/text fallback.
Accept: 'text/markdown, text/plain;q=0.9, text/html;q=0.8, */*;q=0.1',
'Accept-Encoding': 'gzip',
'User-Agent': 'jibo-llm/1.0 (+agent)',
},
redirect: 'follow',
signal: ac.signal,
});
if (!res.ok) {
return {
content: `fetch_url error: ${res.status} ${res.statusText} from ${target}`,
};
}
const ctype = (res.headers.get('content-type') || '').toLowerCase();
if (!/^(text\/|application\/(json|xml|xhtml))/.test(ctype) && ctype) {
return {
content: `fetch_url: refusing non-text content (${ctype}) from ${target}`,
};
}
let body = await res.text();
const isMarkdown = ctype.includes('markdown');
const isHtml = ctype.includes('html') || /<html[\s>]/i.test(body.slice(0, 500));
if (!isMarkdown && isHtml) {
// Lightweight HTML→text: strip scripts/styles/tags, collapse whitespace.
body = body
.replace(/<script[\s\S]*?<\/script>/gi, ' ')
.replace(/<style[\s\S]*?<\/style>/gi, ' ')
.replace(/<noscript[\s\S]*?<\/noscript>/gi, ' ')
.replace(/<!--[\s\S]*?-->/g, ' ')
.replace(/<\/(p|div|li|h[1-6]|br|tr)>/gi, '\n')
.replace(/<[^>]+>/g, ' ')
.replace(/&nbsp;/g, ' ')
.replace(/&amp;/g, '&')
.replace(/&lt;/g, '<')
.replace(/&gt;/g, '>')
.replace(/&quot;/g, '"')
.replace(/&#39;/g, "'")
.replace(/[ \t]+/g, ' ')
.replace(/\n{3,}/g, '\n\n')
.trim();
}
const truncated = body.length > maxChars;
const out = truncated ? body.slice(0, maxChars) + '\n…[truncated]' : body;
const finalUrl = res.url || target;
const fmt = isMarkdown ? 'markdown' : isHtml ? 'html→text' : 'text';
return {
content: `Fetched ${finalUrl} (${fmt}, ${body.length} chars${truncated ? `, truncated to ${maxChars}` : ''}):\n\n${out}`,
};
} catch (err) {
if (err.name === 'AbortError') {
if (signal?.aborted) {
throw Object.assign(new Error('Conversation aborted'), { code: 'CONVERSATION_ABORTED' });
}
return { content: `fetch_url error: timeout fetching ${target}` };
}
return { content: `fetch_url error: ${err.message}` };
} finally {
clearTimeout(timeoutId);
signal?.removeEventListener('abort', onAbortHandler);
}
}
case 'end_conversation': {
console.log(' [tool:end_conversation] awaiting pending speech…');
await Promise.race([ctx.speechChain, onAbort(signal)]);
return { content: 'Conversation ended.', endConversation: true };
}
default:
return { content: `Unknown tool "${name}".` };
}
}
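// Usage sketch (illustrative — assumes a connected rom-control Client and a
// per-conversation ctx object shared across calls so speechChain ordering holds):
//   const ctx = { speechChain: Promise.resolve() };
//   const res = await executeTool(client, 'say', { text: 'Hi there!' }, signal, ctx);
//   // res.content notes the speech was queued; a later 'listen' call will
//   // first await ctx.speechChain before opening the microphone.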
module.exports = { TOOL_SCHEMAS, executeTool, wrapForScreen };