
Lettuce Engine API

REST + WebSocket API for the Lettuce Engine character system.

Quick Start

# Start the server
lettuce serve                     # 0.0.0.0:8000
lettuce serve --port 3000         # custom port
lettuce serve --reload            # dev mode with auto-reload

OpenAPI docs available at http://localhost:8000/docs.

Authentication

Set LETTUCE_API_KEY env var or api.api_key in config/settings.yaml.

  • Empty key = auth disabled (dev mode)
  • REST: Authorization: Bearer <key>
  • WebSocket: ws://host:port/ws/chat/{slug}?token=<key>
  • /health and /setup/* never require auth
  • /config/* endpoints bypass the setup gate (so providers can be configured before setup is complete)

Setup Flow

On first run with no LLM configured, the system enters setup mode. Chat and character endpoints return 503 until configuration is complete.

# 1. Check setup status
curl http://localhost:8000/setup/status
# → {"needs_setup": true, "configured_providers": [], "has_api_key": false}

# 2. Configure a provider
curl -X PUT http://localhost:8000/config/llm/openrouter \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-sonnet-4-5-20250929", "api_key": "sk-..."}'

# 3. Set default backend
curl -X PUT http://localhost:8000/config/llm/default \
  -H "Content-Type: application/json" \
  -d '{"provider": "openrouter"}'

# 4. Complete setup
curl -X POST http://localhost:8000/setup/complete
# → {"status": "ok"}

Endpoints

Health & Status

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | /health | No | Liveness probe: {status, version} |
| GET | /status | Yes | Full system dashboard with all character stats |

Setup & Configuration

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | /setup/status | No | Setup state, configured providers, API key status |
| POST | /setup/complete | No | Mark setup done (fails if no LLM configured) |
| GET | /config | Yes | Full config (API keys redacted) |
| PUT | /config/llm/{provider} | Yes | Set LLM provider config |
| DELETE | /config/llm/{provider} | Yes | Remove a provider |
| PUT | /config/llm/default | Yes | Set default backend |
| PUT | /config/engine | Yes | Update engine settings |
| PUT | /config/memory | Yes | Update memory/retrieval settings |
| PUT | /config/background | Yes | Update background loop timers |
| PUT | /config/safety | Yes | Update safety settings |
| PUT | /config/research | Yes | Update global research settings |

Valid providers: anthropic, openai, openrouter, ollama

Characters

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | /characters | Yes | List available characters with loaded status |
| GET | /characters/{slug} | Yes | Character detail (name, era, role, traits) |
| POST | /characters/{slug}/load | Yes | Eagerly load engine into memory |
| POST | /characters/{slug}/unload | Yes | Unload engine, free resources |
| PUT | /characters/{slug}/research | Yes | Enable/disable research for a loaded character |
| DELETE | /characters/{slug}/users/{user_id} | Yes | Delete all user data for a character |

Characters lazy-load on first chat if not explicitly loaded.

Slugs are derived from the character's name field — lowercased, spaces replaced with underscores, dots removed. For example, "Samuel Sam Thompson" becomes samuel_sam_thompson. Use GET /characters to discover available slugs.
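The documented rule can be expressed as a one-liner (a sketch of the rule as stated above; the engine's actual implementation may differ in edge cases):

```python
def slugify(name: str) -> str:
    # Lowercase, drop dots, replace spaces with underscores.
    return name.lower().replace(".", "").replace(" ", "_")
```

For example, `slugify("Samuel Sam Thompson")` yields `samuel_sam_thompson`.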

Chat & History

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | /characters/{slug}/chat | Yes | Send message, get complete response |
| GET | /characters/{slug}/history/{user_id} | Yes | Get conversation history for a user |
| WS | /ws/chat/{slug}?token=KEY | Yes | WebSocket streaming chat |

REST Chat

curl -X POST http://localhost:8000/characters/samuel_sam_thompson/chat \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hey Sam", "user_id": "user1", "user_name": "Alex"}'

Response:

{
  "response": "Hey! What's going on?",
  "character": "Samuel Sam Thompson",
  "emotion": "anticipation",
  "emotion_intensity": 0.45
}

Conversation History

Retrieve stored conversation turns for a specific user and character. All messages (both user and assistant) are persisted to SQLite automatically.

curl "http://localhost:8000/characters/samuel_sam_thompson/history/user1?limit=20" \
  -H "Authorization: Bearer KEY"

Response:

[
  {
    "id": "a1b2c3...",
    "user_id": "user1",
    "user_name": "Alex",
    "role": "user",
    "content": "Hey Sam",
    "timestamp": "2026-02-15T10:30:00+00:00",
    "entities_mentioned": []
  },
  {
    "id": "d4e5f6...",
    "user_id": "user1",
    "user_name": "",
    "role": "assistant",
    "content": "Hey! What's going on?",
    "timestamp": "2026-02-15T10:30:01+00:00",
    "entities_mentioned": []
  }
]

Query parameters:

  • limit (int, default 50) — max number of turns to return

Returns turns in chronological order (oldest first).
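The combination of limit and ordering can be pictured with a small sketch. This assumes limit selects the most recent turns, which are then returned oldest-first; the server may instead truncate from the start:

```python
def recent_history(turns: list[dict], limit: int = 50) -> list[dict]:
    # Order all stored turns by timestamp, keep the newest `limit`,
    # and return them oldest-first (chronological order).
    ordered = sorted(turns, key=lambda t: t["timestamp"])
    return ordered[-limit:]
```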

WebSocket Streaming

Connect to ws://localhost:8000/ws/chat/{slug}?token=KEY.

Client → Server:

{"type": "message", "content": "Hey Sam", "user_id": "user1", "user_name": "Alex"}
{"type": "ping"}

Server → Client:

{"type": "stream_start", "character": "Samuel Sam Thompson"}
{"type": "stream_chunk", "content": "Hey! "}
{"type": "stream_chunk", "content": "What's going on?"}
{"type": "stream_end", "content": "Hey! What's going on?", "emotion": "anticipation", "emotion_intensity": 0.45}
{"type": "pong"}
{"type": "error", "message": "..."}
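A client typically folds these frames back into a single reply. A minimal sketch of that accumulation logic (the frame types and fields match the protocol above; the function name is illustrative):

```python
import json

def collect_reply(frames: list[str]) -> dict:
    """Fold raw server frames into the final reply.

    Concatenates stream_chunk content until stream_end arrives;
    a real client would receive these one at a time over the socket.
    """
    chunks = []
    for raw in frames:
        event = json.loads(raw)
        if event["type"] == "stream_chunk":
            chunks.append(event["content"])
        elif event["type"] == "stream_end":
            return {"content": "".join(chunks),
                    "emotion": event.get("emotion")}
        elif event["type"] == "error":
            raise RuntimeError(event["message"])
    raise RuntimeError("stream closed without stream_end")
```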

Per-Character Research Toggle

Research (web scraping, knowledge synthesis, drip research) can be controlled per character.

In YAML

# config/characters/sam_thompson.yaml
research_enabled: true   # default — research runs normally
research_seeds:
  - 1st Special Forces Group Airborne
  - ...

Set research_enabled: false to disable all research for a character at load time.

At Runtime via API

Toggle research for a loaded character:

# Disable research
curl -X PUT http://localhost:8000/characters/samuel_sam_thompson/research \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'
# → {"slug": "samuel_sam_thompson", "research_enabled": false}

# Re-enable
curl -X PUT http://localhost:8000/characters/samuel_sam_thompson/research \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'

When disabled, the following are all suppressed:

  • Initial seed research on boot
  • Conversation-driven knowledge-gap research
  • Drip research (background loop)

The research_enabled status is visible in GET /status under each character's stats.

Configuration Schemas

LLM Provider

{
  "model": "anthropic/claude-sonnet-4-5-20250929",
  "api_key": "sk-...",
  "max_tokens": 1024,
  "temperature": 0.9,
  "base_url": "http://localhost:11434"
}

api_key and base_url are optional. base_url is only used for Ollama.
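A client-side sanity check before calling PUT /config/llm/{provider} might look like this (illustrative only; the server performs its own validation):

```python
VALID_PROVIDERS = {"anthropic", "openai", "openrouter", "ollama"}

def validate_llm_config(provider: str, cfg: dict) -> list[str]:
    """Return a list of problems with a provider config, empty if it looks OK."""
    errors = []
    if provider not in VALID_PROVIDERS:
        errors.append(f"unknown provider: {provider}")
    if not cfg.get("model"):
        errors.append("model is required")
    if "base_url" in cfg and provider != "ollama":
        errors.append("base_url is only used for ollama")
    return errors
```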

Engine

{"data_dir": "./data", "log_level": "INFO", "max_history": 40}

Memory

{
  "embedding_model": "all-MiniLM-L6-v2",
  "max_retrieval_results": 15,
  "dense_weight": 0.5,
  "bm25_weight": 0.3,
  "graph_weight": 0.2,
  "recency_boost_hours": 2.0,
  "random_surface_probability": 0.05
}
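The three weights suggest a weighted blend of dense-embedding, BM25, and graph retrieval scores. A hypothetical sketch of that combination (the engine may normalize or combine scores differently):

```python
def blended_score(dense: float, bm25: float, graph: float,
                  weights: tuple = (0.5, 0.3, 0.2)) -> float:
    # Weighted sum of the three retrieval signals, using the
    # dense_weight / bm25_weight / graph_weight defaults above.
    dw, bw, gw = weights
    return dw * dense + bw * bm25 + gw * graph
```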

Background Timers

{
  "synthesis_interval_minutes": 10,
  "consolidation_interval_minutes": 60,
  "bm25_rebuild_interval_minutes": 15,
  "drip_research_interval_minutes": 60
}

Safety

{"honesty_section": true, "user_data_deletion": true}

Research (Global)

{"initial_scrape_on_boot": true, "periodic_interval_hours": 6}

Status Dashboard

GET /status returns a full system overview:

{
  "version": "1.0.0",
  "needs_setup": false,
  "default_backend": "openrouter",
  "configured_providers": ["openrouter", "anthropic"],
  "characters": [
    {
      "slug": "samuel_sam_thompson",
      "name": "Samuel Sam Thompson",
      "loaded": true,
      "stats": {
        "backend": "openrouter",
        "memories_vector": 47,
        "memories_sqlite": 52,
        "total_turns": 120,
        "graph_nodes": 34,
        "graph_edges": 28,
        "emotion": "anticipation",
        "emotion_intensity": 0.5,
        "background_loops": true,
        "research_enabled": true,
        "drift_rate": 0.012
      }
    },
    {
      "slug": "sherlock_holmes",
      "name": "Sherlock Holmes",
      "loaded": false,
      "stats": null
    }
  ],
  "background": {
    "synthesis_interval_minutes": 10,
    "consolidation_interval_minutes": 60,
    "bm25_rebuild_interval_minutes": 15,
    "drip_research_interval_minutes": 60
  }
}

Architecture Notes

  • Multi-character: Multiple characters can be loaded simultaneously, each with independent engines
  • Lazy loading: Engines spin up on first chat request (or explicitly via /characters/{slug}/load)
  • Message persistence: Every conversation turn (user + assistant) is stored in per-character SQLite, retrievable via the history endpoint
  • Config persistence: All PUT /config/* writes back to config/settings.yaml immediately
  • Setup gate: Middleware returns 503 for chat/character endpoints until at least one LLM provider is configured; /health, /setup/*, /config/*, /status, and /docs bypass the gate
  • Streaming: All 4 backends (Anthropic, OpenAI, OpenRouter, Ollama) support native token streaming
  • Per-character research: Research can be toggled per character via YAML or runtime API, controlling initial scrape, conversation-gap research, and drip research independently