REST + WebSocket API for the Lettuce Engine character system.
```shell
# Start the server
lettuce serve                 # 0.0.0.0:8000
lettuce serve --port 3000     # custom port
lettuce serve --reload        # dev mode with auto-reload
```

OpenAPI docs are available at http://localhost:8000/docs.
Set the LETTUCE_API_KEY env var or api.api_key in config/settings.yaml.

- Empty key = auth disabled (dev mode)
- REST: Authorization: Bearer <key>
- WebSocket: ws://host:port/ws/chat/{slug}?token=<key>

/health and /setup/* never require auth. /config/* endpoints bypass the setup gate (so providers can be configured before setup is complete).
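Both auth forms can be constructed client-side; a minimal Python sketch using only the standard library (the key value and character slug are hypothetical, and the request is built but never sent):

```python
from urllib.request import Request

API_KEY = "my-secret-key"  # hypothetical; must match LETTUCE_API_KEY on the server

# REST: the key travels in the Authorization header.
req = Request(
    "http://localhost:8000/status",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
print(req.get_header("Authorization"))  # Bearer my-secret-key

# WebSocket: the same key travels in the token query parameter.
ws_url = f"ws://localhost:8000/ws/chat/samuel_sam_thompson?token={API_KEY}"
print(ws_url)
```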
On first run with no LLM configured, the system enters setup mode. Chat and character endpoints return 503 until configuration is complete.
```shell
# 1. Check setup status
curl http://localhost:8000/setup/status
# → {"needs_setup": true, "configured_providers": [], "has_api_key": false}

# 2. Configure a provider
curl -X PUT http://localhost:8000/config/llm/openrouter \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-sonnet-4-5-20250929", "api_key": "sk-..."}'

# 3. Set default backend
curl -X PUT http://localhost:8000/config/llm/default \
  -H "Content-Type: application/json" \
  -d '{"provider": "openrouter"}'

# 4. Complete setup
curl -X POST http://localhost:8000/setup/complete
# → {"status": "ok"}
```

| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /health | No | Liveness probe: {status, version} |
| GET | /status | Yes | Full system dashboard with all character stats |
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /setup/status | No | Setup state, configured providers, API key status |
| POST | /setup/complete | No | Mark setup done (fails if no LLM configured) |
| GET | /config | Yes | Full config (API keys redacted) |
| PUT | /config/llm/{provider} | Yes | Set LLM provider config |
| DELETE | /config/llm/{provider} | Yes | Remove a provider |
| PUT | /config/llm/default | Yes | Set default backend |
| PUT | /config/engine | Yes | Update engine settings |
| PUT | /config/memory | Yes | Update memory/retrieval settings |
| PUT | /config/background | Yes | Update background loop timers |
| PUT | /config/safety | Yes | Update safety settings |
| PUT | /config/research | Yes | Update global research settings |
Valid providers: anthropic, openai, openrouter, ollama
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /characters | Yes | List available characters with loaded status |
| GET | /characters/{slug} | Yes | Character detail (name, era, role, traits) |
| POST | /characters/{slug}/load | Yes | Eagerly load engine into memory |
| POST | /characters/{slug}/unload | Yes | Unload engine, free resources |
| PUT | /characters/{slug}/research | Yes | Enable/disable research for a loaded character |
| DELETE | /characters/{slug}/users/{user_id} | Yes | Delete all user data for a character |
Characters lazy-load on first chat if not explicitly loaded.
Slugs are derived from the character's name field — lowercased, spaces replaced with underscores, dots removed. For example, "Samuel Sam Thompson" becomes samuel_sam_thompson. Use GET /characters to discover available slugs.
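That naming rule can be sketched in a few lines (illustrative only, not the engine's actual code):

```python
def slugify(name: str) -> str:
    # Lowercase, replace spaces with underscores, drop dots (the rule above).
    return name.lower().replace(" ", "_").replace(".", "")

print(slugify("Samuel Sam Thompson"))  # samuel_sam_thompson
```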
| Method | Path | Auth | Description |
|---|---|---|---|
| POST | /characters/{slug}/chat | Yes | Send message, get complete response |
| GET | /characters/{slug}/history/{user_id} | Yes | Get conversation history for a user |
| WS | /ws/chat/{slug}?token=KEY | Yes | WebSocket streaming chat |
```shell
curl -X POST http://localhost:8000/characters/samuel_sam_thompson/chat \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hey Sam", "user_id": "user1", "user_name": "Alex"}'
```

Response:

```json
{
  "response": "Hey! What's going on?",
  "character": "Samuel Sam Thompson",
  "emotion": "anticipation",
  "emotion_intensity": 0.45
}
```

Retrieve stored conversation turns for a specific user and character. All messages (both user and assistant) are persisted to SQLite automatically.
```shell
curl "http://localhost:8000/characters/samuel_sam_thompson/history/user1?limit=20" \
  -H "Authorization: Bearer KEY"
```

Response:

```json
[
  {
    "id": "a1b2c3...",
    "user_id": "user1",
    "user_name": "Alex",
    "role": "user",
    "content": "Hey Sam",
    "timestamp": "2026-02-15T10:30:00+00:00",
    "entities_mentioned": []
  },
  {
    "id": "d4e5f6...",
    "user_id": "user1",
    "user_name": "",
    "role": "assistant",
    "content": "Hey! What's going on?",
    "timestamp": "2026-02-15T10:30:01+00:00",
    "entities_mentioned": []
  }
]
```

Query parameters:

- limit (int, default 50) — max number of turns to return

Returns turns in chronological order (oldest first).
Connect to ws://localhost:8000/ws/chat/{slug}?token=KEY.

Client → Server:

```json
{"type": "message", "content": "Hey Sam", "user_id": "user1", "user_name": "Alex"}
{"type": "ping"}
```

Server → Client:

```json
{"type": "stream_start", "character": "Samuel Sam Thompson"}
{"type": "stream_chunk", "content": "Hey! "}
{"type": "stream_chunk", "content": "What's going on?"}
{"type": "stream_end", "content": "Hey! What's going on?", "emotion": "anticipation", "emotion_intensity": 0.45}
{"type": "pong"}
{"type": "error", "message": "..."}
```

Research (web scraping, knowledge synthesis, drip research) can be controlled per character.
```yaml
# config/characters/sam_thompson.yaml
research_enabled: true   # default — research runs normally
research_seeds:
  - 1st Special Forces Group Airborne
  - ...
```

Set research_enabled: false to disable all research for a character at load time.
Toggle research for a loaded character:
```shell
# Disable research
curl -X PUT http://localhost:8000/characters/samuel_sam_thompson/research \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'
# → {"slug": "samuel_sam_thompson", "research_enabled": false}

# Re-enable
curl -X PUT http://localhost:8000/characters/samuel_sam_thompson/research \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'
```

When disabled, the following are all suppressed:
- Initial seed research on boot
- Conversation-driven knowledge-gap research
- Drip research (background loop)
The research_enabled status is visible in GET /status under each character's stats.
Request body for PUT /config/llm/{provider}:

```json
{
  "model": "anthropic/claude-sonnet-4-5-20250929",
  "api_key": "sk-...",
  "max_tokens": 1024,
  "temperature": 0.9,
  "base_url": "http://localhost:11434"
}
```

api_key and base_url are optional. base_url is only used for Ollama.

Request body for PUT /config/engine:

```json
{"data_dir": "./data", "log_level": "INFO", "max_history": 40}
```

Request body for PUT /config/memory:

```json
{
  "embedding_model": "all-MiniLM-L6-v2",
  "max_retrieval_results": 15,
  "dense_weight": 0.5,
  "bm25_weight": 0.3,
  "graph_weight": 0.2,
  "recency_boost_hours": 2.0,
  "random_surface_probability": 0.05
}
```

Request body for PUT /config/background:

```json
{
  "synthesis_interval_minutes": 10,
  "consolidation_interval_minutes": 60,
  "bm25_rebuild_interval_minutes": 15,
  "drip_research_interval_minutes": 60
}
```

Request body for PUT /config/safety:

```json
{"honesty_section": true, "user_data_deletion": true}
```

Request body for PUT /config/research:

```json
{"initial_scrape_on_boot": true, "periodic_interval_hours": 6}
```

GET /status returns a full system overview:
```json
{
  "version": "1.0.0",
  "needs_setup": false,
  "default_backend": "openrouter",
  "configured_providers": ["openrouter", "anthropic"],
  "characters": [
    {
      "slug": "samuel_sam_thompson",
      "name": "Samuel Sam Thompson",
      "loaded": true,
      "stats": {
        "backend": "openrouter",
        "memories_vector": 47,
        "memories_sqlite": 52,
        "total_turns": 120,
        "graph_nodes": 34,
        "graph_edges": 28,
        "emotion": "anticipation",
        "emotion_intensity": 0.5,
        "background_loops": true,
        "research_enabled": true,
        "drift_rate": 0.012
      }
    },
    {
      "slug": "sherlock_holmes",
      "name": "Sherlock Holmes",
      "loaded": false,
      "stats": null
    }
  ],
  "background": {
    "synthesis_interval_minutes": 10,
    "consolidation_interval_minutes": 60,
    "bm25_rebuild_interval_minutes": 15,
    "drip_research_interval_minutes": 60
  }
}
```

- Multi-character: Multiple characters can be loaded simultaneously, each with independent engines
- Lazy loading: Engines spin up on first chat request (or explicitly via /characters/{slug}/load)
- Message persistence: Every conversation turn (user + assistant) is stored in per-character SQLite, retrievable via the history endpoint
- Config persistence: All PUT /config/* writes are saved back to config/settings.yaml immediately
- Setup gate: Middleware returns 503 for chat/character endpoints until at least one LLM provider is configured; /health, /setup/*, /config/*, /status, and /docs bypass the gate
- Streaming: All 4 backends (Anthropic, OpenAI, OpenRouter, Ollama) support native token streaming
- Per-character research: Research can be toggled per character via YAML or runtime API, controlling initial scrape, conversation-gap research, and drip research independently
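On the streaming side, the stream_chunk frames concatenate to the stream_end content, so a client can render tokens incrementally and reconcile the full text at the end. A minimal sketch (assumes frames are already parsed from JSON; not the actual client):

```python
def assemble_stream(frames: list[dict]) -> str:
    """Concatenate stream_chunk contents and cross-check against stream_end."""
    parts = []
    final = None
    for frame in frames:
        if frame["type"] == "stream_chunk":
            parts.append(frame["content"])
        elif frame["type"] == "stream_end":
            final = frame["content"]
    assembled = "".join(parts)
    # The chunks should reproduce the final content exactly.
    assert final is None or assembled == final
    return assembled

frames = [
    {"type": "stream_start", "character": "Samuel Sam Thompson"},
    {"type": "stream_chunk", "content": "Hey! "},
    {"type": "stream_chunk", "content": "What's going on?"},
    {"type": "stream_end", "content": "Hey! What's going on?", "emotion": "anticipation"},
]
print(assemble_stream(frames))  # Hey! What's going on?
```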