
Lettuce Engine API

REST + WebSocket API for the Lettuce Engine character system.

Quick Start

# Start the server
lettuce serve                     # 0.0.0.0:8000
lettuce serve --port 3000         # custom port
lettuce serve --reload            # dev mode with auto-reload

OpenAPI docs available at http://localhost:8000/docs.

Authentication

Set LETTUCE_API_KEY env var or api.api_key in config/settings.yaml.

  • Empty key = auth disabled (dev mode)
  • REST: Authorization: Bearer <key>
  • WebSocket: ws://host:port/ws/chat/{slug}?token=<key>
  • /health and /setup/* never require auth
  • /config/* endpoints bypass the setup gate (so providers can be configured before setup is complete)

Setup Flow

On first run with no LLM configured, the system enters setup mode. Chat and character endpoints return 503 until configuration is complete.

# 1. Check setup status
curl http://localhost:8000/setup/status
# → {"needs_setup": true, "configured_providers": [], "has_api_key": false}

# 2. Configure a provider
curl -X PUT http://localhost:8000/config/llm/openrouter \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-sonnet-4-5-20250929", "api_key": "sk-..."}'

# 3. Set default backend
curl -X PUT http://localhost:8000/config/llm/default \
  -H "Content-Type: application/json" \
  -d '{"provider": "openrouter"}'

# 4. Complete setup
curl -X POST http://localhost:8000/setup/complete
# → {"status": "ok"}

Endpoints

Health & Status

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | /health | No | Liveness probe: {status, version} |
| GET | /status | Yes | Full system dashboard with all character stats |

Setup & Configuration

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | /setup/status | No | Setup state, configured providers, API key status |
| POST | /setup/complete | No | Mark setup done (fails if no LLM configured) |
| GET | /config | Yes | Full config (API keys redacted) |
| PUT | /config/llm/{provider} | Yes | Set LLM provider config |
| DELETE | /config/llm/{provider} | Yes | Remove a provider |
| PUT | /config/llm/default | Yes | Set default backend |
| PUT | /config/engine | Yes | Update engine settings |
| PUT | /config/memory | Yes | Update memory/retrieval settings |
| PUT | /config/background | Yes | Update background loop timers |
| PUT | /config/safety | Yes | Update safety settings |
| PUT | /config/research | Yes | Update global research settings |

Valid providers: anthropic, openai, openrouter, ollama

Characters

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | /characters | Yes | List available characters with loaded status |
| GET | /characters/{slug} | Yes | Character detail (name, era, role, traits) |
| POST | /characters/{slug}/load | Yes | Eagerly load engine into memory |
| POST | /characters/{slug}/unload | Yes | Unload engine, free resources |
| PUT | /characters/{slug}/research | Yes | Enable/disable research for a loaded character |
| DELETE | /characters/{slug}/users/{user_id} | Yes | Delete all user data for a character |

Characters lazy-load on first chat if not explicitly loaded.

Slugs are derived from the character's name field — lowercased, spaces replaced with underscores, dots removed. For example, "Samuel Sam Thompson" becomes samuel_sam_thompson. Use GET /characters to discover available slugs.
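The documented rule can be expressed as a one-liner (a sketch of the rule as stated above; the engine's actual implementation may differ in edge cases):

```python
def slugify(name: str) -> str:
    # Lowercase, drop dots, replace spaces with underscores.
    return name.lower().replace(".", "").replace(" ", "_")
```

For example, `slugify("Samuel Sam Thompson")` yields `samuel_sam_thompson`.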

Chat & History

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | /characters/{slug}/chat | Yes | Send message, get complete response |
| GET | /characters/{slug}/history/{user_id} | Yes | Get conversation history for a user |
| WS | /ws/chat/{slug}?token=KEY | Yes | WebSocket streaming chat |

REST Chat

curl -X POST http://localhost:8000/characters/samuel_sam_thompson/chat \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hey Sam", "user_id": "user1", "user_name": "Alex"}'

Response:

{
  "response": "Hey! What's going on?",
  "character": "Samuel Sam Thompson",
  "emotion": "anticipation",
  "emotion_intensity": 0.45
}

Conversation History

Retrieve stored conversation turns for a specific user and character. All messages (both user and assistant) are persisted to SQLite automatically.

curl "http://localhost:8000/characters/samuel_sam_thompson/history/user1?limit=20" \
  -H "Authorization: Bearer KEY"

Response:

[
  {
    "id": "a1b2c3...",
    "user_id": "user1",
    "user_name": "Alex",
    "role": "user",
    "content": "Hey Sam",
    "timestamp": "2026-02-15T10:30:00+00:00",
    "entities_mentioned": []
  },
  {
    "id": "d4e5f6...",
    "user_id": "user1",
    "user_name": "",
    "role": "assistant",
    "content": "Hey! What's going on?",
    "timestamp": "2026-02-15T10:30:01+00:00",
    "entities_mentioned": []
  }
]

Query parameters:

  • limit (int, default 50) — max number of turns to return

Returns turns in chronological order (oldest first).
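The combination of limit and ordering can be pictured with a small sketch. This assumes limit selects the most recent turns, which are then returned oldest-first; the server may instead truncate from the start:

```python
def recent_history(turns: list[dict], limit: int = 50) -> list[dict]:
    # Order all stored turns by timestamp, keep the newest `limit`,
    # and return them oldest-first (chronological order).
    ordered = sorted(turns, key=lambda t: t["timestamp"])
    return ordered[-limit:]
```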

WebSocket Streaming

Connect to ws://localhost:8000/ws/chat/{slug}?token=KEY.

Client → Server:

{"type": "message", "content": "Hey Sam", "user_id": "user1", "user_name": "Alex"}
{"type": "ping"}

Server → Client:

{"type": "stream_start", "character": "Samuel Sam Thompson"}
{"type": "stream_chunk", "content": "Hey! "}
{"type": "stream_chunk", "content": "What's going on?"}
{"type": "stream_end", "content": "Hey! What's going on?", "emotion": "anticipation", "emotion_intensity": 0.45}
{"type": "pong"}
{"type": "error", "message": "..."}
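A client typically folds these frames back into a single reply. A minimal sketch of that accumulation logic (the frame types and fields match the protocol above; the function name is illustrative):

```python
import json

def collect_reply(frames: list[str]) -> dict:
    """Fold raw server frames into the final reply.

    Concatenates stream_chunk content until stream_end arrives;
    a real client would receive these one at a time over the socket.
    """
    chunks = []
    for raw in frames:
        event = json.loads(raw)
        if event["type"] == "stream_chunk":
            chunks.append(event["content"])
        elif event["type"] == "stream_end":
            return {"content": "".join(chunks),
                    "emotion": event.get("emotion")}
        elif event["type"] == "error":
            raise RuntimeError(event["message"])
    raise RuntimeError("stream closed without stream_end")
```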

Per-Character Research Toggle

Research (web scraping, knowledge synthesis, drip research) can be controlled per character.

In YAML

# config/characters/sam_thompson.yaml
research_enabled: true   # default — research runs normally
research_seeds:
  - 1st Special Forces Group Airborne
  - ...

Set research_enabled: false to disable all research for a character at load time.

At Runtime via API

Toggle research for a loaded character:

# Disable research
curl -X PUT http://localhost:8000/characters/samuel_sam_thompson/research \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'
# → {"slug": "samuel_sam_thompson", "research_enabled": false}

# Re-enable
curl -X PUT http://localhost:8000/characters/samuel_sam_thompson/research \
  -H "Authorization: Bearer KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'

When disabled, the following are all suppressed:

  • Initial seed research on boot
  • Conversation-driven knowledge-gap research
  • Drip research (background loop)

The research_enabled status is visible in GET /status under each character's stats.

Configuration Schemas

LLM Provider

{
  "model": "anthropic/claude-sonnet-4-5-20250929",
  "api_key": "sk-...",
  "max_tokens": 1024,
  "temperature": 0.9,
  "base_url": "http://localhost:11434"
}

api_key and base_url are optional. base_url is only used for Ollama.
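A client-side sanity check before calling PUT /config/llm/{provider} might look like this (illustrative only; the server performs its own validation):

```python
VALID_PROVIDERS = {"anthropic", "openai", "openrouter", "ollama"}

def validate_llm_config(provider: str, cfg: dict) -> list[str]:
    """Return a list of problems with a provider config, empty if it looks OK."""
    errors = []
    if provider not in VALID_PROVIDERS:
        errors.append(f"unknown provider: {provider}")
    if not cfg.get("model"):
        errors.append("model is required")
    if "base_url" in cfg and provider != "ollama":
        errors.append("base_url is only used for ollama")
    return errors
```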

Engine

{"data_dir": "./data", "log_level": "INFO", "max_history": 40}

Memory

{
  "embedding_model": "all-MiniLM-L6-v2",
  "max_retrieval_results": 15,
  "dense_weight": 0.5,
  "bm25_weight": 0.3,
  "graph_weight": 0.2,
  "recency_boost_hours": 2.0,
  "random_surface_probability": 0.05
}
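The three weights suggest a weighted blend of dense-embedding, BM25, and graph retrieval scores. A hypothetical sketch of that combination (the engine may normalize or combine scores differently):

```python
def blended_score(dense: float, bm25: float, graph: float,
                  weights: tuple = (0.5, 0.3, 0.2)) -> float:
    # Weighted sum of the three retrieval signals, using the
    # dense_weight / bm25_weight / graph_weight defaults above.
    dw, bw, gw = weights
    return dw * dense + bw * bm25 + gw * graph
```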

Background Timers

{
  "synthesis_interval_minutes": 10,
  "consolidation_interval_minutes": 60,
  "bm25_rebuild_interval_minutes": 15,
  "drip_research_interval_minutes": 60
}

Safety

{"honesty_section": true, "user_data_deletion": true}

Research (Global)

{"initial_scrape_on_boot": true, "periodic_interval_hours": 6}

Status Dashboard

GET /status returns a full system overview:

{
  "version": "1.0.0",
  "needs_setup": false,
  "default_backend": "openrouter",
  "configured_providers": ["openrouter", "anthropic"],
  "characters": [
    {
      "slug": "samuel_sam_thompson",
      "name": "Samuel Sam Thompson",
      "loaded": true,
      "stats": {
        "backend": "openrouter",
        "memories_vector": 47,
        "memories_sqlite": 52,
        "total_turns": 120,
        "graph_nodes": 34,
        "graph_edges": 28,
        "emotion": "anticipation",
        "emotion_intensity": 0.5,
        "background_loops": true,
        "research_enabled": true,
        "drift_rate": 0.012
      }
    },
    {
      "slug": "sherlock_holmes",
      "name": "Sherlock Holmes",
      "loaded": false,
      "stats": null
    }
  ],
  "background": {
    "synthesis_interval_minutes": 10,
    "consolidation_interval_minutes": 60,
    "bm25_rebuild_interval_minutes": 15,
    "drip_research_interval_minutes": 60
  }
}

Architecture Notes

  • Multi-character: Multiple characters can be loaded simultaneously, each with independent engines
  • Lazy loading: Engines spin up on first chat request (or explicitly via /characters/{slug}/load)
  • Message persistence: Every conversation turn (user + assistant) is stored in per-character SQLite, retrievable via the history endpoint
  • Config persistence: All PUT /config/* writes back to config/settings.yaml immediately
  • Setup gate: Middleware returns 503 for chat/character endpoints until at least one LLM provider is configured; /health, /setup/*, /config/*, /status, and /docs bypass the gate
  • Streaming: All 4 backends (Anthropic, OpenAI, OpenRouter, Ollama) support native token streaming
  • Per-character research: Research can be toggled per character via YAML or runtime API, controlling initial scrape, conversation-gap research, and drip research independently