A simple mock LLM server for end-to-end testing. Provides request/response mocking for OpenAI and Anthropic APIs using their official SDK types.
- OpenAI Chat Completions API (streaming and non-streaming)
- OpenAI Responses API (streaming and non-streaming, including function outputs)
- Anthropic Messages API (non-streaming)
- Exact and contains matching
- Optional header matching (e.g., tenant ID, API key, custom headers)
- Tool/function calls support
- JSON configuration files
Architecture:

- Server: HTTP server with Gorilla mux router
- Providers: Separate handlers for OpenAI and Anthropic
- Matching: Linear search through mocks with exact/contains matching
- SDK Integration: Uses official OpenAI and Anthropic SDK types directly
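As a rough sketch of how that routing layer might be wired with Gorilla mux (the handler names here are illustrative, not the package's actual identifiers):

```go
package mockllm

import (
	"net/http"

	"github.com/gorilla/mux"
)

// newRouter wires one handler per mocked endpoint. The real handlers live in
// openai.go, openai_response.go, and anthropic.go; the parameter names below
// are placeholders for this sketch.
func newRouter(chat, responses, messages http.HandlerFunc) *mux.Router {
	r := mux.NewRouter()
	r.HandleFunc("/v1/chat/completions", chat).Methods(http.MethodPost)
	r.HandleFunc("/v1/responses", responses).Methods(http.MethodPost)
	r.HandleFunc("/v1/messages", messages).Methods(http.MethodPost)
	return r
}
```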
Current implementation uses these core types:
- `Config`: Root configuration containing arrays of OpenAI and Anthropic mocks
- `OpenAIMock` and `OpenAIResponseMock`: Map OpenAI requests to responses using official SDK types
- `AnthropicMock`: Maps Anthropic requests to responses using official SDK types
- `MatchType`: Enum for matching strategies (`exact`, `contains`)
- `HeaderMatch`: Defines a header matching rule (name, value, match type)
- `OpenAIRequestMatch` and `OpenAIResponseRequestMatch`: Define how to match OpenAI requests (match type + message + optional headers)
- `AnthropicRequestMatch`: Defines how to match Anthropic requests (match type + message + optional headers)
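A rough sketch of the OpenAI side of these types, assuming JSON tags that mirror the configuration keys shown in the examples below (see types.go for the real definitions; the Responses API and Anthropic mocks follow the same pattern):

```go
package mockllm

import (
	"github.com/openai/openai-go/v3"
)

// MatchType selects the matching strategy for a mock.
type MatchType string

const (
	MatchTypeExact    MatchType = "exact"
	MatchTypeContains MatchType = "contains"
)

// HeaderMatch is a single header rule; all rules on a mock must match (AND).
type HeaderMatch struct {
	Name      string    `json:"name"`
	Value     string    `json:"value"`
	MatchType MatchType `json:"match_type"` // treated as "exact" when omitted
}

// OpenAIRequestMatch describes which Chat Completions requests a mock serves.
type OpenAIRequestMatch struct {
	MatchType MatchType                              `json:"match_type"`
	Message   openai.ChatCompletionMessageParamUnion `json:"message"`
	Headers   []HeaderMatch                          `json:"headers,omitempty"`
}

// OpenAIMock pairs a request matcher with the SDK-typed response to return.
type OpenAIMock struct {
	Name     string                `json:"name"`
	Match    OpenAIRequestMatch    `json:"match"`
	Response openai.ChatCompletion `json:"response"`
}
```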
- Endpoint: `POST /v1/chat/completions`
- Auth: `Authorization: Bearer <token>` (presence check only)
- Request: `openai.ChatCompletionNewParams`
- Response: `openai.ChatCompletion` (streaming: `openai.ChatCompletionChunk`)
- Matching: Exact or contains on last message
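A test can drive this endpoint through the official SDK against a running mock. A minimal sketch, assuming `baseURL` comes from `server.Start` as in the usage example further down; the message text is simply whatever a configured mock expects:

```go
package mockllm_test

import (
	"context"

	"github.com/openai/openai-go/v3"
	"github.com/openai/openai-go/v3/option"
)

// callChatMock sends one user message to the mocked Chat Completions
// endpoint and returns whichever mocked openai.ChatCompletion matched.
func callChatMock(ctx context.Context, baseURL string) (*openai.ChatCompletion, error) {
	client := openai.NewClient(
		option.WithBaseURL(baseURL+"/v1/"),
		option.WithAPIKey("test-key"), // only the header's presence is checked
	)
	return client.Chat.Completions.New(ctx, openai.ChatCompletionNewParams{
		Model: "gpt-4.1-mini",
		Messages: []openai.ChatCompletionMessageParamUnion{
			openai.UserMessage("List all nodes in the cluster"),
		},
	})
}
```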
- Endpoint: `POST /v1/responses`
- Auth: `Authorization: Bearer <token>` (presence check only)
- Request: `responses.ResponseNewParams`
- Response: `responses.Response`
- Matching: Exact or contains on input field
- Features: Supports text output and function call outputs
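At the wire level, a request this handler can match might look like the sketch below, using plain net/http; the model and input strings are placeholders, and only the bearer token's presence plus the `input` field matter for matching:

```go
package mockllm_test

import (
	"bytes"
	"context"
	"net/http"
)

// callResponsesMock posts a minimal Responses API request to the mock.
// The "input" field is what exact/contains matching runs against.
func callResponsesMock(ctx context.Context, baseURL string) (*http.Response, error) {
	body := []byte(`{"model": "gpt-4.1-mini", "input": "Hello from the test"}`)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/v1/responses", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer test-key") // presence check only
	return http.DefaultClient.Do(req)
}
```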
- Endpoint: `POST /v1/messages`
- Auth: `x-api-key` (presence check only)
- Headers: `anthropic-version` required
- Request: `anthropic.MessageNewParams`
- Response: `anthropic.Message`
- Matching: Exact or contains on last message
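The same style of call works through the official Anthropic SDK, which is expected to set the `x-api-key` and `anthropic-version` headers itself. A sketch, again assuming `baseURL` from `server.Start`; the model string and message text are placeholders:

```go
package mockllm_test

import (
	"context"

	"github.com/anthropics/anthropic-sdk-go"
	"github.com/anthropics/anthropic-sdk-go/option"
)

// callAnthropicMock sends one user message to the mocked /v1/messages endpoint.
func callAnthropicMock(ctx context.Context, baseURL string) (*anthropic.Message, error) {
	client := anthropic.NewClient(
		option.WithBaseURL(baseURL+"/"), // SDK paths are v1/..., so no /v1 suffix here
		option.WithAPIKey("test-key"),   // only the header's presence is checked
	)
	return client.Messages.New(ctx, anthropic.MessageNewParams{
		Model:     "claude-sonnet-4-0",
		MaxTokens: 1024,
		Messages: []anthropic.MessageParam{
			anthropic.NewUserMessage(anthropic.NewTextBlock("List all nodes in the cluster")),
		},
	})
}
```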
Example Go configuration:

```go
config := mockllm.Config{
OpenAI: []mockllm.OpenAIMock{
{
Name: "simple-response",
Match: mockllm.OpenAIRequestMatch{
MatchType: mockllm.MatchTypeExact,
Message: /* openai.ChatCompletionMessageParamUnion */,
},
Response: /* openai.ChatCompletion */,
},
},
OpenAIResponse: []mockllm.OpenAIResponseMock{
{
Name: "haiku-response",
Match: mockllm.OpenAIResponseRequestMatch{
MatchType: mockllm.MatchTypeContains,
Input: /* responses.ResponseNewParamsInputUnion */,
},
Response: /* responses.Response */,
},
},
Anthropic: []mockllm.AnthropicMock{/* ... */},
}
```

Mocks can also be defined in a JSON configuration file:

```json
{
"openai": [
{
"name": "initial_request",
"match": {
"match_type": "exact",
"message" : {
"content": "List all nodes in the cluster",
"role": "user"
}
},
"response": {
"id": "chatcmpl-1",
"object": "chat.completion",
"created": 1677652288,
"model": "gpt-4.1-mini",
"choices": [
{
"index": 0,
"role": "assistant",
"message": {
"content": "",
"tool_calls": [
...
]
},
"finish_reason": "tool_calls"
}
]
}
},
{
"name": "k8s_get_resources_response",
"match": {
"match_type": "contains",
"message" : {
"content": "kagent-control-plane",
"role": "tool",
"tool_call_id": "call_1"
}
},
"response": {
"id": "call_1",
"object": "chat.completion.tool_message",
"created": 1677652288,
"model": "gpt-4.1-mini",
"choices": [
...
]
}
}
],
"openai_response": [
/* ... */
],
"anthropic": [
/* ... */
]
}
```

Mocks can optionally require specific HTTP headers to match. When `headers` is specified, all header rules must match (AND semantics) in addition to the body match. Header matching is optional; mocks without `headers` continue to work identically.

```go
mock := mockllm.OpenAIMock{
Name: "tenant-a-response",
Match: mockllm.OpenAIRequestMatch{
MatchType: mockllm.MatchTypeContains,
Message: /* ... */,
Headers: []mockllm.HeaderMatch{
{Name: "X-Tenant-ID", Value: "tenant-a", MatchType: mockllm.MatchTypeExact},
},
},
Response: /* ... */,
}
```

The equivalent JSON configuration:

```json
{
"name": "tenant-a-response",
"match": {
"match_type": "contains",
"message": { "role": "user", "content": "Hello" },
"headers": [
{ "name": "X-Tenant-ID", "value": "tenant-a", "match_type": "exact" },
{ "name": "Authorization", "value": "Bearer", "match_type": "contains" }
]
},
"response": { }
}
```

- `name`: Header name (case-insensitive, per HTTP spec)
- `value`: Value to match against
- `match_type`: `"exact"` (default if omitted) or `"contains"`
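Given the `HeaderMatch` shape sketched earlier, the header check itself can stay small. A sketch (the real logic lives in headers.go):

```go
package mockllm

import (
	"net/http"
	"strings"
)

// headersMatch reports whether every configured rule matches the incoming
// request headers. http.Header.Get is case-insensitive on the header name,
// which gives the case-insensitive behaviour noted above for free.
func headersMatch(rules []HeaderMatch, header http.Header) bool {
	for _, rule := range rules {
		got := header.Get(rule.Name)
		switch rule.MatchType {
		case MatchTypeContains:
			if !strings.Contains(got, rule.Value) {
				return false
			}
		default: // exact, which is also the default when match_type is omitted
			if got != rule.Value {
				return false
			}
		}
	}
	return true
}
```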
Simple linear search through mocks (a Go sketch follows the list):
- Parse incoming request into appropriate SDK type
- Iterate through provider-specific mocks in order
- For each mock, check if the match criteria are met:
- Body: Exact JSON comparison or string contains check on last message/input
- Headers (optional): All specified header rules must match
- Return the response from the first matching mock
- Return 404 if no match found
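A simplified Go sketch of that loop for the Chat Completions mocks, reusing the `headersMatch` sketch above. `contentOf` is a hypothetical helper standing in for pulling the text out of the SDK message union, and the exact strategy is shown as plain string equality rather than the full JSON comparison:

```go
package mockllm

import (
	"net/http"
	"strings"
)

// findOpenAIMock walks the configured mocks in order and returns the first
// one whose body and header rules all match; nil means the handler should
// answer 404. lastContent is the text of the request's last message.
func findOpenAIMock(mocks []OpenAIMock, lastContent string, header http.Header) *OpenAIMock {
	for i := range mocks {
		m := &mocks[i]
		want := contentOf(m.Match.Message) // hypothetical helper, see lead-in
		var bodyOK bool
		switch m.Match.MatchType {
		case MatchTypeContains:
			bodyOK = strings.Contains(lastContent, want)
		default: // exact
			bodyOK = lastContent == want
		}
		if bodyOK && headersMatch(m.Match.Headers, header) {
			return m
		}
	}
	return nil
}
```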
Response handling:

- Non-streaming: JSON responses using SDK types
- Streaming: Server-Sent Events (SSE) for Chat Completions and Responses API
- Uses official SDK response types directly
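For the streaming cases, the handler writes each chunk as an SSE `data:` line; the Chat Completions stream additionally ends with the `data: [DONE]` sentinel. A minimal sketch of such a writer (not the package's actual function):

```go
package mockllm

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// streamChunks writes pre-built chunks (e.g. openai.ChatCompletionChunk
// values) as Server-Sent Events and terminates the stream the way the
// Chat Completions API does, with a [DONE] sentinel.
func streamChunks(w http.ResponseWriter, chunks []any) error {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	flusher, _ := w.(http.Flusher)
	for _, chunk := range chunks {
		data, err := json.Marshal(chunk)
		if err != nil {
			return err
		}
		if _, err := fmt.Fprintf(w, "data: %s\n\n", data); err != nil {
			return err
		}
		if flusher != nil {
			flusher.Flush() // deliver each event immediately
		}
	}
	_, err := fmt.Fprint(w, "data: [DONE]\n\n")
	return err
}
```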
To use the mock server in a test:

```go
config := mockllm.Config{/* mocks */}
server := mockllm.NewServer(config)
baseURL, err := server.Start(context.Background())
defer server.Stop(context.Background())
// Use baseURL for API calls in tests
client := openai.NewClient(
option.WithBaseURL(baseURL+"/v1/"),
option.WithAPIKey("test-key"),
)
```

- `server.go`: HTTP server, routing, lifecycle
- `types.go`: Configuration types
- `headers.go`: Shared header matching logic
- `openai.go`: OpenAI handler (Chat Completions)
- `openai_response.go`: OpenAI handler (Responses API)
- `anthropic.go`: Anthropic handler
- `server_test.go`: Integration tests
- `testdata/`: Test fixtures
Dependencies:

- `github.com/openai/openai-go/v3`
- `github.com/anthropics/anthropic-sdk-go`
- `github.com/gorilla/mux`
Limitations:

- Simple matching only (exact/contains on last message/input and optional headers)
- Does not mock hosted tool calls (e.g. OpenAI file search, code execution), reasoning, or MCP calls
- No stateful conversation tracking
- No latency simulation
- No error injection