# Implement streaming tool calling for OpenAI language model #60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
## Conversation
**Pull request overview**
This pull request implements streaming tool calling support for OpenAI language models, resolving issue #32. During a streaming response, the model can now execute tool calls: tool call deltas are accumulated from the streaming API, the completed calls are executed, and the streaming loop continues with the results.
- Adds streaming tool call execution logic for both the `.responses` and `.chatCompletions` API variants
- Introduces `OpenAIToolCallAccumulator` to merge delta chunks into complete tool calls (a sketch of the technique follows this list)
- Updates `ResponseStream.Snapshot` to carry transcript entries through the streaming process
- Includes disabled tests with mock infrastructure for both API variants
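The diff excerpts below don't show the accumulator's merging logic itself, but the general technique is simple. Here is a minimal self-contained sketch, assuming Chat Completions-style deltas; the `ToolCallDelta` and `AccumulatedToolCall` types are hypothetical stand-ins, not the PR's actual declarations:

```swift
// Hypothetical sketch of delta merging for streaming tool calls.
// Chat Completions streams each tool call as a series of deltas keyed
// by index: the first delta carries the id and function name, and later
// deltas append fragments of the JSON arguments string.
struct ToolCallDelta {
    let index: Int
    let id: String?
    let name: String?
    let argumentsFragment: String?
}

struct AccumulatedToolCall {
    var id: String = ""
    var name: String = ""
    var arguments: String = ""
}

struct ToolCallAccumulator {
    private var calls: [Int: AccumulatedToolCall] = [:]

    mutating func ingest(_ delta: ToolCallDelta) {
        var call = calls[delta.index] ?? AccumulatedToolCall()
        if let id = delta.id { call.id = id }
        if let name = delta.name { call.name = name }
        if let fragment = delta.argumentsFragment { call.arguments += fragment }
        calls[delta.index] = call
    }

    /// Returns the merged calls in index order once the stream finishes.
    func build() -> [AccumulatedToolCall] {
        calls.sorted { $0.key < $1.key }.map { $0.value }
    }
}
```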
**Reviewed changes**
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `Sources/AnyLanguageModel/Models/OpenAILanguageModel.swift` | Implements the streaming tool call loop for both API variants, adds `OpenAIToolCallAccumulator` for delta merging, adds a `makeAssistantToolCallMessage` helper, and extends `OpenAIChatCompletionsChunk` to support tool call deltas |
| `Sources/AnyLanguageModel/LanguageModelSession.swift` | Adds a `transcriptEntries` field to `ResponseStream.Snapshot` and populates it in the `collect()` method |
| `Tests/AnyLanguageModelTests/OpenAILanguageModelTests.swift` | Adds disabled streaming tool call tests for both API variants with mock event stream infrastructure |
| `Tests/AnyLanguageModelTests/MockLanguageModelTests.swift` | Removes an unused variable to clean up test code |
```swift
if !toolCalls.isEmpty || finishReason == "tool_calls" {
    if let assistantRaw = makeAssistantToolCallMessage(
        for: .chatCompletions,
        toolCalls: toolCalls
    ) {
        currentMessages.append(
            OpenAIMessage(role: .raw(rawContent: assistantRaw), content: .text(""))
        )
    }
    let invocations = try await resolveToolCalls(toolCalls, session: session)
    if !invocations.isEmpty {
        transcriptEntries.append(
            .toolCalls(Transcript.ToolCalls(invocations.map { $0.call }))
        )
        for invocation in invocations {
            let output = invocation.output
            transcriptEntries.append(.toolOutput(output))
            let toolSegments: [Transcript.Segment] = output.segments
            let blocks = convertSegmentsToOpenAIBlocks(toolSegments)
            currentMessages.append(
                OpenAIMessage(role: .tool(id: invocation.call.id), content: .blocks(blocks))
            )
        }

        // … (lines elided in the diff)

        let raw = GeneratedContent(accumulatedText)
        let content: Content.PartiallyGenerated = (accumulatedText as! Content)
            .asPartiallyGenerated()
        continuation.yield(
            .init(
                content: content,
                rawContent: raw,
                transcriptEntries: transcriptEntries
            )
        )
        continue
    }
}
```
**Copilot AI** · Dec 11, 2025
When tool calls exist but invocations.isEmpty (line 853 condition fails), the assistant message with tool calls has already been appended to currentMessages (lines 844-851). The loop then continues to the normal completion path instead of breaking. This means a follow-up request will include the assistant's tool call message without corresponding tool results, which could confuse the API. Consider only appending the assistant message when invocations are successfully created, or handle the empty invocations case explicitly.
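One way to implement the first suggestion, sketched against the names visible in this diff (the surrounding loop structure is assumed), is to resolve the invocations before touching `currentMessages`:

```swift
// Sketch: resolve invocations before mutating currentMessages, so an
// empty result never leaves a dangling assistant tool-call message.
let invocations = try await resolveToolCalls(toolCalls, session: session)
if !invocations.isEmpty {
    if let assistantRaw = makeAssistantToolCallMessage(
        for: .chatCompletions,
        toolCalls: toolCalls
    ) {
        currentMessages.append(
            OpenAIMessage(role: .raw(rawContent: assistantRaw), content: .text(""))
        )
    }
    // ... append transcript entries and tool result messages as before ...
}
```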
```swift
struct OpenAIStreamingToolCallTests {
    private let baseURL = URL(string: "https://mock.openai.local")!

    // … (lines elided in the diff)

    @Test(.disabled("Streaming mock under construction")) func responsesStreamToolCallExecution() async throws {
```
**Copilot AI** · Dec 11, 2025
The test for streaming tool call execution with the Responses API is disabled with the reason "Streaming mock under construction". This means the new streaming tool call functionality for the Responses API variant is not being tested. Consider completing the mock implementation or removing the disabled attribute to ensure this feature has adequate test coverage.
```swift
var snapshots: [LanguageModelSession.ResponseStream<String>.Snapshot] = []
for try await snapshot in session.streamResponse(to: "What's the weather?") {
    snapshots.append(snapshot)
}

// … (lines elided in the diff)

#expect(chatCallCount >= 2)
```
**Copilot AI** · Dec 11, 2025
The test collects snapshots from the stream but doesn't verify their contents. Consider adding assertions to verify that: 1) snapshots contain expected tool call entries in transcriptEntries, 2) the final snapshot includes the tool execution results, and 3) the accumulated text reflects the complete response after tool execution.
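For example, assertions along these lines would pin down the snapshot contents; the exact expected values are illustrative and depend on what the mock stream is scripted to return:

```swift
// Illustrative assertions; entries and text depend on the scripted mock.
let finalSnapshot = try #require(snapshots.last)
#expect(finalSnapshot.transcriptEntries.contains { entry in
    if case .toolCalls = entry { return true }
    return false
})
#expect(finalSnapshot.transcriptEntries.contains { entry in
    if case .toolOutput = entry { return true }
    return false
})
// With Content == String, the snapshot content is the accumulated text;
// "sunny" is a hypothetical fragment of the mock's final response.
#expect(finalSnapshot.content.contains("sunny"))
```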
```swift
    #expect(responsesCallCount >= 2)
}

// … (lines elided in the diff)

@Test(.disabled("Streaming mock under construction")) func chatCompletionsStreamToolCallExecution() async throws {
```
**Copilot AI** · Dec 11, 2025
The test for streaming tool call execution with the Chat Completions API is disabled with the reason "Streaming mock under construction". This means the new streaming tool call functionality for the Chat Completions API variant is not being tested. Consider completing the mock implementation or removing the disabled attribute to ensure this feature has adequate test coverage.
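If the mock is completed, the Chat Completions variant mostly needs a transport that replays server-sent-event chunks carrying `tool_calls` deltas. A sketch of such a scripted payload, following OpenAI's published streaming chunk shape (the tool name and arguments here are made up), might be:

```swift
// Sketch of SSE chunks a mock transport could replay for the first
// request: a delta carrying the call id and name, a delta appending the
// JSON arguments, and a final chunk with finish_reason "tool_calls".
let toolCallChunks = """
data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_1","type":"function","function":{"name":"getWeather","arguments":""}}]},"finish_reason":null}]}

data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\\"city\\":\\"Cupertino\\"}"}}]},"finish_reason":null}]}

data: {"choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}

data: [DONE]

"""
```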
```swift
var snapshots: [LanguageModelSession.ResponseStream<String>.Snapshot] = []
for try await snapshot in session.streamResponse(to: "What's the weather?") {
    snapshots.append(snapshot)
}

// … (lines elided in the diff)

#expect(responsesCallCount >= 2)
```
**Copilot AI** · Dec 11, 2025
The test collects snapshots from the stream but doesn't verify their contents. Consider adding assertions to verify that: 1) snapshots contain expected tool call entries in transcriptEntries, 2) the final snapshot includes the tool execution results, and 3) the accumulated text reflects the complete response after tool execution.
```swift
}

// … (lines elided in the diff)

let toolCalls = toolCallAccumulator.build()
if !toolCalls.isEmpty || finishReason == "tool_calls" {
```
**Copilot AI** · Dec 11, 2025
The condition checks for finishReason == "tool_calls" even when toolCalls.isEmpty. This could lead to attempting to resolve an empty array of tool calls and potentially appending an empty assistant message. The condition should only proceed with tool call handling when there are actual tool calls to process. Consider changing this to just !toolCalls.isEmpty to match the .responses variant implementation at line 707.
Suggested change:

```diff
- if !toolCalls.isEmpty || finishReason == "tool_calls" {
+ if !toolCalls.isEmpty {
```
```swift
if !toolCalls.isEmpty {
    if let assistantRaw = makeAssistantToolCallMessage(
        for: .responses,
        toolCalls: toolCalls
    ) {
        currentMessages.append(
            OpenAIMessage(role: .raw(rawContent: assistantRaw), content: .text(""))
        )
    }
    let invocations = try await resolveToolCalls(toolCalls, session: session)
    if !invocations.isEmpty {
        transcriptEntries.append(
            .toolCalls(Transcript.ToolCalls(invocations.map { $0.call }))
        )
        for invocation in invocations {
            let output = invocation.output
            transcriptEntries.append(.toolOutput(output))
            let toolSegments: [Transcript.Segment] = output.segments
            let blocks = convertSegmentsToOpenAIBlocks(toolSegments)
            currentMessages.append(
                OpenAIMessage(role: .tool(id: invocation.call.id), content: .blocks(blocks))
            )
        }

// … (lines elided in the diff)

for try await event in events {
    switch event {
    case .outputTextDelta(let delta):
        accumulatedText += delta
        let raw = GeneratedContent(accumulatedText)
        let content: Content.PartiallyGenerated = (accumulatedText as! Content)
            .asPartiallyGenerated()
        continuation.yield(
            .init(
                content: content,
                rawContent: raw,
                transcriptEntries: transcriptEntries
            )
        )
        continue
    }
}
```
**Copilot AI** · Dec 11, 2025
When !toolCalls.isEmpty but invocations.isEmpty (line 717 condition fails), the assistant message containing tool calls has already been appended to currentMessages (lines 708-715), but the loop continues to the normal completion path instead of breaking. This means a follow-up request will include the assistant's tool call message without corresponding tool results, which could confuse the API. Consider only appending the assistant message when invocations are successfully created, or handle the empty invocations case explicitly.
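The second option, handling the empty case explicitly, might look roughly like this. It is a sketch against the diff's names, and the exact placement inside the streaming loop is assumed:

```swift
// Sketch: track whether the assistant tool-call message was appended,
// and drop it again if no invocations could be resolved, so a follow-up
// request never carries an unmatched tool call.
var appendedAssistantToolCallMessage = false
if let assistantRaw = makeAssistantToolCallMessage(for: .responses, toolCalls: toolCalls) {
    currentMessages.append(
        OpenAIMessage(role: .raw(rawContent: assistantRaw), content: .text(""))
    )
    appendedAssistantToolCallMessage = true
}
let invocations = try await resolveToolCalls(toolCalls, session: session)
if invocations.isEmpty {
    if appendedAssistantToolCallMessage {
        currentMessages.removeLast()
    }
} else {
    // ... append transcript entries and tool result messages as before ...
}
```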
Resolves #32