Conversation

@mattt (Owner) commented Dec 11, 2025

Resolves #32

Copilot AI left a comment

Pull request overview

This pull request implements streaming tool calling support for OpenAI language models, resolving issue #32. The implementation accumulates tool call deltas from the streaming API into complete tool calls, executes them, and continues the streaming loop with the results.

  • Adds streaming tool call execution logic for both .responses and .chatCompletions API variants
  • Introduces OpenAIToolCallAccumulator to merge delta chunks into complete tool calls
  • Updates ResponseStream.Snapshot to carry transcript entries through the streaming process
  • Includes disabled tests with mock infrastructure for both API variants
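The delta-merging step can be sketched as follows. This is a minimal illustration of the idea, not the PR's actual `OpenAIToolCallAccumulator` API; the delta fields and type names here are assumptions:

```swift
// Streamed chunks carry partial tool-call fragments (an index, an optional
// id/name, and a fragment of the JSON arguments string). The accumulator
// merges fragments by index into complete calls.
struct ToolCallDelta {
    let index: Int
    let id: String?
    let name: String?
    let argumentsFragment: String?
}

struct AccumulatedToolCall {
    var id: String = ""
    var name: String = ""
    var arguments: String = ""
}

struct ToolCallAccumulator {
    private var calls: [Int: AccumulatedToolCall] = [:]

    mutating func ingest(_ delta: ToolCallDelta) {
        var call = calls[delta.index] ?? AccumulatedToolCall()
        if let id = delta.id { call.id = id }
        if let name = delta.name { call.name = name }
        if let fragment = delta.argumentsFragment { call.arguments += fragment }
        calls[delta.index] = call
    }

    // Emit completed calls in stream (index) order.
    func build() -> [AccumulatedToolCall] {
        calls.sorted { $0.key < $1.key }.map { $0.value }
    }
}
```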

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `Sources/AnyLanguageModel/Models/OpenAILanguageModel.swift` | Implements the streaming tool call loop for both API variants, adds `OpenAIToolCallAccumulator` for delta merging, adds a `makeAssistantToolCallMessage` helper, and extends `OpenAIChatCompletionsChunk` to support tool call deltas |
| `Sources/AnyLanguageModel/LanguageModelSession.swift` | Adds a `transcriptEntries` field to `ResponseStream.Snapshot` and populates it in the `collect()` method |
| `Tests/AnyLanguageModelTests/OpenAILanguageModelTests.swift` | Adds disabled streaming tool call tests for both API variants with mock event stream infrastructure |
| `Tests/AnyLanguageModelTests/MockLanguageModelTests.swift` | Removes an unused variable to clean up test code |


Comment on lines +843 to 879
```swift
if !toolCalls.isEmpty || finishReason == "tool_calls" {
    if let assistantRaw = makeAssistantToolCallMessage(
        for: .chatCompletions,
        toolCalls: toolCalls
    ) {
        currentMessages.append(
            OpenAIMessage(role: .raw(rawContent: assistantRaw), content: .text(""))
        )
    }
    let invocations = try await resolveToolCalls(toolCalls, session: session)
    if !invocations.isEmpty {
        transcriptEntries.append(
            .toolCalls(Transcript.ToolCalls(invocations.map { $0.call }))
        )
        for invocation in invocations {
            let output = invocation.output
            transcriptEntries.append(.toolOutput(output))
            let toolSegments: [Transcript.Segment] = output.segments
            let blocks = convertSegmentsToOpenAIBlocks(toolSegments)
            currentMessages.append(
                OpenAIMessage(role: .tool(id: invocation.call.id), content: .blocks(blocks))
            )
        }

        let raw = GeneratedContent(accumulatedText)
        let content: Content.PartiallyGenerated = (accumulatedText as! Content)
            .asPartiallyGenerated()
        continuation.yield(
            .init(
                content: content,
                rawContent: raw,
                transcriptEntries: transcriptEntries
            )
        )
        continue
    }
}
```
Copilot AI commented Dec 11, 2025

When tool calls exist but invocations.isEmpty (line 853 condition fails), the assistant message with tool calls has already been appended to currentMessages (lines 844-851). The loop then continues to the normal completion path instead of breaking. This means a follow-up request will include the assistant's tool call message without corresponding tool results, which could confuse the API. Consider only appending the assistant message when invocations are successfully created, or handle the empty invocations case explicitly.
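One way to apply this suggestion can be sketched with simplified stand-in types (not the PR's real signatures): resolve invocations first, and append the assistant tool-call message only when resolution produced at least one invocation.

```swift
// Sketch only: the point is the ordering. Resolving before appending
// guarantees the transcript never contains an assistant tool-call
// message without matching tool results.
func handleToolCalls(
    _ toolCalls: [String],
    resolve: ([String]) throws -> [String],
    appendMessage: (String) -> Void
) rethrows {
    guard !toolCalls.isEmpty else { return }
    let invocations = try resolve(toolCalls)
    // Nothing has been appended yet, so an empty resolution leaves
    // the message list untouched.
    guard !invocations.isEmpty else { return }
    appendMessage("assistant(toolCalls)")
    for invocation in invocations {
        appendMessage("tool(\(invocation))")
    }
}
```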

```swift
struct OpenAIStreamingToolCallTests {
    private let baseURL = URL(string: "https://mock.openai.local")!

    @Test(.disabled("Streaming mock under construction")) func responsesStreamToolCallExecution() async throws {
```
Copilot AI commented Dec 11, 2025

The test for streaming tool call execution with the Responses API is disabled with the reason "Streaming mock under construction". This means the new streaming tool call functionality for the Responses API variant is not being tested. Consider completing the mock implementation or removing the disabled attribute to ensure this feature has adequate test coverage.

Comment on lines +406 to +411
```swift
var snapshots: [LanguageModelSession.ResponseStream<String>.Snapshot] = []
for try await snapshot in session.streamResponse(to: "What's the weather?") {
    snapshots.append(snapshot)
}

#expect(chatCallCount >= 2)
```
Copilot AI commented Dec 11, 2025

The test collects snapshots from the stream but doesn't verify their contents. Consider adding assertions to verify that: 1) snapshots contain expected tool call entries in transcriptEntries, 2) the final snapshot includes the tool execution results, and 3) the accumulated text reflects the complete response after tool execution.
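The suggested assertions could look roughly like this, sketched against a stand-in type (the real one is `LanguageModelSession.ResponseStream<String>.Snapshot`; the `Entry` cases are simplified assumptions mirroring transcript entries):

```swift
// Minimal stand-ins so the assertion shape is self-contained.
enum Entry: Equatable { case toolCalls, toolOutput }
struct Snapshot {
    let text: String
    let transcriptEntries: [Entry]
}

// The three checks the comment asks for, applied to the final snapshot:
// tool calls recorded, tool outputs recorded, and non-empty final text.
func finalSnapshotLooksComplete(_ snapshots: [Snapshot]) -> Bool {
    guard let final = snapshots.last else { return false }
    return final.transcriptEntries.contains(.toolCalls)
        && final.transcriptEntries.contains(.toolOutput)
        && !final.text.isEmpty
}
```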

```swift
    #expect(responsesCallCount >= 2)
}

@Test(.disabled("Streaming mock under construction")) func chatCompletionsStreamToolCallExecution() async throws {
```
Copilot AI commented Dec 11, 2025

The test for streaming tool call execution with the Chat Completions API is disabled with the reason "Streaming mock under construction". This means the new streaming tool call functionality for the Chat Completions API variant is not being tested. Consider completing the mock implementation or removing the disabled attribute to ensure this feature has adequate test coverage.

Comment on lines +355 to +360
```swift
var snapshots: [LanguageModelSession.ResponseStream<String>.Snapshot] = []
for try await snapshot in session.streamResponse(to: "What's the weather?") {
    snapshots.append(snapshot)
}

#expect(responsesCallCount >= 2)
```
Copilot AI commented Dec 11, 2025

The test collects snapshots from the stream but doesn't verify their contents. Consider adding assertions to verify that: 1) snapshots contain expected tool call entries in transcriptEntries, 2) the final snapshot includes the tool execution results, and 3) the accumulated text reflects the complete response after tool execution.

```swift
}

let toolCalls = toolCallAccumulator.build()
if !toolCalls.isEmpty || finishReason == "tool_calls" {
```
Copilot AI commented Dec 11, 2025

The condition checks for finishReason == "tool_calls" even when toolCalls.isEmpty. This could lead to attempting to resolve an empty array of tool calls and potentially appending an empty assistant message. The condition should only proceed with tool call handling when there are actual tool calls to process. Consider changing this to just !toolCalls.isEmpty to match the .responses variant implementation at line 707.

Suggested change:

```diff
-if !toolCalls.isEmpty || finishReason == "tool_calls" {
+if !toolCalls.isEmpty {
```
Comment on lines +707 to +743
```swift
if !toolCalls.isEmpty {
    if let assistantRaw = makeAssistantToolCallMessage(
        for: .responses,
        toolCalls: toolCalls
    ) {
        currentMessages.append(
            OpenAIMessage(role: .raw(rawContent: assistantRaw), content: .text(""))
        )
    }
    let invocations = try await resolveToolCalls(toolCalls, session: session)
    if !invocations.isEmpty {
        transcriptEntries.append(
            .toolCalls(Transcript.ToolCalls(invocations.map { $0.call }))
        )
        for invocation in invocations {
            let output = invocation.output
            transcriptEntries.append(.toolOutput(output))
            let toolSegments: [Transcript.Segment] = output.segments
            let blocks = convertSegmentsToOpenAIBlocks(toolSegments)
            currentMessages.append(
                OpenAIMessage(role: .tool(id: invocation.call.id), content: .blocks(blocks))
            )
        }

        for try await event in events {
            switch event {
            case .outputTextDelta(let delta):
                accumulatedText += delta
                let raw = GeneratedContent(accumulatedText)
                let content: Content.PartiallyGenerated = (accumulatedText as! Content)
                    .asPartiallyGenerated()
                continuation.yield(
                    .init(
                        content: content,
                        rawContent: raw,
                        transcriptEntries: transcriptEntries
                    )
                )
                continue
            }
        }
```
Copilot AI commented Dec 11, 2025

When !toolCalls.isEmpty but invocations.isEmpty (line 717 condition fails), the assistant message containing tool calls has already been appended to currentMessages (lines 708-715), but the loop continues to the normal completion path instead of breaking. This means a follow-up request will include the assistant's tool call message without corresponding tool results, which could confuse the API. Consider only appending the assistant message when invocations are successfully created, or handle the empty invocations case explicitly.


Development

Successfully merging this pull request may close these issues.

Tool calling for the stream API

2 participants