Add audio chat API support for gpt-4o-audio-preview model#397
Open
BugorBN wants to merge 1 commit into MacPaw:main from
- Add AudioChatQuery, AudioChatResult, AudioChatStreamResult models
- Implement audioChats() and audioChatsStream() methods
- Add type-safe enums: AudioFormat, Voice, Modality
- Add AudioConversationManager for multi-turn conversations
- Add support for gpt-4o-realtime-preview and gpt-4o-mini-realtime-preview models
- Add dated variants: 2024-12-17 snapshots
- Comprehensive test coverage (38 tests across 4 test suites)
- Update README with audio chat documentation and examples
- Format requirements: wav/mp3 for input, pcm16 recommended for streaming output
- Relaxed parsing support for handling missing fields
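To make the type-safe enums in the commit message concrete, here is a minimal sketch of how string-backed enums could constrain an audio chat request body. All type names ending in `Sketch`, the `AudioOptions` helper, and the field layout are assumptions for illustration, loosely modeled on the public Chat Completions audio parameters; the PR's actual `AudioChatQuery` may be shaped differently.

```swift
import Foundation

// Hypothetical stand-ins for the PR's enums; the real definitions may differ.
enum AudioFormat: String, Codable { case wav, mp3, flac, opus, pcm16 }
enum Voice: String, Codable { case alloy, echo, fable, onyx, nova, shimmer }
enum Modality: String, Codable { case text, audio }

// Hypothetical request shape (an assumption, not the PR's AudioChatQuery).
struct AudioChatRequestSketch: Codable {
    struct AudioOptions: Codable {
        let voice: Voice
        let format: AudioFormat
    }
    let model: String
    let modalities: [Modality]
    let audio: AudioOptions
}

let request = AudioChatRequestSketch(
    model: "gpt-4o-audio-preview",
    modalities: [.text, .audio],
    audio: .init(voice: .alloy, format: .pcm16)
)

// Sorted keys keep the encoded JSON stable for inspection.
let encoder = JSONEncoder()
encoder.outputFormatting = [.sortedKeys]
let json = String(data: try encoder.encode(request), encoding: .utf8)!
print(json)
```

Because the enums are backed by raw strings, an unsupported value such as `AudioFormat(rawValue: "aac")` yields `nil` at the call site instead of producing an invalid request.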
7d5804b to 7d385dd
Summary
This PR implements audio-to-audio chat completion support for the gpt-4o-audio-preview model, replacing the traditional STT→Chat→TTS pipeline with a single API call for 2-3x faster response times.
Features
- audioChats() and audioChatsStream() methods
- AudioFormat (wav, mp3, flac, opus, pcm16) and Voice (alloy, echo, fable, onyx, nova, shimmer)
- AsyncThrowingStream
Implementation Details
- AudioChatQuery, AudioChatResult, AudioChatStreamResult
- OpenAIProtocol, OpenAIAsync, with Combine support
Testing
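As an illustration of the kind of check a relaxed-parsing test might make, the sketch below decodes a response that is missing most of its fields. The `AudioChatResultSketch` type, its field names, and the default value are hypothetical and chosen only for this example; they are not taken from the PR's `AudioChatResult`.

```swift
import Foundation

// Hypothetical result type; field names are illustrative assumptions.
struct AudioChatResultSketch: Decodable {
    let id: String
    let model: String
    let transcript: String?

    enum CodingKeys: String, CodingKey {
        case id, model, transcript
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // The identifier is treated as required in this sketch.
        id = try container.decode(String.self, forKey: .id)
        // Relaxed parsing: fall back to a default when the field is absent.
        model = try container.decodeIfPresent(String.self, forKey: .model) ?? "unknown"
        // Optional fields simply decode to nil when absent.
        transcript = try container.decodeIfPresent(String.self, forKey: .transcript)
    }
}

// A payload that omits both "model" and "transcript".
let partialPayload = Data(#"{"id":"chatcmpl-123"}"#.utf8)
let decoded = try JSONDecoder().decode(AudioChatResultSketch.self, from: partialPayload)
print(decoded.id, decoded.model, decoded.transcript ?? "<no transcript>")
```

Defaulting inside `init(from:)` keeps the tolerance for missing fields in one place, so callers do not need to unwrap values that have a sensible fallback.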
Files Changed
Documentation
Added comprehensive Audio Chat section to README.md including:
Breaking Changes
None - this is a purely additive change.
Related