Refactor AWS Chunked logic from HTTP clients #3635

kai-ion · 2025-12-02T16:10:37Z

Issue #, if available:
#3297

Description of changes:
Our goal is to refactor this logic out of the HTTP clients and into an interceptor. We will introduce a core ChunkingInterceptor that owns all aws-chunked behavior for streaming requests, and remove the chunking logic from the individual HTTP clients.
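For readers unfamiliar with the wire format, here is a minimal sketch of aws-chunked style framing, assuming the unsigned variant with an optional trailing header. `EncodeChunked` and its parameters are hypothetical names used only for illustration; they are not the SDK's API and not the code in this PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the SDK): frames `payload` as
// <hex size>\r\n<data>\r\n ... 0\r\n[<trailer>:<value>\r\n]\r\n
std::string EncodeChunked(const std::string& payload, std::size_t chunkSize,
                          const std::string& trailerName = "",
                          const std::string& trailerValue = "") {
  if (chunkSize == 0) chunkSize = 1;  // avoid an infinite loop on a zero chunk size
  std::ostringstream out;
  for (std::size_t pos = 0; pos < payload.size(); pos += chunkSize) {
    const std::size_t n = std::min(chunkSize, payload.size() - pos);
    out << std::hex << n << "\r\n";  // chunk size in hexadecimal
    out.write(payload.data() + pos, static_cast<std::streamsize>(n));
    out << "\r\n";
  }
  out << "0\r\n";  // zero-length chunk marks the end of the body
  if (!trailerName.empty()) {
    out << trailerName << ":" << trailerValue << "\r\n";  // e.g. a checksum trailer
  }
  out << "\r\n";
  return out.str();
}
```

Calling `EncodeChunked("hello world", 4)` yields three data chunks of 4, 4, and 3 bytes followed by the terminating zero-length chunk. Keeping this framing in one place is the motivation for moving it out of the individual HTTP clients.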

Check all that applies:

Did a review by yourself.
Added proper tests to cover this PR. (If tests are not applicable, explain.)
Checked if this PR is a breaking (APIs have been changed) change.
Checked if this PR will not introduce cross-platform inconsistent behavior.
Checked if this PR would require a ReadMe/Wiki update.

Check which platforms you have built SDK on to verify the correctness of this PR.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

src/aws-cpp-sdk-core/include/aws/core/http/interceptor/ChunkingInterceptor.h

src/aws-cpp-sdk-core/source/auth/signer/AWSAuthV4Signer.cpp

src/aws-cpp-sdk-core/source/http/curl/CurlHttpClient.cpp

src/aws-cpp-sdk-core/source/http/interceptor/ChunkingInterceptor.cpp

src/aws-cpp-sdk-core/CMakeLists.txt

src/aws-cpp-sdk-core/include/aws/core/client/ClientConfiguration.h

src/aws-cpp-sdk-core/source/http/interceptor/ChunkingInterceptor.cpp

sbiscigl · 2025-12-04T19:36:47Z

src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

Why are we calling setg in the constructor, specifically with nullptr?

It's to initialize the streambuf to an empty state.

It's also done the same way in the code snippet you showed me:

class chunked_stream : public std::streambuf {
public:
  chunked_stream(std::unique_ptr<std::istream> is) : source(std::move(is)) { setg(nullptr, nullptr, nullptr); }
protected:
  int_type underflow() override {
    if (gptr() == egptr()) {
      if (!trailer.empty()) {
        buffer = trailer[trailer_pos++];
        if (trailer_pos == trailer.size()) {
          trailer.clear();
          trailer_pos = 0;
        }
        setg(&buffer, &buffer, &buffer + 1);
        return traits_type::to_int_type(*gptr());
      }
      int ch = source->get();
      if (ch == traits_type::eof()) {
        return traits_type::eof();
      }
      buffer = static_cast<char>(ch);
      char_count++;
      if (char_count % 10 == 0) {
        trailer = "<trailer>";
        trailer_pos = 0;
      }
      setg(&buffer, &buffer, &buffer + 1);
    }
    return traits_type::to_int_type(*gptr());
  }
private:
  std::unique_ptr<std::istream> source;
  char buffer;
  std::string trailer;
  size_t trailer_pos = 0;
  int char_count = 0;
};

The base constructor already sets them to null, so why are we doing it again?
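The point about the base constructor can be checked with a small probe. This is a sketch for illustration only; `ProbeBuf` and `CheckFreshStreambuf` are hypothetical names, not code from this PR. The C++ standard specifies that the `std::basic_streambuf` default constructor initializes all of its pointer members to null, so an extra `setg(nullptr, nullptr, nullptr)` in a derived constructor changes nothing.

```cpp
#include <cassert>
#include <streambuf>

// Hypothetical probe type: exposes the protected get-area pointers
// of a freshly constructed streambuf.
struct ProbeBuf : std::streambuf {
  bool GetAreaIsNull() const {
    return eback() == nullptr && gptr() == nullptr && egptr() == nullptr;
  }
};

// Returns true when a default-constructed streambuf starts with an empty
// (all-null) get area, without any explicit setg call.
inline bool CheckFreshStreambuf() {
  ProbeBuf probe;
  const bool isNull = probe.GetAreaIsNull();
  assert(isNull);
  return isNull;
}
```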

sbiscigl · 2025-12-04T19:38:51Z

src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

Why is this buffer 8KB? Why is it a C array and not a C++ array? Why not make it configurable?

AwsChunkedStream has a 64KB buffer for double encoding, so why are we creating a second buffer to copy that buffer into?

If the answer is "because I need to use the AwsChunkedStream logic to do what I need it to do", then we need to move the logic from the chunked stream into this class to avoid an extra buffer, or add additional APIs to the existing class.

Ah yes, you're right. This was originally set up just because we needed a streambuf interface. I think extending the AwsChunkedStream API to add a streambuf interface makes sense.

sbiscigl · 2025-12-04T19:41:36Z

src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

Why do you need a default constructor?

aws-sdk-cpp/src/aws-cpp-sdk-core/include/smithy/client/features/ChecksumInterceptor.h

Line 48 in 8a4f5e9

ChecksumInterceptor() = default;

The checksum interceptor also has a default constructor. But I think we actually don't need this default constructor here, since we have the explicit ClientConfiguration constructor.

sbiscigl · 2025-12-04T19:42:51Z

src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

Why do you have this as the last step and not the first?

Okay, moving this up.

sbiscigl · 2025-12-04T19:43:53Z

src/aws-cpp-sdk-core/include/smithy/client/AwsSmithyClientBase.h

Why do you call emplace_back instead of using the initializer list?

Oh yeah, this is redundant logic. Changing it to an initializer list.
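A minimal sketch of the suggested pattern, assuming a simplified interceptor hierarchy. All type names here (`Interceptor`, `ChecksumStep`, `ChunkingStep`, `ClientBase`) are hypothetical stand-ins and not the SDK's classes. The vector is filled once in the member initializer list rather than by `emplace_back` calls in the constructor body.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the SDK's interceptor types.
struct Interceptor {
  virtual ~Interceptor() = default;
  virtual std::string Name() const = 0;
};
struct ChecksumStep : Interceptor {
  std::string Name() const override { return "checksum"; }
};
struct ChunkingStep : Interceptor {
  std::string Name() const override { return "chunking"; }
};

class ClientBase {
 public:
  // The braced list selects the std::initializer_list constructor of the
  // vector; each shared_ptr converts implicitly to shared_ptr<Interceptor>.
  ClientBase()
      : m_interceptors{std::make_shared<ChecksumStep>(),
                       std::make_shared<ChunkingStep>()} {}

  const std::vector<std::shared_ptr<Interceptor>>& Interceptors() const {
    return m_interceptors;
  }

 private:
  std::vector<std::shared_ptr<Interceptor>> m_interceptors;
};
```

The order of the list is the order in which the interceptors are stored, which keeps the pipeline order visible at the point of construction.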

src/aws-cpp-sdk-core/source/http/curl/CurlHttpClient.cpp

src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

sbiscigl · 2025-12-04T20:39:38Z

src/aws-cpp-sdk-core/source/client/ClientConfiguration.cpp

Consider the following code:

auto main() -> int {
  SDKOptions options{};
  options.httpOptions.httpClientFactory_create_fn = []() -> std::shared_ptr<Aws::Http::HttpClientFactory> {
    // The lambda must return the factory; Aws::MakeShared takes an allocation tag first.
    return Aws::MakeShared<MyHttpClientFactory>("MyHttpClientFactory");
  };
  SdkContext context{std::move(options)};
  //.. sdk code
  ClientConfiguration configuration{};
  S3Client client{configuration};
  return 0;
}

Would this operate correctly? Would the custom HTTP client use the correct default?

sbiscigl · 2025-12-11T14:35:19Z

src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

the base constructor already sets them to null, so why are we doing it again?

sbiscigl · 2025-12-11T14:56:13Z

src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

+    Aws::Http::HttpRequest* m_request{nullptr};
+    std::shared_ptr<Aws::IOStream> m_stream;
+    Aws::Utils::Array<char> m_data;
+    Aws::Utils::Array<char> m_buffer{DataBufferSize};


This effectively doubles the size of this class by adding another buffer. It seems the only reason m_buffer exists is so that we can access the underlying data in m_chunkingStream. How can we access the data in m_chunkingStream without creating another buffer? It seems to me that m_chunkingStream should be a different type, perhaps an Aws::Vector or an Aws::Utils::Array<char>, so that you can access the underlying data. You will, however, have to change how you add to and read from it.

I attempted to use accessible_stream_buf, but it was more complicated than initially thought (failing integration test).
So I decided to just change m_chunkingStream to a vector to access the underlying data.

sbiscigl · 2025-12-11T15:03:08Z

src/aws-cpp-sdk-core/include/smithy/client/AwsSmithyClientBase.h

        {
+            // Create modified config for chunking interceptor
+            Aws::Client::ClientConfiguration chunkingConfig(*m_clientConfig);


Why do you construct an object instead of referring to m_clientConfig directly?

I was thinking we should not mutate the user's clientConfig directly.

src/aws-cpp-sdk-core/include/smithy/client/AwsSmithyClientBase.h

src/aws-cpp-sdk-core/source/smithy/client/AwsSmithyClientBase.cpp

Move aws-chunked encoding logic from individual HTTP clients to a centralized ChunkingInterceptor for better separation of concerns.
- Add ChunkingInterceptor to handle aws-chunked encoding at request level
- Remove custom chunking logic from CRT, Curl, and Windows HTTP clients
- Simplify HTTP clients to focus on transport-only responsibilities
- Maintain full backwards compatibility with existing APIs

unit test for chunking stream

added logic to detect custom http client and smart default

src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

sbaluja · 2025-12-12T18:35:26Z

src/aws-cpp-sdk-core/include/smithy/client/AwsSmithyClientBase.h

+              Aws::MakeShared<ChecksumInterceptor>("AwsSmithyClientBase", *m_clientConfig),
+              Aws::MakeShared<features::ChunkingInterceptor>("AwsSmithyClientBase", [this]() {
+                  Aws::Client::ClientConfiguration chunkingConfig = *m_clientConfig;
+                  chunkingConfig.httpClientChunkedMode = m_httpClient->IsDefaultAwsHttpClient() ? m_clientConfig->httpClientChunkedMode : Aws::Client::HttpClientChunkedMode::CLIENT_IMPLEMENTATION;


Is this the behavior we want? If it's our implementation (IsDefaultAwsHttpClient() == true), then don't we want to force Aws::Client::HttpClientChunkedMode::DEFAULT so that the chunking is applied? And if it's not our implementation, then we follow whatever the customer has set, i.e. m_clientConfig->httpClientChunkedMode. It seems to me this condition should be reversed:

chunkingConfig.httpClientChunkedMode = m_httpClient->IsDefaultAwsHttpClient() ? Aws::Client::HttpClientChunkedMode::DEFAULT : m_clientConfig->httpClientChunkedMode;

I think we should default-initialize clientConfig.httpClientChunkedMode = HttpClientChunkedMode::CLIENT_IMPLEMENTATION; so that customers can opt in for their own HttpClient implementation if they want to, while for our implementation we ensure it always uses chunking.
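The selection rule proposed in this comment can be sketched as a small pure function. This is an illustration only: the enum below is a local copy of the two enumerator names mentioned in the review, and `ResolveChunkedMode` is a hypothetical helper rather than an SDK function.

```cpp
// Local stand-in for Aws::Client::HttpClientChunkedMode (assumption: only
// the two enumerators named in the review are modeled here).
enum class HttpClientChunkedMode { DEFAULT, CLIENT_IMPLEMENTATION };

// Hypothetical helper: the SDK's own HTTP clients always get SDK-side
// chunking (DEFAULT); a custom client keeps whatever the customer configured.
constexpr HttpClientChunkedMode ResolveChunkedMode(
    bool isDefaultAwsHttpClient, HttpClientChunkedMode configured) {
  return isDefaultAwsHttpClient ? HttpClientChunkedMode::DEFAULT : configured;
}

// With a CLIENT_IMPLEMENTATION default, a custom client is left untouched
// unless the customer explicitly opts in by setting DEFAULT.
static_assert(ResolveChunkedMode(true, HttpClientChunkedMode::CLIENT_IMPLEMENTATION) ==
                  HttpClientChunkedMode::DEFAULT,
              "SDK clients always use interceptor chunking");
static_assert(ResolveChunkedMode(false, HttpClientChunkedMode::CLIENT_IMPLEMENTATION) ==
                  HttpClientChunkedMode::CLIENT_IMPLEMENTATION,
              "custom clients keep their configured mode");
```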

sbaluja · 2025-12-12T19:38:43Z

src/aws-cpp-sdk-core/source/client/ClientConfiguration.cpp

    clientConfig.httpLibOverride = Aws::Http::TransferLibType::DEFAULT_CLIENT;
+
+    // Users can explicitly set CLIENT_IMPLEMENTATION if their custom client handles chunking
+    clientConfig.httpClientChunkedMode = HttpClientChunkedMode::DEFAULT;


This needs to be

clientConfig.httpClientChunkedMode = HttpClientChunkedMode::CLIENT_IMPLEMENTATION;

We don't want to change existing client's HttpClient behavior unless they opt-in to our chunking logic.

sbiscigl

Alright, it's in a good place now. Please fix the constructor and do some memory tests, and I'm good to ship.

sbiscigl · 2025-12-12T20:29:57Z

src/aws-cpp-sdk-core/include/smithy/client/AwsSmithyClientBase.h

+          m_interceptors({
+              Aws::MakeShared<ChecksumInterceptor>("AwsSmithyClientBase", *m_clientConfig),
+              Aws::MakeShared<features::ChunkingInterceptor>("AwsSmithyClientBase", [this]() {
+                  Aws::Client::ClientConfiguration chunkingConfig = *m_clientConfig;


OK, all of this exists because of my comment that we should be passing the whole client configuration. I'll admit I was wrong; let's use the enum value instead. That makes this altogether better: you don't need any of this, and a simple constructor call with

m_httpClient->IsDefaultAwsHttpClient() ? Aws::Client::HttpClientChunkedMode::DEFAULT : m_clientConfig->httpClientChunkedMode

will suffice. Let's go back to that; it's simpler and makes this better. Sorry for the shuffling around.

kai-ion force-pushed the chunky branch 2 times, most recently from 083ea7a to a6b040e Compare December 2, 2025 19:54

sbiscigl reviewed Dec 3, 2025


kai-ion force-pushed the chunky branch from a6b040e to 324bd29 Compare December 3, 2025 16:27

kai-ion marked this pull request as draft December 3, 2025 21:25

kai-ion marked this pull request as ready for review December 3, 2025 21:32

sbiscigl reviewed Dec 4, 2025


kai-ion force-pushed the chunky branch from 4a758e6 to 4dba26f Compare December 8, 2025 20:14

kai-ion marked this pull request as draft December 9, 2025 03:57

kai-ion marked this pull request as ready for review December 9, 2025 15:40

kai-ion marked this pull request as draft December 9, 2025 18:12

kai-ion force-pushed the chunky branch from 92b9c5b to 52b64a0 Compare December 9, 2025 18:17

kai-ion marked this pull request as ready for review December 9, 2025 21:22

kai-ion force-pushed the chunky branch from 4c65270 to 86518de Compare December 10, 2025 18:50

sbiscigl reviewed Dec 11, 2025


kai-ion added 2 commits December 11, 2025 11:03

added testing for httpclient override

3834efd

kai-ion force-pushed the chunky branch from 59966fb to 3834efd Compare December 11, 2025 16:07

kai-ion added 2 commits December 11, 2025 12:48

changing the streambuff type to a vector

5bb0545

changing the streambuff type to a vector

5c16cc1

sbiscigl reviewed Dec 12, 2025


src/aws-cpp-sdk-core/include/smithy/client/features/ChunkingInterceptor.h

adding a way to remove unbound growth

631c894

sbaluja reviewed Dec 12, 2025


reversing logic to check for chunked mode

a59d0a8

sbaluja reviewed Dec 12, 2025


reversing logic to check for chunked mode

e8d12d2

sbiscigl approved these changes Dec 12, 2025


kai-ion added 3 commits December 12, 2025 15:54

changing it to just passing in an enum

aba0f89

changing chunking interceptor to use array instead of vector

0c72d39

changing chunking interceptor to use array instead of vector

e7cb1f6


3 participants
