Current chunked encoding implementation triggers extensive copying #378

When `hackney` fetches a chunked response, it uses the very same binary both to [accumulate incoming data](https://github.com/benoitc/hackney/blob/master/src/hackney_http.erl#L138) and to [match for a chunk size](https://github.com/benoitc/hackney/blob/master/src/hackney_http.erl#L481). Appending to a binary and then matching on it is a [clearly documented anti-pattern](http://erlang.org/doc/efficiency_guide/binaryhandling.html#id66216): the match defeats the append optimization, so the next append copies the whole accumulated binary, which causes extensive copying and produces enormous amounts of garbage.

In this particular case, for chunks that are much bigger than the MTU (and therefore bigger than the binaries delivered by `gen_tcp`), this copy may happen many times per chunk. It caused our application to consume tens of gigabytes of RAM while fetching mere hundreds of megabytes from an upstream service. Besides, this behavior is likely the cause of #77.

I suggest introducing an additional "inside chunk" state, so that the parser does not have to peek inside the data binary every time new data arrives. My initial intention was to send a pull request, but I wasn't able to find a test that covers that part of the code, so I decided to simply file this issue.
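
To make the suggestion concrete, here is a minimal sketch of what such a state machine could look like. It is not hackney's code: the module name `chunk_sketch`, the functions `new/0` and `feed/2`, and the state tuples are all hypothetical, and the sketch deliberately ignores trailers and does not validate the CRLF after each chunk. The point is only that, once the size line has been parsed, incoming packets are passed through as sub-binaries instead of being appended to an accumulator and re-matched.

```erlang
-module(chunk_sketch).
-export([new/0, feed/2]).

%% Hypothetical parser states (not hackney's actual record):
%%   {size_line, Acc}  - collecting the hex chunk-size line; Acc stays small
%%   {in_chunk, Left}  - Left payload bytes of the current chunk still expected
%%   {chunk_crlf, N}   - N bytes of the CRLF after the payload still to skip
%%   done              - the zero-size chunk was seen (trailers are ignored)

new() -> {size_line, <<>>}.

%% feed(Data, State) -> {Pieces, NewState}
%% Pieces is the list of payload binaries found in Data, in order.
feed(Data, State) -> feed(Data, State, []).

feed(<<>>, State, Out) ->
    {lists:reverse(Out), State};
feed(_Data, done, Out) ->
    {lists:reverse(Out), done};
feed(Data, {size_line, Acc}, Out) ->
    %% An append happens only when a size line straddles two packets.
    Buf = case Acc of
              <<>> -> Data;
              _ -> <<Acc/binary, Data/binary>>
          end,
    case binary:split(Buf, <<"\r\n">>) of
        [Line, Rest] ->
            %% Drop any chunk extension after ';'. A malformed size crashes.
            [Hex | _] = binary:split(Line, <<";">>),
            case binary_to_integer(Hex, 16) of
                0 -> {lists:reverse(Out), done};
                Size -> feed(Rest, {in_chunk, Size}, Out)
            end;
        [_] ->
            {lists:reverse(Out), {size_line, Buf}}
    end;
feed(Data, {in_chunk, Left}, Out) when byte_size(Data) < Left ->
    %% The whole packet is payload: pass it through, only count bytes.
    {lists:reverse([Data | Out]), {in_chunk, Left - byte_size(Data)}};
feed(Data, {in_chunk, Left}, Out) ->
    <<Payload:Left/binary, Rest/binary>> = Data,
    feed(Rest, {chunk_crlf, 2}, [Payload | Out]);
feed(Data, {chunk_crlf, N}, Out) when byte_size(Data) < N ->
    {lists:reverse(Out), {chunk_crlf, N - byte_size(Data)}};
feed(Data, {chunk_crlf, N}, Out) ->
    %% Skips the terminator without checking that it really is CRLF.
    <<_:N/binary, Rest/binary>> = Data,
    feed(Rest, {size_line, <<>>}, Out).
```

A caller would invoke `chunk_sketch:feed/2` once per packet received from the socket and thread the returned state into the next call. While in the `in_chunk` state the work per packet is a size comparison and a subtraction, regardless of how large the chunk is.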

For now, we have worked around the issue by switching to `ibrowse`, which does not exhibit the described problem.
