
HDDS-15179. Support Chunked Transfer without Content Length #10196

Open

peterxcli wants to merge 4 commits into apache:master from peterxcli:fix/s3-chunked-transfer-without-content-length

Conversation


@peterxcli peterxcli commented May 5, 2026

What changes were proposed in this pull request?

Currently, a chunked transfer fails if the content length is not specified:

def _ev_add_te_header(request, **kwargs):
    request.headers.add_header('Transfer-Encoding', 'chunked')

def test_object_write_with_chunked_transfer_encoding():
    bucket_name = get_new_bucket()
    client = get_client()

    client.meta.events.register_first('before-sign.*.*', _ev_add_te_header)
    response = client.put_object(Bucket=bucket_name, Key='foo', Body='bar')

    assert response['ResponseMetadata']['HTTPStatusCode'] == 200

with the following error:

botocore.exceptions.ClientError: An error occurred (XAmzContentSHA256Mismatch) when calling the PutObject operation: The provided 'x-amz-content-sha256' header does not match the computed hash.

https://github.com/ceph/s3-tests/blob/fb8b73092bb1dd8db829f1205a9e52e73bf9a232/s3tests/functional/test_s3.py#L1589-L1599

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15179

How was this patch tested?

(Please explain how this patch was tested. Ex: unit tests, manual tests, workflow run on the fork git repo.)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this.)

…quests

Signed-off-by: peterxcli <peterxcli@gmail.com>
@peterxcli peterxcli requested review from ChenSammi and adoroszlai and removed request for adoroszlai May 5, 2026 20:41
@peterxcli

I notice the JIRA ticket should be a subtask under https://issues.apache.org/jira/browse/HDDS-8423; is there any way to fix that? I couldn't find the move option.

@peterxcli peterxcli self-assigned this May 5, 2026
@adoroszlai adoroszlai changed the title from "HDDS-15179. [S3 Compatibility] Support Chunked Transfer without Content Length Set" to "HDDS-15179. Support Chunked Transfer without Content Length" May 6, 2026

@adoroszlai adoroszlai left a comment


Thanks @peterxcli for the patch.

@peterxcli peterxcli marked this pull request as draft May 6, 2026 07:44
Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
@peterxcli peterxcli marked this pull request as ready for review May 6, 2026 07:45
@peterxcli peterxcli requested a review from adoroszlai May 6, 2026 07:46

@chungen0126 chungen0126 left a comment


Thanks @peterxcli for the patch.

Based on my understanding of S3, requests fall into one of two cases: they either include a Content-Length header or an x-amz-decoded-content-length header. That being said, I totally understand that this proposed change adds flexibility to the upload process, and for me it is ok.

OzoneBucket bucket = context.getBucket();
final String lengthHeader = getHeaders().getHeaderString(HttpHeaders.CONTENT_LENGTH);
long length = lengthHeader != null ? Long.parseLong(lengthHeader) : 0;
if (lengthHeader == null && body != null) {

Regarding S3, we should also consider the scenario where the Content-Length header is absent, but the x-amz-decoded-content-length header is provided. In this case, since we already know the exact size of the payload, we wouldn't need to manually calculate the upload length. See https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html

For all requests, you must include the x-amz-decoded-content-length header, specifying the size of the object in bytes.
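
For illustration, here is a minimal sketch (not the actual patch; the helper name and the -1 "unknown" convention are assumptions) of how the effective upload length could be resolved from the two headers, so that spooling is only needed when neither is present:

// Hedged sketch, not the real ObjectEndpoint logic: resolve the effective
// upload length from the two headers an S3 client may send. Returns -1 when
// neither header is present, i.e. the length must be derived from the body.
static long resolveUploadLength(String contentLengthHeader, String decodedContentLengthHeader) {
  if (contentLengthHeader != null) {
    return Long.parseLong(contentLengthHeader);
  }
  if (decodedContentLengthHeader != null) {
    // aws-chunked uploads carry the payload size in x-amz-decoded-content-length
    return Long.parseLong(decodedContentLengthHeader);
  }
  return -1;
}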

peterxcli added 2 commits May 6, 2026 17:43
Signed-off-by: peterxcli <peterxcli@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
@peterxcli peterxcli requested a review from chungen0126 May 6, 2026 11:58

@chungen0126 chungen0126 left a comment


+1 LGTM. Let's wait for the CI to pass.

Comment on lines +214 to +219
if (lengthHeader == null && body != null && !hasMultiChunksUpload) {
  spooledBody = new FileBackedOutputStream(32);
  length = IOUtils.copyLarge(body, spooledBody, new byte[getIOBufferSize(0)]);
  body = spooledBody.asByteSource().openStream();
  hasCalculatedLength = true;
}
@ivandika3 ivandika3 May 6, 2026


Is this going to copy the whole object into a temporary file on the S3G machine? In other words, is S3G required to allocate disk space for this length calculation?


If so, this might be problematic. First, many clusters (including mine) currently assume that S3G has no variable disk requirement (this change would then require some kind of persistent volume for S3G deployments in K8s). Additionally, writing the whole object to a file and then re-reading it might incur additional disk IO.
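
For context, an illustrative standalone snippet (not part of the patch) of what the Guava FileBackedOutputStream used above does: bytes beyond the 32-byte threshold are spilled to a temporary file on the local disk of the process, here the S3 Gateway, and computing the length requires a second read of that file.

import com.google.common.io.FileBackedOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

public class SpoolDemo {
  public static void main(String[] args) throws Exception {
    // A 1 MiB body far exceeds the 32-byte threshold, so it is spooled to a
    // temp file on local disk rather than kept in memory.
    FileBackedOutputStream spool = new FileBackedOutputStream(32);
    spool.write(new byte[1024 * 1024]);
    // Re-reading the spooled data to compute the length is a second disk pass.
    try (InputStream in = spool.asByteSource().openStream()) {
      long length = in.transferTo(OutputStream.nullOutputStream());
      System.out.println("computed length = " + length);
    }
    spool.reset(); // deletes the backing temp file, if one was created
  }
}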

if (canCreateDirectory &&
    (length == 0 || hasAmzDecodedLengthZero) &&
    hasKnownZeroLength &&
    StringUtils.endsWith(keyPath, "/")

From what I see, the length is only needed to check whether it is 0 or not. In that case, is it possible to simply check whether the first byte exists?
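
A possible sketch of that suggestion (an assumption, not what the patch does; the class and method names are made up for illustration): wrap the body in a PushbackInputStream, peek one byte to decide whether the payload is empty, and push it back so the normal upload path still reads the full stream.

import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

final class BodyProbe {
  private BodyProbe() { }

  // Wrap the request body so that a single byte can be peeked and pushed back.
  static PushbackInputStream wrap(InputStream body) {
    return new PushbackInputStream(body, 1);
  }

  // True if the body has no bytes at all (the "create directory" case);
  // otherwise the peeked byte is returned to the stream for the real upload.
  static boolean isEmpty(PushbackInputStream body) throws IOException {
    int first = body.read();
    if (first == -1) {
      return true;
    }
    body.unread(first);
    return false;
  }
}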


@peterxcli

Thanks for the review, let me think about this.
