Skip to content

Commit

Permalink
Fix small and empty HTTP response handling (#195)
Browse files Browse the repository at this point in the history
If a response is very small (or the HTTP range requested "rolls off" the end
of the resource body) S3 will respond with the full resource and with a 200 status code.
If a resource is empty no Range support will be available (and there is no good way to
address a 0-size resource using the HTTP Ranges specification).

To handle this better we will now allow small 200 responses to be treated as normal responses,
as long as they do not exceed the range size we request from the server.
  • Loading branch information
julik authored Sep 10, 2021
1 parent 927d056 commit c665272
Show file tree
Hide file tree
Showing 4 changed files with 46 additions and 8 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,6 @@
## 0.29.1
* Fix handling of 200 responses with `parse_http` as well as handling of very small responses which do not need range access

## 0.29.0
* Add option `headers:` to `FormatParser.parse_http`

Expand Down
2 changes: 1 addition & 1 deletion lib/format_parser/version.rb
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
module FormatParser
VERSION = '0.29.0'
VERSION = '0.29.1'
end
24 changes: 19 additions & 5 deletions lib/remote_io.rb
Original file line number Diff line number Diff line change
Expand Up @@ -89,18 +89,32 @@ def request_range(range)
response = conn.get(@uri, nil, range: 'bytes=%d-%d' % [range.begin, range.end])

case response.status
when 200, 206
when 200
# S3 returns 200 when you request a Range that is fully satisfied by the entire object,
# we take that into account here. Also, for very tiny responses (and also for empty responses)
# the responses are going to be 200 which does not mean we cannot proceed
# To have a good check for both of these conditions we need to know whether the ranges overlap fully
response_size = response.body.bytesize
requested_range_size = range.end - range.begin + 1
if response_size > requested_range_size
error_message = [
"We requested #{requested_range_size} bytes, but the server sent us more",
"(#{response_size} bytes) - it likely has no `Range:` support.",
"The error occurred when talking to #{@uri})"
]
raise InvalidRequest.new(response.status, error_message.join("\n"))
end
[response_size, response.body]
when 206
# Figure out of the server supports content ranges, if it doesn't we have no
# business working with that server
range_header = response.headers['Content-Range']
raise InvalidRequest.new(response.status, "No range support at #{@uri}") unless range_header
raise InvalidRequest.new(response.status, "The server replied with 206 status but no Content-Range at #{@uri}") unless range_header

# "Content-Range: bytes 0-0/307404381" is how the response header is structured
size = range_header[/\/(\d+)$/, 1].to_i

# S3 returns 200 when you request a Range that is fully satisfied by the entire object,
# we take that into account here. For other servers, 206 is the expected response code.
# Also, if we request a _larger_ range than what can be satisfied by the server,
# If we request a _larger_ range than what can be satisfied by the server,
# the response is going to only contain what _can_ be sent and the status is also going
# to be 206
return [size, response.body]
Expand Down
25 changes: 23 additions & 2 deletions spec/remote_fetching_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -19,18 +19,27 @@
res.status = 302
res.header['Location'] = req.path.sub('/redirect', '')
end
@server.mount_proc '/empty' do |_req, res|
res.status = 200
res.body = ''
end
@server.mount_proc '/tiny' do |_req, res|
res.status = 200
res.body = File.read(fixtures_dir + '/test.gif')
end

trap('INT') { @server.stop }
@server_thread = Thread.new { @server.start }
end

it '#parse_http is called without any option' do
it 'works with .parse_http called without any options' do
result = FormatParser.parse_http('http://localhost:9399/PNG/anim.png')

expect(result.format).to eq(:png)
expect(result.height_px).to eq(180)
end

it '#parse_http is called with hash options' do
it 'works with .parse_http called with additional options' do
fake_result = double(nature: :audio, format: :aiff)
expect_any_instance_of(FormatParser::AIFFParser).to receive(:call).and_return(fake_result)
results = FormatParser.parse_http('http://localhost:9399/PNG/anim.png', results: :all)
Expand All @@ -39,6 +48,18 @@
expect(results).to include(fake_result)
end

it 'is able to cope with a 0-size resource which does not provide Content-Range' do
file_information = FormatParser.parse_http('http://localhost:9399/empty')

expect(file_information).to be_nil
end

it 'is able to cope with a tiny resource which fits into the first requested range completely' do
file_information = FormatParser.parse_http('http://localhost:9399/tiny')
expect(file_information).not_to be_nil
expect(file_information.nature).to eq(:image)
end

it 'parses the animated PNG over HTTP' do
file_information = FormatParser.parse_http('http://localhost:9399/PNG/anim.png')
expect(file_information).not_to be_nil
Expand Down

0 comments on commit c665272

Please sign in to comment.