Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support library for returning Dropshot range requests #6963

Merged
merged 12 commits into from
Nov 21, 2024
Merged

Conversation

smklein
Copy link
Collaborator

@smklein smklein commented Oct 30, 2024

This library is used as a part of #1599 , I'm pulling it out of #6782 to help make the diff smaller.

This PR adds a range-requests crate for dropshot-based range requests.

Heavily inspired by @jclulow 's work in https://github.com/oxidecomputer/buildomat/blob/main/download/src/lib.rs

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I had some fairly nitpicky feedback about cases where I think we ought to be returning 419 status codes, and some equally nitpicky suggestions for allocation and header-value-validation golfing. None of this matters all that much, but I figured it was worth mentioning...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, any particular motivation for including this crate in the omicron repo, rather than in the dropshot repo or in its own repo? It seems general-purpose enough that it might eventually be useful in other projects, and there are no Omicron-specific dependencies beyond omicron-workspace-hack (which I assume could be dropped if this library lived elsewhere)...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally, I want to leave this door open, but I figured I'd just start locally to iterate quickly. It's already a separate crate, so if we keep the number of dependencies low, it should be easy to move out.

I haven't actually used this much from the client side yet, but I'd like to also validate that half works before exporting it out of Omicron, if that's cool?

Comment on lines 19 to 20
#[error("Failed to parse range")]
Parse(http_range::HttpRangeParseError),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should this format the underlying parse error as well, rather than potentially discarding it? note that the inner error it isn't annotated with #[source], so it won't be part of the source chain when formatting this error.

perhaps this:

Suggested change
#[error("Failed to parse range")]
Parse(http_range::HttpRangeParseError),
#[error("Failed to parse range: {0}")]
Parse(http_range::HttpRangeParseError),

and/or this:

Suggested change
#[error("Failed to parse range")]
Parse(http_range::HttpRangeParseError),
#[error("Failed to parse range")]
Parse(#[source] http_range::HttpRangeParseError),

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated -- can't use source, because tragically HttpRangeParseError doesn't implement error, but I'll show the debug representation at least.

Comment on lines 29 to 31
Error::MultipleRangesUnsupported | Error::Parse(_) => {
HttpError::for_bad_request(None, err.to_string())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the semantics of 419 Range Not Satisfiable correctly, I think Error::Parse(HttpRangeParseError::NoOverlap) should return a Range Not Satisfiable error, given that it indicates that the first byte position is out of range.

I'm not sure whether HttpRangeParseError::InvalidRange indicates a condition that should get a 419 or a 400. The documentation doesn't make it clear what "invalid range" means --- if it indicates that a syntactically valid range was parsed but the range didn't make sense (e.g. start greater than end), that should probably also be a 419?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to 416 for these cases!

fn make_response_common(
range: Option<SingleRange>,
file_length: u64,
content_type: Option<&str>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, take it or leave it: we might want to consider taking the content-type header as an http::HeaderValue (or impl Into<http::HeaderValue>, rather than as a &str. i believe using an &str as a header value will always allocate, but a http::HeaderValue can also be constructed from an &'static str without allocating, using HeaderValue::from_static. If we took that here, we could avoid the allocation in the &'static str case...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to impl Into

content_type: Option<&str>,
) -> hyper::http::response::Builder {
let mut res = Response::builder();
res = res.header(ACCEPT_RANGES, "bytes");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, take it or leave it: as i mentioned above, we might consider using a const HeaderValue for the value of our accept-ranges header?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 187 to 189
.unwrap()
.checked_sub(1)
.unwrap()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicky, goofy: perhaps these unwraps ought to be expects? i think both are more or less guaranteed unreachable, but it might be nice to have the expects document why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on making these expects

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!


/// Generate the Content-Range header for inclusion in a HTTP 206 partial
/// content response using this range.
pub fn to_content_range(&self) -> String {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, take it or leave it: it might be nice if this function returned an http::HeaderValue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


/// Generate a Range header for inclusion in another HTTP request; e.g.,
/// to a backend object store.
pub fn to_range(&self) -> String {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, this could perhaps return a HeaderValue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 209 to 210
pub fn content_length(&self) -> u64 {
assert!(self.range.length > 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this should return a NonZeroU64, since we promise that it is one?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

use tokio_util::io::ReaderStream;

#[test]
fn parse_range_valid() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a blocker, but this feels like it could potentially be a nice use-case for property tests eventually?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh I checked to see if omicron pulled in quick check so that I could suggest the same thing and then I realized the crate I actually wanted to suggest was proptest.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and added a couple proptest tests -- this is my first real usage of it, so I welcome extra scrutiny here.

@papertigers
Copy link
Contributor

I think this largely looks good to me. I came here to say that this would be useful as a a crate outside of omicron so that it could be used elsewhere.

There are no consumers of this crate yet as of this PR - do you plan on merging it ahead of the other PR or roughly at the same time as that in progress PR?

),
Error::Parse(_) => HttpError::for_client_error(
None,
http::StatusCode::RANGE_NOT_SATISFIABLE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to continue nitpicking about this, but I thought that only parseable-but-nonsensical range requests were supposed to return this status, and one that's syntactically malformed should get Bad Request instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good feedback, I'll change this case to "bad request".

| Error::RangeOverflow
| Error::EmptyRange => HttpError::for_client_error(
None,
http::StatusCode::RANGE_NOT_SATISFIABLE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per MDN, when returning a 419, the server is also supposed to add a Content-Range header indicating the actual length of the resource:

The 416 response message should contain a Content-Range indicating an unsatisfied range (that is a '*') followed by a '/' and the current length of the resource, e.g., Content-Range: bytes */12777

(see: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/416)

I don't think it's practical to actually do that in this PR, since Dropshot doesn't presently have a nice way to add headers to error responses (see oxidecomputer/dropshot#801, oxidecomputer/dropshot#39). But, it might be worth adding a comment or something to remember to return such a header if/when Dropshot adds a way to do so. We might also want to go ahead and add a field for the actual length of the resource to these error variants, so that it can be used to generate the response headers later — for now, we could also include the actual length in the formatted representation of the errors?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think I might go take a crack at implementing oxidecomputer/dropshot#39 in the near future, because I've also needed it for very different reasons and it seems like it could be fun to do...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're doing this in this PR -- see bad_range_response, where we do attach these headers. If we try to parse a SingleRange from a PotentialRange and fail with this error, we'll attach the necessary headers.

I refactored this in 6c2562d to make this harder to mis-use - I've removed the conversion to HttpError that automatically happens, and ensured that the headers get supplied in-line.

That said, we'd still greatly benefit from oxidecomputer/dropshot#39. I'm returning a Response<Body> directly, just to be able to supply these headers, and I'd really prefer not to do that.

@smklein
Copy link
Collaborator Author

smklein commented Nov 15, 2024

Heyo, so my plan was to merge this before #6782, in an attempt to make that PR smaller. Would y'all be okay taking another look here?

Just to make sure we're on the same page:

  • I'm totally on-board with moving this out of Omicron -- just wanted to keep it in here for the short-term to iterate quickly
  • I'm also on-board with refactoring this around could use customisation of error presentation dropshot#39 , to return something more strongly-typed than simply Response<Body> -- but I also don't think that needs to hold-up subsequent integration

Copy link
Contributor

@papertigers papertigers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with this as is unless @hawkw had feedback that she wanted addressed. Also I think leaving this in omicron for now while it gets some milage before pulling it out into a standalone crate is a good idea!

+1

@smklein smklein merged commit bb3536f into main Nov 21, 2024
18 checks passed
@smklein smklein deleted the range-requests branch November 21, 2024 20:09
@iliana iliana mentioned this pull request Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants