Allow URLs with non-ASCII characters to be parsed #1063
Closed
As described in #973, glTFs that include URIs with non-ASCII characters don't currently work with Cesium Native. That's because the library we use for URI parsing, uriparser, follows the RFC 3986 standard. According to RFC 3986, only ASCII characters are allowed in URLs, and all other characters must be percent-encoded. Unicode support is specified separately, in RFC 3987 and now in the WHATWG URL specification, which seems to be the modern standard that browsers aim to support. The glTF spec allows Unicode characters in URIs "as-is," meaning a strictly RFC 3986-compliant parser won't do the job.

One option would be to substitute a different, WHATWG-compliant parser in its place, but as I described in this comment, this introduces more problems than it solves. Instead, the solution I went with was to encode all non-ASCII characters in the string before passing it to uriparser, then decode them again before returning from the method. This seems to work flawlessly: uriparser gets the RFC 3986-compliant URLs it expects, and the user gets the WHATWG-compliant URLs they expect.

Unfortunately, this solution does introduce an extra layer of string copies into URL parsing. This isn't ideal, but considering URLs are usually fairly short, I think it's a worthwhile tradeoff to gain this compliance with the glTF spec.
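
For readers who want to see the shape of the encode/decode round trip, here is a minimal sketch. It is not the code from this PR: the helper names `encodeNonAscii`, `decodeNonAscii`, and `hexValue` are hypothetical, and the sketch assumes the input is a UTF-8 `std::string`. The decode step only reverses escapes whose value is 0x80 or higher, so escapes such as `%20` pass through untouched. One known limitation of this simplified version is that it would also decode high-byte escapes that were already present in the original input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: percent-encode every byte outside the ASCII range
// (0x80 and above), leaving ASCII bytes, including existing '%' escapes, as-is.
std::string encodeNonAscii(const std::string& url) {
  static const char hexDigits[] = "0123456789ABCDEF";
  std::string result;
  result.reserve(url.size());
  for (unsigned char c : url) {
    if (c < 0x80) {
      result.push_back(static_cast<char>(c));
    } else {
      result.push_back('%');
      result.push_back(hexDigits[c >> 4]);
      result.push_back(hexDigits[c & 0x0F]);
    }
  }
  return result;
}

// Hypothetical helper: value of a hex digit, or -1 if the character is not one.
int hexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Hypothetical helper: decode only those %XX escapes that represent a
// non-ASCII byte (XX >= 0x80). ASCII escapes such as %20 are left alone.
std::string decodeNonAscii(const std::string& url) {
  std::string result;
  result.reserve(url.size());
  for (std::size_t i = 0; i < url.size(); ++i) {
    if (url[i] == '%' && i + 2 < url.size()) {
      const int hi = hexValue(url[i + 1]);
      const int lo = hexValue(url[i + 2]);
      // hi >= 8 means the decoded byte is 0x80 or above.
      if (hi >= 8 && lo >= 0) {
        result.push_back(static_cast<char>((hi << 4) | lo));
        i += 2;
        continue;
      }
    }
    result.push_back(url[i]);
  }
  return result;
}
```

In this sketch, the encoded string would be what gets handed to uriparser, and `decodeNonAscii` would be applied to the result before returning it to the caller. Each helper builds a new string, which is the extra layer of copies mentioned above.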