Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Types: unreserve more characters #18

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

RileyApeldoorn
Copy link

adds all missing characters specified in the sub-delims ABNF rule in RFC 3986 to the unreserved lists in Network.HTTP.Types.URI with two exceptions: ; and + are excluded from the query string list because the library already counts ; as query key-value pair delimiters (so adding them would break compatibility and does indeed cause tests to fail), and + because they have special handling in parseQueryReplacePlus.

@Vlix
Copy link
Owner

Vlix commented Dec 22, 2023

Thank you for the PR, I hope to take some time this weekend to check the correctness of these changes and see what the best course of action is in releasing, since it feels like this change might break people's previous expectations of what this code does.

@Vlix
Copy link
Owner

Vlix commented Dec 23, 2023

Seeing as this will produce different output of a function that is already used in the wild, I'm hesitant to add this change "in-place". It would be better to add urlEncodePath and urlEncodeQuery functions, so that previous code keeps working as expected and to slowly phase out/deprecate the urlEncode :: Bool -> ... function.
(I already hate the Bool arguments, so this might be a good way to slowly change the API)

The problem is that urlEncodeBuilder is used by both renderQueryBuilder and encodePathSegment... so the options are:

  1. we don't change those functions ever and make new ones
  2. we have those functions produce different output than in http-types <= 0.12.4 (at which point, why bother with phasing out urlEncode)

I'm already in the process of adding regression tests and other missing tests, so I'll keep this PR open until I'm happy with the coverage and then we can see if this would break anything after those tests are merged in.

Could you:

  • adjust the doctest to not fail on:
    ./Network/HTTP/Types/URI.hs:529: failure in expression `renderQueryPartialEscape True [("a", [QN "x:z + ", QE (encodeUtf8 "They said: \"שלום\"")])]'
    expected: "?a=x:z + They%20said%3A%20%22%D7%A9%D7%9C%D7%95%D7%9D%22"
     but got: "?a=x:z + They%20said:%20%22%D7%A9%D7%9C%D7%95%D7%9D%22"
    
    So that I can see that the other tests aren't failing too?
  • add url(En|De)code roundtrip (property) tests for all combinations of True and False, to show more obviously that this doesn't break anything.

Also, RFC 3986 states that Query and Segment parts can include non-percent-encoded / and ? as well.

@Vlix
Copy link
Owner

Vlix commented Jan 7, 2024

I've added a bunch of tests and also regression tests, so this change will most certainly fail on the golden test I added. Please make sure to run the test and mv test/.golden/urlEncode-*/actual test/.golden/urlEncode-*/golden before updating this PR. 👍

@RileyApeldoorn
Copy link
Author

adding / to the list of unreserved characters for query strings is no biggie, but adding the ? causes at least the following tests to fail sporadically:

  • parseQuery, is parsed the same regardless of question mark
  • encode/decode query, add ? in front of Query if and only if necessary
  • encode/decode query, is identity to convert to and from Simple

i have no idea how to remedy this situation, though.


also, the roundtrip test i added for urlDecode True and urlEncode False is always falsified, which makes a certain amount of sense for that combination. perhaps it should be converted into a test that is expected to fail, what do you think?

@Vlix
Copy link
Owner

Vlix commented Feb 2, 2024

The failing tests when not percent-encoding ? is because it will sometimes add the ? to the front, and some tests want to make sure the function itself doesn't add a ? to the front.
The failing ones are all property tests, not unit tests, right?

I'll think about the urlDecode True <-> urlEncode False case, I'll come back to you on that at a later date.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants