Only remove the unescape prefix character when parsing #131

Merged
merged 1 commit into from
Jan 2, 2025
4 changes: 2 additions & 2 deletions lib/csv/decoding/decoder.ex
@@ -22,7 +22,7 @@ defmodule CSV.Decoding.Decoder do
Must be a codepoint (syntax: ? + (your separator)).
* `:escape_character` – The escape character token to use, defaults to `?"`.
Must be a codepoint (syntax: ? + (your escape character)).
- * `:escape_max_lines` – The number of lines an escape sequence is allowed
+ * `:escape_max_lines` – The number of lines an escape sequence is allowed
to span, defaults to 10.
* `:field_transform` – A function with arity 1 that will get called with
each field and can apply transformations. Defaults to identity function.
@@ -37,7 +37,7 @@ defmodule CSV.Decoding.Decoder do
* `:validate_row_length` – When set to `true`, will take the first row of
the csv or its headers and validate that following rows are of the same
length. Defaults to `false`.
- * `:escape_formulas` – When set to `true`, will remove formula escaping
+ * `:unescape_formulas` – When set to `true`, will remove formula escaping
inserted to prevent [CSV Injection](https://owasp.org/www-community/attacks/CSV_Injection).

## Examples
13 changes: 8 additions & 5 deletions lib/csv/decoding/parser.ex
@@ -19,11 +19,11 @@ defmodule CSV.Decoding.Parser do

* `:separator` – The separator token to use, defaults to `?,`.
Must be a codepoint (syntax: ? + (your separator)).
- * `:field_transform` – A function with arity 1 that will get called with
+ * `:field_transform` – A function with arity 1 that will get called with
each field and can apply transformations. Defaults to identity function.
- This function will get called for every field and therefore should return
+ This function will get called for every field and therefore should return
quickly.
- * `:unescape_formulas` – When set to `true`, will remove formula escaping
+ * `:unescape_formulas` – When set to `true`, will remove formula escaping
inserted to prevent [CSV Injection](https://owasp.org/www-community/attacks/CSV_Injection).

## Examples
@@ -88,11 +88,14 @@ defmodule CSV.Decoding.Parser do
unescape_formulas = options |> Keyword.get(:unescape_formulas, @unescape_formulas)

if unescape_formulas do
- formula_pattern = :binary.compile_pattern(@escape_formula_start)
+ formula_pattern =
+   @escape_formula_start
+   |> Enum.map(fn char -> @escape_formula_prefix <> char end)
+   |> :binary.compile_pattern()

fn field ->
case :binary.match(field, formula_pattern) do
- {1, _} -> binary_part(field, 1, byte_size(field) - 1)
+ {0, _} -> binary_part(field, 1, byte_size(field) - 1)
_ -> field
end
end
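
The effect of this hunk can be reproduced outside the library. The following is a minimal sketch, assuming the defaults shown in `lib/csv/defaults.ex`; the module `UnescapeSketch` and its function names are hypothetical and exist only for illustration. It contrasts the previous match (any formula character at byte offset 1) with the new one (prefix plus formula character at byte offset 0).

```elixir
# Standalone sketch. UnescapeSketch, unescape_old/1 and unescape_new/1 are
# hypothetical names, not part of the CSV library.
defmodule UnescapeSketch do
  @escape_formula_start ["=", "-", "+", "@"]
  @escape_formula_prefix "'"

  # Previous behaviour: look for any formula character and drop the first
  # byte whenever the leftmost match sits at byte offset 1.
  def unescape_old(field) do
    pattern = :binary.compile_pattern(@escape_formula_start)

    case :binary.match(field, pattern) do
      {1, _} -> binary_part(field, 1, byte_size(field) - 1)
      _ -> field
    end
  end

  # New behaviour: look for the prefix followed by a formula character and
  # drop the first byte only when that pair starts the field.
  def unescape_new(field) do
    pattern =
      @escape_formula_start
      |> Enum.map(fn char -> @escape_formula_prefix <> char end)
      |> :binary.compile_pattern()

    case :binary.match(field, pattern) do
      {0, _} -> binary_part(field, 1, byte_size(field) - 1)
      _ -> field
    end
  end
end
```

With this sketch, `UnescapeSketch.unescape_old("X-1")` returns `"-1"` because the `-` sits at offset 1, while `UnescapeSketch.unescape_new("X-1")` returns `"X-1"` unchanged. Both functions turn `"'=1+1"` back into `"=1+1"`.
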
1 change: 1 addition & 0 deletions lib/csv/defaults.ex
@@ -18,6 +18,7 @@ defmodule CSV.Defaults do
@escape_formulas false
@unescape_formulas false
@escape_formula_start ["=", "-", "+", "@"]
+ @escape_formula_prefix "'"
end
end
end
2 changes: 1 addition & 1 deletion lib/csv/encoding/encode.ex
@@ -47,7 +47,7 @@ defimpl CSV.Encode, for: BitString do

data =
if escape_formulas and String.starts_with?(data, @escape_formula_start) do
- "'" <> data
+ @escape_formula_prefix <> data
else
data
end
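
For reference, the encoding side that the parser has to undo can be sketched in isolation. This is a simplified illustration under the same assumed defaults; `EscapeSketch.escape/2` is a hypothetical helper, and it covers only the prefixing step, not the quoting or delimiter handling that the real `CSV.Encode` implementation also performs.

```elixir
# Standalone sketch. EscapeSketch and escape/2 are hypothetical names, not
# part of the CSV library.
defmodule EscapeSketch do
  @escape_formula_start ["=", "-", "+", "@"]
  @escape_formula_prefix "'"

  # Prepend the prefix only when escaping is enabled and the field begins
  # with one of the formula characters.
  def escape(data, escape_formulas \\ true) do
    if escape_formulas and String.starts_with?(data, @escape_formula_start) do
      @escape_formula_prefix <> data
    else
      data
    end
  end
end
```

Because the prefix is only ever added in front of a leading formula character, the decoder can safely restrict its match to the prefix-plus-character pair at the start of the field.
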
15 changes: 13 additions & 2 deletions test/decoding/parser_test.exs
@@ -151,14 +151,25 @@ defmodule DecodingTests.ParserTest do
end

test "removes escaping for formula when unescape_formulas is set to true" do
- input = [["=1+1", ~S(=1+2";=1+2), ~S(=1+2'" ;,=1+2)], ["-10+7"], ["+10+7"], ["@A1:A10"]]
+ input = [
+   ["=1+1", ~S(=1+2";=1+2), ~S(=1+2'" ;,=1+2)],
+   ["-10+7"],
+   ["+10+7"],
+   ["@A1:A10"],
+   ["X-1"],
+   ["B+1"],
+   ["C=1"]
+ ]

assert encode_decode_loop([input], escape_formulas: true, unescape_formulas: true) == [
ok: [
"=1+1=1+2\";=1+2=1+2'\" ;,=1+2",
"-10+7",
"+10+7",
- "@A1:A10"
+ "@A1:A10",
+ "X-1",
+ "B+1",
+ "C=1"
]
]
end