kv filter: support escaped quotes #2

Open
untergeek opened this issue Jan 23, 2015 · 12 comments · May be fixed by #27

@untergeek
Contributor

Migrated from JIRA: https://logstash.jira.com/browse/LOGSTASH-2272
which was replicated at elastic/logstash#1605

With the following config:

input { stdin { } }
filter { kv { } }
output {
  stdout {
    codec => 'json_lines'
  }
}

The following message:

foo="bar \"baz\""

Should create the following output:

{"message":"foo=\"bar \\\"baz\\\"\"","@version":"1","@timestamp":"1969-01-01T01:01:01.000Z","host":"host.example.net","foo":"bar \"baz\""}

But instead creates the following output:

{"message":"foo=\"bar \\\"baz\\\"\"","@version":"1","@timestamp":"1969-01-01T01:01:01.000Z","host":"host.example.net","foo":"bar \\"}
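The JSON escaping in the two outputs makes the difference hard to read. As a minimal sketch, assuming the event is reduced to just its foo field (the variable names are only for illustration), decoding both with Ruby's standard json library shows the actual field values:

```ruby
require "json"

# Only the "foo" field of each event is kept; the other fields are identical.
expected = JSON.parse(<<~'JSON')
  {"foo":"bar \"baz\""}
JSON

actual = JSON.parse(<<~'JSON')
  {"foo":"bar \\"}
JSON

puts expected["foo"] # the value the issue asks for, quotes unescaped
puts actual["foo"]   # the value the filter produces, cut off at the escaped quote
```

Decoded, the expected value is bar "baz" and the actual value is bar followed by a space and a single backslash, so the filter stops at the first escaped quote.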
@jordansissel
Contributor

confirmed this bug.

@dalegaspi

DO NOT USE THIS WORKAROUND; see the update note below.

For those who are looking for a workaround, this is what I've done (assuming kvdata is the source of the kv pairs):

mutate {
  gsub => ["kvdata","\\"","%22"]
}
kv {
  source => "kvdata"
  field_split => ","
}
urldecode {
  all_fields => true
}

This essentially URL-encodes the escaped quotes, then URL-decodes them after the kv filter. Obviously not very efficient, but it gets the job done.

Note that I also tried using gsub to replace \" with a single quote, then gsub the single quote back to a double quote after the kv filter (I tried both escaped and unescaped double quotes). What happens is that the resulting string has a double-escaped quote...dang! But even if that had worked, it would become a problem as soon as you actually have a legit single quote in your string.

UPDATE 10/12/2015: as of this writing, I would advise against using the workaround above, as it triggers another long-standing bug, elastic/logstash#3780, which has something to do with the urldecode/urlencode filter (I've verified that I get the same error when I apply this workaround). It is still present as of v1.5.3.
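The encode, parse, decode round trip described in this comment can be sketched in plain Ruby, outside Logstash, purely to show the data flow. This is a sketch under assumptions: naive_kv is a hypothetical stand-in for the kv filter's quoted-value handling, not the plugin's code, and the sample input is made up. It says nothing about the urldecode bug referenced in the update.

```ruby
require "uri"

# Hypothetical stand-in for the kv filter: a quoted value ends at the first
# double quote, a bare value ends at the next comma.
def naive_kv(text)
  pairs = text.scan(/(\w+)=(?:"([^"]*)"|([^,]*))/)
  pairs.to_h { |key, quoted, bare| [key, quoted || bare] }
end

kvdata = 'foo="bar \"baz\"",n=1'

# Without the workaround, the value is cut at the escaped quote.
truncated = naive_kv(kvdata)["foo"]

# mutate/gsub step: hide each escaped quote as %22
encoded = kvdata.gsub('\"', "%22")
# kv step: parse the now quote-free value
parsed = naive_kv(encoded)
# urldecode step: turn %22 back into a double quote
decoded = parsed.transform_values { |v| URI.decode_www_form_component(v) }

puts truncated
puts decoded["foo"]
```

One design consequence worth keeping in mind: decoding every field also decodes any percent sequences that were already present in the original data.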

@placeybordeaux

This bug is affecting me as well. Looks like it would be a fairly simple regex change, any specific constraints on contributions?

@placeybordeaux

Changing the line

From

valueRxString = "(?:\"([^\"]+)\"|'([^']+)'"

To

valueRxString = "(?:\"((\\\\.|[^\"])+)\"|'((\\\\.|[^'])+)'"

Taken from this SO post

Should fix it right? If I have some time this weekend I'll look into actually testing this.
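To see what the proposed alternation changes, here is a minimal Ruby sketch. It uses simplified stand-ins for only the double-quoted branch of the pattern, written as regex literals rather than as the plugin's string; the constant names are made up for illustration and the inner group is non-capturing so that group 1 is the whole value.

```ruby
# Simplified stand-ins (assumptions), not the plugin's full expression.
ORIGINAL_QUOTED = /"([^"]+)"/              # value ends at the first double quote
ESCAPE_AWARE_QUOTED = /"((?:\\.|[^"])+)"/  # a backslash consumes the next character

line = 'foo="bar \"baz\""'

original_value = line[ORIGINAL_QUOTED, 1]
escape_aware_value = line[ESCAPE_AWARE_QUOTED, 1]

puts original_value     # stops at the backslash before the first escaped quote
puts escape_aware_value # spans the whole quoted value
```

Note that the escape-aware capture still contains the backslashes themselves, so turning \" back into " would be a separate step.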

@beniwohli

I applied the change proposed by @placeybordeaux and extended the existing test case. Unfortunately, I wasn't able to run the tests locally (not a Ruby dev myself), but the code behaves as it should in my manual tests.

Here's the diff, let me know if I should open a PR: https://github.com/logstash-plugins/logstash-filter-kv/compare/master...piquadrat:issue-2?expand=1

beniwohli added a commit to beniwohli/logstash-filter-kv that referenced this issue Mar 9, 2016
@beniwohli beniwohli linked a pull request Mar 9, 2016 that will close this issue
@untergeek untergeek added the P3 label Apr 26, 2016
@NandanPhadke

Are there any updates on this issue? Is this issue still unresolved?

@RomanYu

RomanYu commented Dec 26, 2017

Hi all, does the 'escaped quote' problem still exist? I still get the bug in 'logstash-filter-kv-4.0.2'.

@wanlxp

wanlxp commented Aug 18, 2018

Hi all, the 'escaped quote' problem still exists. How can I fix it? I still get the bug in 'logstash-filter-kv-4.0.1'.

@MrSsunlight

MrSsunlight commented Sep 4, 2018

@untergeek

Hi. Escaped quotes are causing parsing errors for me. My quoted value contains escaped quotes as well as the field-split character. When parsing, the value is split at the split character instead of the double-quoted string being treated as a whole.
My data format is:
"\"aaaa\" bbbb /cde"
Configured as:

kv {
  source => "message"
  field_split => "[,\s]"
  value_split => "="
  trim_key => "\s"
  trim_value => """
}

the result is:
value = \"aaaa\"

@geekofalltrades

Is there any hope for this issue? One of my newer colleagues just opened an internal bug ticket because he noticed parse issues due to this bug in our logging system. I had to point him at duplicate bugs going back YEARS. We never stopped suffering from this.

@ara-mark

ara-mark commented Dec 19, 2024

It is the year 2150. The world has collapsed and been rebuilt from scratch, and everyone lives happily, except for the poor souls who still have to work around the fact that logstash-kv does not support escaped quotes.

Just kidding, the issue still exists in logstash 8.17 (2024-12-12). A nice vacation time to all!

@edmocosta

edmocosta commented Dec 19, 2024

I guess it means Logstash will survive an apocalypse and it will still be there in 2150 😄


I gave it a try using the latest plugin version, and I'm not sure the issue still persists:

# bin/logstash -e "input { stdin { } } filter { kv { } } output { stdout { codec => 'json_lines' }}"
....
[2024-12-19T11:49:00,306][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
foo="bar \"baz\""
{"@timestamp":"2024-12-19T10:49:06.931856Z","message":"foo=\"bar \\\"baz\\\"\"","@version":"1","event":{"original":"foo=\"bar \\\"baz\\\"\""},"host":{"hostname":"Edmos-MacBook-Pro.local"},"foo":"bar \\\"baz\\\""}
test="\"aaaa\" bbbb /cde"
{"@timestamp":"2024-12-19T10:57:49.340289Z","test":"\\\"aaaa\\\" bbbb /cde","message":"test=\"\\\"aaaa\\\" bbbb /cde\"","@version":"1","event":{"original":"test=\"\\\"aaaa\\\" bbbb /cde\""},"host":{"hostname":"Edmos-MacBook-Pro.local"}}

Am I missing something?

Thank you!

Edit: Oh I see, there's still an issue with the extra backslashes discussed here.
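The extra backslashes are visible in the output above: the parsed foo value is bar \"baz\" with the backslashes kept, whereas the issue asks for bar "baz". A minimal Ruby sketch of the missing unescape step, assuming a backslash only needs to be dropped in front of a quote or another backslash (the helper name is hypothetical and not part of the plugin):

```ruby
# Drop the backslash in front of an escaped quote or an escaped backslash.
def unescape_quotes(value)
  value.gsub(/\\(["'\\])/) { Regexp.last_match(1) }
end

puts unescape_quotes('bar \"baz\"')
puts unescape_quotes('\"aaaa\" bbbb /cde')
```

The block form of gsub is used so that the replacement is the captured character itself, which avoids the backslash handling of replacement strings.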
