Discard User Agent if request has SP-Anonymous header #111
Hi @mbehm, User Agent strings alone are not generally considered PII. They typically only might be if they form part of a fingerprint, but your Snowplow pipeline doesn't fingerprint users based on a User Agent (or fingerprint them at all by default). I'd suggest running the PII Pseudonymization enrichment to hash the User Agent if you don't want it stored in your DB in its raw form. We'll also start seeing the UA string become less useful over the coming months/years as the browser vendors look to freeze it, which takes it further away from being PII and a fingerprint vector. You have sparked an idea though that I'll consider further. I think there is an opportunity to add another enrichment in the Snowplow pipeline that allows for fields to be removed based on the SP-Anonymous header.
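For readers unfamiliar with the hashing approach suggested above, here is a minimal sketch of the general idea in Python. It is not the PII Pseudonymization enrichment's actual code: the `pseudonymize` and `hash_useragent` helpers, the salt handling, and the dict-shaped event with a `useragent` key are all assumptions made for illustration.

```python
import hashlib


def pseudonymize(value: str, salt: str = "") -> str:
    """Return the hex SHA-256 digest of the value with the salt appended.

    Hypothetical helper: the real enrichment is configured, not called like this.
    """
    return hashlib.sha256((value + salt).encode("utf-8")).hexdigest()


def hash_useragent(event: dict, salt: str = "") -> dict:
    """Return a copy of the event with its `useragent` value hashed.

    The `useragent` key is assumed for illustration; other fields are untouched.
    """
    out = dict(event)
    if out.get("useragent"):
        out["useragent"] = pseudonymize(out["useragent"], salt)
    return out
```

The raw string never reaches storage, but the hash stays stable for a given input and salt, so it can still be grouped on. That stability is also why hashing is weaker than dropping the field outright.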
Thank you for the fast response. Yes, we're already running the PII enrichment to hash the User Agent, and I'm aware that it's being phased out. Still, I'd prefer to just drop it altogether with anonymous tracking before it hits the event queues, the same way as the IP address (which by itself isn't PII either, as far as I know). Regardless, an enrichment to conditionally drop fields based on SP-Anonymous would indeed be a very useful addition.
Yeah, I think there are some additional things that can be considered here - I also see how a User Agent string can be considered PII in some use cases. This has opened up some additional thinking and I can see how making this configurable in the collector would be beneficial, even more beneficial than the enrichment idea. I've opened an issue to track that configurable concept rather than reopening this one (#112). Thanks for your feedback and idea!
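The configurable behaviour discussed in this thread could look roughly like the Python sketch below. It is only an illustration of the concept, not the collector's implementation (the collector is not written in Python). The function names, the default field list, the dict-shaped event, and the choice to treat the mere presence of the SP-Anonymous header as the trigger are all assumptions.

```python
from typing import Iterable, Mapping

# Header name taken from the issue title; matched case-insensitively.
ANONYMOUS_HEADER = "sp-anonymous"

# Hypothetical default: which fields to drop would be the configurable part.
DEFAULT_FIELDS_TO_DROP = ("user_ipaddress", "useragent")


def is_anonymous(headers: Mapping[str, str]) -> bool:
    """True if the request carries an SP-Anonymous header (any value)."""
    return any(name.lower() == ANONYMOUS_HEADER for name in headers)


def drop_fields_if_anonymous(
    headers: Mapping[str, str],
    event: Mapping[str, object],
    fields_to_drop: Iterable[str] = DEFAULT_FIELDS_TO_DROP,
) -> dict:
    """Return a copy of the event, minus the configured fields when anonymous."""
    if not is_anonymous(headers):
        return dict(event)
    blocked = set(fields_to_drop)
    return {key: value for key, value in event.items() if key not in blocked}
```

For example, `drop_fields_if_anonymous({"SP-Anonymous": "*"}, event)` would return the event without its IP address and User Agent, while a request without the header passes through unchanged.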
In relation to #90 and #94: as far as I know, User Agent strings are considered PII under GDPR, similar to IP addresses, and should be discarded for anonymous tracking.