Extend GeoLookup ability #3392

Open
bretg opened this issue Jan 9, 2024 · 6 comments

Comments

@bretg
Contributor

bretg commented Jan 9, 2024

There are several use cases for having Prebid Server do geographic lookups:

  1. Privacy regulation targeting (GDPR, Activity Controls)
  2. Bidder geo-scoping
  3. Allow for modules to know user location: traffic shaping, timeout optimization, ...

Currently only PBS-Java does geo-lookups, and only for GDPR scope as described here. The lookup is only called when no other signals indicate GDPR scope and when the account wants PBS to enforce GDPR.

The problem is that geo-lookups have a cost in both latency and money, so there should be controls for the host company to manage the volume.

The proposal is that there should be account-level config that will cause it to do geo-lookups early in the workflow in support of the above use cases.

  1. This should happen before the raw-auction-request module stage. The goal is that as soon as possible at the start of the workflow a module could shape traffic.
  2. If host company config geolocation.enabled is false, don't do lookup. Default is false.
  3. If the account ID is available, do the account config lookup; otherwise, use the host-level default account config. If the new account config settings.geo-lookup is true (defaults to false) and the request's $.device.geo.country is not specified, then PBS should do the lookup and set device.geo.country to an ISO-3166-1-alpha-3 code and device.geo.region to an ISO-3166-2 code (the 2-letter state code if USA).
  4. Use the existing metrics:
    1. geolocation_requests
    2. geolocation_fail
    3. geolocation_request_time
  5. No change in the GDPR processing other than making sure that it checks for device.geo.country before doing any necessary lookup. To be clear, settings.geo-lookup does not change or disable the ability for PBS to determine GDPR scope per the flowchart. The GDPR lookup feature is disabled if the overall geolocation.enabled is false.
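
As a rough illustration of steps 2 and 3 above, here is a minimal Python sketch of the decision logic. It is not PBS code: `AccountConfig`, `maybe_geo_lookup`, and the `lookup` callable are hypothetical names, and the request is modeled as a plain dict.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AccountConfig:
    # Hypothetical stand-in for the proposed account-level settings.geo-lookup flag.
    geo_lookup: bool = False


def maybe_geo_lookup(
    request: dict,
    host_geolocation_enabled: bool,
    account: Optional[AccountConfig],
    default_account: AccountConfig,
    lookup: Callable[[str], Optional[dict]],
) -> bool:
    """Return True if a lookup ran and the request was updated."""
    # Step 2: host-level kill switch (geolocation.enabled, default false).
    if not host_geolocation_enabled:
        return False
    # Step 3: use the account config if known, else the host-level default.
    cfg = account if account is not None else default_account
    if not cfg.geo_lookup:
        return False
    device = request.get("device") or {}
    geo = dict(device.get("geo") or {})
    # Never overwrite a country that is already on the request.
    if geo.get("country"):
        return False
    ip = device.get("ip")
    if not ip:
        return False
    result = lookup(ip)  # assumed to return e.g. {"country": "USA", "region": "NY"}
    if not result or not result.get("country"):
        return False
    geo["country"] = result["country"]
    if result.get("region"):
        geo["region"] = result["region"]
    device["geo"] = geo
    return True
```

The lookup service is passed in as a callable so the same logic can be exercised with a fake in tests; metrics such as geolocation_requests would be recorded around the `lookup(ip)` call.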
@bretg bretg moved this from Triage to Needs Requirements in Prebid Server Prioritization Jan 9, 2024
@bretg bretg moved this from Needs Requirements to Triage in Prebid Server Prioritization Jan 11, 2024
@bretg bretg moved this from Triage to Needs Requirements in Prebid Server Prioritization Jan 12, 2024
@bsardo
Collaborator

bsardo commented Jan 12, 2024

@bretg in an effort to keep things as simple as possible, I'm wondering if we need the host config geolocation.enabled. If a host company does not want to enable geolocation they should be able to do so by omitting settings.geo-lookup from their account configs and just leaning on the account default settings.geo-lookup which defaults to false.
I guess this could be of value though if you want the ability to globally toggle it, perhaps for testing purposes? Is there another scenario I'm missing where you envision this being of value?

From an optimization perspective, this host config isn't needed in PBS-Go since our discussion earlier today at the PMC led to a requirements change where the lookup happens before the raw auction stage instead of before the entrypoint stage. With this requirements change, PBS-Go should be able to take advantage of the existing account fetching logic instead of having to perform geo lookup specific parsing to extract the account ID and fetch the account object.

@bretg
Contributor Author

bretg commented Jan 12, 2024

> wondering if we need the host config geolocation.enabled.

This is an existing config. The use case for keeping it would be as a master kill switch for geo lookup in case there's an issue with the geo lookup servers.

I'm willing to consider removing it in PBS 3.0, but it would be a breaking change to remove it now. I think it would be fine to not implement in PBS-Go if it doesn't exist there now.

> With this requirements change, PBS-Go should be able to take advantage of the existing account fetching logic

There are scenarios where the account ID isn't available until after reading the stored requests.

@muuki88

muuki88 commented Jan 18, 2024

I just wanted to throw another idea in here. It's fair to assume that most hosting companies use a cloud provider or at least some sort of load balancing. Especially if you are acting globally, you probably load balance on the geolocation of the user. Here's a short list of commonly used technologies for load balancing around geo

In the end it boils down to two sorts of load balancing

  1. application (HTTP)
  2. network (TCP/IP)

Both variants can be used to make the geo information available directly in the request, without the need for geo location lookup.

Application (HTTP)

Example: https://cloud.google.com/load-balancing/docs/https/custom-headers?hl=en

If host companies use an application load balancer, it can add HTTP headers that contain the geo location. It's just a matter of reading those from the HTTP request.

From my minimal knowledge, application load balancers are more expensive, hence network load balancers are preferred if possible.

Network (TCP/IP)

If the request is re-routed to another IP address, there's no possibility to append information. However, it would be possible to statically tell the running instance which geo it is running in. This would at least provide the continent.

Proposal

We could extend the geo location lookup to a multi-step process, where every step can be enabled or disabled. For example

geolocation:
  enabled: true
  # define the order in which the geo location should be determined
  lookup:
     - cloudfront-header # checks if there's a http header from cloudfront
     - maxmind           # check a geo database if available
     - static            # use a statically provided value
  # configure all the rest

This is a super rough sketch, just to transport the idea.
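
To make the ordering idea a bit more concrete, here is a minimal Python sketch of walking the configured `lookup` list until one step yields a country. All function names are hypothetical. It assumes the deployment sits behind CloudFront with the `CloudFront-Viewer-Country` header enabled, and a plain dict stands in for a real geo database such as MaxMind.

```python
from typing import Callable, Dict, Mapping, Optional, Sequence

# A resolver takes (lower-cased headers, client IP) and returns a country or None.
Resolver = Callable[[Mapping[str, str], str], Optional[str]]


def from_cloudfront_header(headers: Mapping[str, str], ip: str) -> Optional[str]:
    # Assumes CloudFront forwards the viewer country in this header.
    return headers.get("cloudfront-viewer-country")


def make_database(table: Mapping[str, str]) -> Resolver:
    # Hypothetical stand-in for a geo database lookup keyed by IP.
    def resolve(headers: Mapping[str, str], ip: str) -> Optional[str]:
        return table.get(ip)
    return resolve


def make_static(country: str) -> Resolver:
    # Statically provided value, e.g. set per deployment region.
    def resolve(headers: Mapping[str, str], ip: str) -> Optional[str]:
        return country
    return resolve


def resolve_country(
    order: Sequence[str],
    resolvers: Dict[str, Resolver],
    headers: Mapping[str, str],
    ip: str,
) -> Optional[str]:
    # HTTP header names are case-insensitive, so normalize once.
    normalized = {k.lower(): v for k, v in headers.items()}
    for name in order:
        resolver = resolvers.get(name)
        if resolver is None:
            continue  # unknown or disabled step: skip it
        country = resolver(normalized, ip)
        if country:
            return country
    return None
```

Note that load balancer headers typically carry two-letter country codes, so a real implementation would still need to map them to whatever format the request expects.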

@bretg
Contributor Author

bretg commented Feb 5, 2024

Interesting thought @muuki88 , but I don't think this is going to be possible with DNS-based geo-balancers like Akamai's GTM... there's no 'edge' in that case for 'cloudlets' to work in or attach headers to.

Here's a counter-proposal:

  • keep the core geolocation config simple. If geo is on the request, use it. (this is the "static" option in your proposal)
  • entities that support CloudFlare/etc can build a raw-auction-request module to read the headers and inject into the request. And ideally open source that module.
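
For illustration only, the core of such a module could look something like the Python sketch below. This is not the PBS module API: `inject_geo_from_header` is a hypothetical function, the request is a plain dict, and it assumes Cloudflare's `CF-IPCountry` header, which carries a two-letter code. The conversion table is deliberately partial.

```python
from typing import Mapping

# Deliberately partial alpha-2 to alpha-3 table; a real module needs the full list.
ALPHA2_TO_ALPHA3 = {"US": "USA", "DE": "DEU", "FR": "FRA"}


def inject_geo_from_header(
    headers: Mapping[str, str],
    request: dict,
    header_name: str = "CF-IPCountry",
) -> dict:
    """Return a request with device.geo.country set from the header, if absent."""
    wanted = header_name.lower()
    # Header names are case-insensitive, so compare in lower case.
    value = next((v for k, v in headers.items() if k.lower() == wanted), None)
    country = ALPHA2_TO_ALPHA3.get((value or "").upper())
    if country is None:
        return request  # header missing or code not recognized
    device = dict(request.get("device") or {})
    geo = dict(device.get("geo") or {})
    if geo.get("country"):
        return request  # geo already on the request: leave it alone
    geo["country"] = country
    device["geo"] = geo
    # Copy rather than mutate, so the caller's request stays untouched.
    return {**request, "device": device}
```

Because the module runs at the raw-auction-request stage, the core logic then sees a request that already has geo on it and can skip its own lookup.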

@bretg bretg moved this from Needs Requirements to Ready for Dev in Prebid Server Prioritization Feb 14, 2024
@Net-burst

Net-burst commented Mar 8, 2024

I want to propose adding sampling in addition to the feature toggle, so that only a configurable percentage of requests for any given account have the geo lookup happen early.
The idea here is to have finer control when the host company has one dominant account that generates most of the traffic. Although this is more of an operational feature.
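
A minimal Python sketch of what that sampling check could look like, assuming a hypothetical per-account percentage setting next to the toggle (`should_sample_geo_lookup` and `sample_percent` are made-up names):

```python
import random
from typing import Callable


def should_sample_geo_lookup(
    geo_lookup_enabled: bool,
    sample_percent: float,
    rand: Callable[[], float] = random.random,
) -> bool:
    """Decide whether this request gets the early geo lookup."""
    # The feature toggle still wins: sampling only narrows it further.
    if not geo_lookup_enabled:
        return False
    # Clamp the configured value into [0, 100].
    pct = min(max(sample_percent, 0.0), 100.0)
    # rand() is in [0, 1), so 100 always passes and 0 never does.
    return rand() * 100.0 < pct
```

The random source is injectable so the decision can be tested deterministically.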

@bretg
Contributor Author

bretg commented Mar 22, 2024

Done with PBS-Java 2.13

Projects
Status: Ready for Dev
Development

No branches or pull requests

4 participants