-
Notifications
You must be signed in to change notification settings - Fork 110
Add ability to prime URL metrics #1850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
…ing between parent and iframe
…ints logic to dedicated functions
… generating a new UUID
… visibility hidden
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## trunk #1850 +/- ##
==========================================
- Coverage 72.47% 69.10% -3.37%
==========================================
Files 85 88 +3
Lines 7051 7397 +346
==========================================
+ Hits 5110 5112 +2
- Misses 1941 2285 +344
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
'prime_url_metrics_verification_token', | ||
odPrimeUrlMetricsVerificationToken | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authentication for REST API
-
WP Nonce Limitation: The default WordPress (WP) nonce does not function correctly when generated for the parent page and then passed to an iframe for REST API requests.
-
Custom Token Authentication: To address this, I have added a custom token-based authentication mechanism. This generates a time-limited token used to authenticate REST API requests made via the iframe.
In #1835 PR, WP nonces are introduced for REST API requests for logged-in users. This may allow us to eliminate the custom token authentication if URL metrics are collected exclusively from logged-in users.
}; | ||
|
||
// Load the iframe | ||
iframe.src = task.url; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently if the IFRAME
shares the same origin as the parent, then it allows it to access the parent session. This ensures that the user session in the page loaded within the iframe (which is a frontend page) matches the logged-in user of the WordPress dashboard.
But if the WordPress admin dashboard and the frontend have different origins, WP nonces won’t work for REST API authentication because the iframe will not recognize the logged-in session. As the different origin does not allow iframe to access parents session. For context I am talking about the REST nonce introduced in #1835.
iframe.style.transform = 'scale(0.05)'; | ||
iframe.style.transformOrigin = '0 0'; | ||
iframe.style.pointerEvents = 'none'; | ||
iframe.style.opacity = '0.000001'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the detect.js
requires the iframe to be visible in the viewport to resolve the onLCP promise. Traditional methods like moving the iframe off-screen using translate
, setting visibility: hidden
, or opacity: 0
cause the promise to hang.
performance/plugins/optimization-detective/detect.js
Lines 496 to 503 in 6ca5c4b
// Obtain at least one LCP candidate. More may be reported before the page finishes loading. | |
await new Promise( ( resolve ) => { | |
onLCP( | |
( /** @type LCPMetric */ metric ) => { | |
lcpMetricCandidates.push( metric ); | |
resolve(); | |
}, | |
{ |
I am using a workaround using the following CSS to keep the iframe minimally visible and functional:
position: fixed;
top: 0px;
left: 0px;
transform: scale(0.05);
transform-origin: 0px 0px;
pointer-events: none;
opacity: 1e-6;
z-index: -9999;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering how you were going to resolve solve for this!
'OD_PRIME_URL_METRICS_REQUEST_SUCCESS', | ||
'*' | ||
); | ||
resolve(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parent and IFRAME
communication is handled via postMessage. A message is sent to the parent, and the promise resolves immediately.
If the promise isn't resolved immediately, navigating to a new URL causes the code following the promise to never execute. This is because changing the iframe.src
does not trigger events like pagehide
, pageswap
, or visibilitychange
.
performance/plugins/optimization-detective/detect.js
Lines 568 to 569 in 6ca5c4b
// Wait for the page to be hidden. | |
await new Promise( ( resolve ) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder. Do we even need to post a message here? As soon as the iframe
is destroyed won't it automatically cause the URL Metric to be sent, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is we need to signal the parent that we can move to next URL or breakpoint using postMessage
as the load
event can't be used. Check this comment for detailed explanation #1850 (comment) .
Will it makes sense to send the postMessage
after the navigator.sendBeacon
then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think it makes sense to send the message after the beacon is sent, definitely.
This comment was marked as spam.
This comment was marked as spam.
The JS Lint workflow is currently failing because the dependencies for the Priming CLI are not being installed. This is happening because the workflow only installs dependencies from the root This could be solved by updating diff --git a/.github/workflows/js-lint.yml b/.github/workflows/js-lint.yml
index 365d32b9..b840bc80 100644
--- a/.github/workflows/js-lint.yml
+++ b/.github/workflows/js-lint.yml
@@ -45,7 +45,9 @@ jobs:
node-version-file: '.nvmrc'
cache: npm
- name: npm install
- run: npm ci
+ run: |
+ npm ci
+ npm ci --prefix plugins/optimization-detective/priming-cli
- name: JS Lint
run: npm run lint-js
- name: TypeScript compile cc: @westonruter |
@b1ink0 This makes sense to me. Might as well include that in this PR. |
if ( '' !== odPrimeUrlMetricsVerificationToken ) { | ||
resolve(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that just came to mind here. In order for the URL Metrics to be fully primed, we need to make sure that the entire page is scrolled into view. This will allow Embed Optimizer, for example, to capture the resized heights of all embeds if they are lazy-loaded. So it seems like right here before resolving it should scroll vertically be 100vh
with a short delay between and then scroll down another 100vh
until the bottom is reached. Then it needs to either wait for the debounced re-compression of the URL Metric data, or there should be a function to force recompression and bypass any debouncing and idle callback. Then after the page has been scrolled to the bottom and any pending updates to the URL Metric have resulted in a newly-compressed Blob
, then this can go ahead and resolve
to proceed with sending the URL Metric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then after the page has been scrolled to the bottom
I think we will have to add some limit how many time do we scroll as there could be infinite scroll in the page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point! 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think instead of a limit we could just check if the page height changes ( not small changes but like 50vh
changes ) then we could just stop scroll logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good idea. We wouldn't want it to stop scrolling once a Tweet loads and changes the page height. That said, some tweets can be quite tall on mobile viewports. So maybe it should allow for a few large document height changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in ec5dcd7
odPrimeUrlMetricsVerificationToken | ||
); | ||
} | ||
|
||
navigator.sendBeacon( url, payloadBlob ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that in trunk
this is using fetch()
, I propose that when odPrimeUrlMetricsVerificationToken
is defined we do not use the keepalive
parameter. We can then look at the response for a success and send that with the message sent to the parent window. This will then given the parent window the opportunity to either retry or provide some error indicator to the user, or at least to log something out to the console.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 3a795e9
try { | ||
if ( | ||
win.parent && | ||
win.location.origin === win.parent.location.origin | ||
) { | ||
/** @type {HTMLIFrameElement|null} */ | ||
const urlPrimeIframeElement = win.parent.document.querySelector( | ||
'iframe#od-prime-url-metrics-iframe' | ||
); | ||
if ( | ||
urlPrimeIframeElement && | ||
urlPrimeIframeElement.dataset.odPrimeUrlMetricsVerificationToken | ||
) { | ||
odPrimeUrlMetricsVerificationToken = | ||
urlPrimeIframeElement.dataset | ||
.odPrimeUrlMetricsVerificationToken; | ||
} | ||
} | ||
} catch ( e ) { | ||
// Ignoring error caused possibly due to cross-origin iframe access. | ||
} | ||
|
||
// Only available when page is loaded by Puppeteer script. | ||
if ( win.__odPrimeUrlMetricsVerificationToken ) { | ||
odPrimeUrlMetricsVerificationToken = | ||
win.__odPrimeUrlMetricsVerificationToken; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest the query parameter be supplied consistently by adding a URL query parameter instead. The cross-domain issue could be a common one, where wp-admin is on a different origin than the frontend. Then the token would be supplied in the same way for Puppeteer as for non-Puppeteer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and as opposed to passing the parameter as a query parameter it could also be supplied in the URL fragment instead. The benefit here would be it wouldn't get bust any page caches, although caching would be unlikely since the user is already logged-in. But another reason is that sometimes I've noticed that requests with unrecognized query parameters get redirected with those stripped out. For example:
https://joost.blog/wordpress-leadership/?odPrimeUrlMetricsVerificationToken=abc123
Notice how that redirects with the query parameter removed. Compare with:
https://joost.blog/wordpress-leadership/#odPrimeUrlMetricsVerificationToken=abc123
The URL target here now remains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The URL fragment approach seems very good. I had previously thought of using a query parameter, but I was also worried about the same problem you described. Additionally, we would have had to remove the query parameter from the currentUrl
. I was also considering using postMessage
to send the verification token, as cross-origin access can't reach the dataset of the iframe. Using a URL fragment will eliminate the complex logic required for postMessage
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 4f93e14
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a side note, when we use URL fragments in this case—since I'm not creating a new iframe for each URL—setting the same URL (with different breakpoints) no longer reloads the page. Previously, setting iframe.src
to the same URL would force a reload. However, when a URL fragment is used, the browser behaves differently: it doesn't reload the page but instead attempts to scroll to the element with the matching ID.
This means we now need to navigate to about:blank
after processing each URL to reset the iframe's state. I was already doing this inside a cleanup
function, so theoretically, this shouldn't have caused any issues.
However, I encountered a strange behavior: when navigating from a URL with a fragment to about:blank
, the assignment iframe.src = 'about:blank'
becomes asynchronous. Since I was using a promise to detect when URL processing was complete—and resolving it after calling the cleanup
function (where iframe.src = 'about:blank'
is set)—the next URL would be set too early.
As a result, when the next task uses the same base URL with the same fragment, the iframe does not reload. Only its dimensions change, and then about:blank
is set asynchronously by the previous task’s cleanup function.
The weirdest part: if I inspect the iframe in DevTools at this point, the src
attribute is set to the new URL, but the document inside is still about:blank
. When I try to access iframe.src
via JavaScript, it reports the new URL, not about:blank
.
I was able to fix this by delaying the resolution of the processTask
promise until the load
event for iframe.src = 'about:blank'
is triggered.
// Load the iframe | ||
iframe.src = task.url; | ||
iframe.width = task.width.toString(); | ||
iframe.height = task.height.toString(); | ||
iframe.dataset.odPrimeUrlMetricsVerificationToken = | ||
verificationToken; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://github.com/WordPress/performance/pull/1850/files#r2038090196
I suggest adding the token as a query parameter for the URL being loaded in the iframe.
function od_get_standard_breakpoints(): array { | ||
$widths = od_get_breakpoint_max_widths(); | ||
sort( $widths ); | ||
|
||
$min_width = $widths[0]; | ||
$max_width = (int) end( $widths ) + 300; // For large screens. | ||
$widths[] = $max_width; | ||
|
||
// We need to ensure min is 0.56 (1080/1920) else the height becomes too small. | ||
$min_ar = max( 0.56, od_get_minimum_viewport_aspect_ratio() ); | ||
// Ensure max is 1.78 (1920/1080) else the height becomes too large. | ||
$max_ar = min( 1.78, od_get_maximum_viewport_aspect_ratio() ); | ||
|
||
// Compute [width => height] for each breakpoint. | ||
return array_map( | ||
static function ( $width ) use ( $min_width, $max_width, $min_ar, $max_ar ) { | ||
// Linear interpolation between max_ar and min_ar based on width. | ||
$ar = $max_ar - ( ( $max_ar - $min_ar ) * ( ( $width - $min_width ) / ( $max_width - $min_width ) ) ); | ||
$ar = max( $min_ar, min( $max_ar, $ar ) ); | ||
|
||
return array( | ||
'width' => $width, | ||
'height' => (int) round( $ar * $width ), | ||
); | ||
}, | ||
$widths | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The viewport dimensions returned here are critical. Something I'm finding for sites in the field is that on mobile there may be an image preloaded for a larger phone which then is not the actual LCP element on a smaller phone.
Instead of using the breakpoint max widths and deriving the heights from each, I think think we'll need to instead obtain what the most common viewport size for each category: mobile, phablet, tablet, and desktop. Naturally, the most common viewport sizes will vary depending on the region of the world where visitors are most commonly visiting from. So this is tricky. It may need to be filterable, or perhaps even eventually derived from the URL Metrics which have been collected (excluding the ones obtained via priming).
I consulted with Gemini to provide a table of the most common viewport dimensions in each category: https://g.co/gemini/share/f5898dd2304e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the od_get_breakpoint_max_widths
function is used to set the breakpoints for the OD_URL_Metric_Group_Collection
, and the od_breakpoint_max_widths
filter also allows filtering the breakpoints, I am concerned that using a different standard viewport could cause problems with optimization, although I am not entirely sure about this.
So, I think we could use a standard height suggested by Gemini based on the width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may well be that the breakpoint max widths I had chosen aren't reflective of actual devices being used. I picked them based on the responsive breakpoints in the Gutenberg stylesheets. So we may need to revisit them. But if we do that, we may need to reconsider whether just having the first group and last group is sufficient to allow fetchpriority=high
to be added to the IMG
if it is the LCP element in both. If the first group is actually 0px-360px and the second group is 360px-480px, then if a site gets a lot of iPhone traffic then the first group could be empty.
In any case, we don't actually have to use od_get_breakpoint_max_widths()
in the priming, although ideally it would factor in somehow. Picking the most common viewport size in each group I think is better than trying to compute a viewport based on the breakpoint max width.
'prime_url_metrics_verification_token' => array( | ||
'type' => 'string', | ||
'description' => __( 'Nonce for auto priming URLs.', 'optimization-detective' ), | ||
'required' => false, | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that the URL Metric schema be updated to include a non-required property source
which can have a value of either synthetic
, visitor
, user
. When the prime_url_metrics_verification_token
is defined, then the value can be synthetic
. This would be very helpful when debugging URL Metrics or, for example, collecting viewport dimensions used by actual visitors to the site, by just looking at URL Metrics which have the source
value of visitor
. (Names and values for this property only provisional, not fully thought out.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is_user_logged_in()
is true
, then the value would be user
, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added it in ff6ad0f is this the correct way to do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks close, but I think we can avoid having to rely on the client specifying the source
. We can derive that from other information: #1850 (review)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests are failing because of this function which asserts all properties in URL metrics schema to be required
but the new source
property is not required
. Should the check for required
true
be removed?
performance/plugins/optimization-detective/tests/test-class-od-url-metric.php
Lines 920 to 924 in ff8b62b
protected function check_schema_subset( array $schema, string $path, bool $extended = false ): void { | |
$this->assertArrayHasKey( 'required', $schema, $path ); | |
if ( ! $extended ) { | |
$this->assertTrue( $schema['required'], $path ); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could add an exception like which was originally present for the etag
which was originally not required: bd642b4#diff-1b20a90cae49130dddd782c9983b134d549cacbed62c47a5f7f45138f153922cL920
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also plan to make the source
property required
in future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it could be made required once we're at a point where all existing URL Metrics stored will have that property set. This could happen after a couple months of collection. We just don't want to make it required now so we don't invalidate URL Metrics unnecessarily.
Co-authored-by: Weston Ruter <[email protected]>
Co-authored-by: Weston Ruter <[email protected]>
* | ||
* @since n.e.x.t | ||
* | ||
* @return non-empty-string|null Source. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The literal values could be used instead:
* @return non-empty-string|null Source. | |
* @return 'visitor'|'user'|'synthetic'|null Source. |
source: restApiNonce ? 'user' : 'visitor', | ||
}; | ||
|
||
if ( odPrimeUrlMetricsVerificationToken ) { | ||
urlMetric.source = 'synthetic'; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Populating the source
should probably be done in the endpoint itself. It should be a readonly
param.
'source' => array( | ||
'description' => __( 'The source of the URL Metric.', 'optimization-detective' ), | ||
'type' => 'string', | ||
'required' => false, | ||
'enum' => array( | ||
'visitor', | ||
'user', | ||
'synthetic', | ||
), | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also be readonly
, populated in the endpoint callback here:
performance/plugins/optimization-detective/storage/class-od-rest-url-metrics-store-endpoint.php
Lines 214 to 217 in ff8b62b
// Now supply the readonly args which were omitted from the REST API params due to being `readonly`. | |
'timestamp' => microtime( true ), | |
'uuid' => wp_generate_uuid4(), | |
'etag' => $request->get_param( 'current_etag' ), |
The value can be synthetic
if the prime_url_metrics_verification_token
param is present, or else user
if is_user_logged_in()
. Otherwise, it can be visitor
.
Co-authored-by: Weston Ruter <[email protected]>
Co-authored-by: Weston Ruter <[email protected]>
Summary
Fixes #1311
Relevant technical choices
This PR introduces a new mechanism for priming URL metrics across the site. It uses a newly added submenu page in the Tools menu and automatically primes URL metrics when a post is saved in the block editor.
TODOS:
Demos
Settings page:
UI.mp4
Saving post in Block Editor:
Block.Editor.mp4