Add enhanced GitHub Actions raw logs viewer #2076

Open
Urgau wants to merge 5 commits into master from gha_logs_ansi_up

Conversation

@Urgau (Member) commented Jun 13, 2025

This PR adds an enhanced (as in colored) GitHub Actions raw logs viewer.

This is useful for us because the GitHub Actions log viewer is just unusable in rust-lang/rust given the size of our logs (it loads indefinitely), and the raw logs are full of ANSI sequences cluttering the output.

This works by adding a new endpoint /gha-logs/:owner/:repo/:log-id (restricted to team repos) that fetches the raw logs from the GitHub REST API and embeds them directly in the HTML, which is then processed by the ansi_up JavaScript library (loaded from the jsdelivr.com CDN).

To prevent XSS injection (or just bugs in ansi_up), I added the Content-Security-Policy header with a restriction on script-src. I also tried adding an integrity attribute for ansi_up, but jsdelivr doesn't guarantee the same content for minified libraries.
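For illustration, a minimal sketch of such a response, assuming the http crate; the HTML shape, CDN URL, ansi_up import shape, and CSP value are simplified placeholders, not the actual triagebot implementation:

use http::{header, Response, StatusCode};

// Hypothetical sketch: build the HTML page for one log, embedding the
// escaped raw log text and loading ansi_up from the CDN to colorize it
// client-side.
fn render_gha_log_page(raw_logs: &str) -> Response<String> {
    // Escape the logs so arbitrary log content cannot inject markup.
    let escaped = raw_logs
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;");

    let body = format!(
        r#"<!DOCTYPE html>
<html>
  <body>
    <pre id="logs">{escaped}</pre>
    <script type="module">
      import {{ AnsiUp }} from "https://cdn.jsdelivr.net/npm/ansi_up/ansi_up.js";
      const pre = document.getElementById("logs");
      pre.innerHTML = new AnsiUp().ansi_to_html(pre.textContent);
    </script>
  </body>
</html>"#
    );

    Response::builder()
        .status(StatusCode::OK)
        .header(header::CONTENT_TYPE, "text/html; charset=utf-8")
        // Only allow scripts from the CDN plus this page's inline bootstrap;
        // a real implementation would prefer a nonce or hash over
        // 'unsafe-inline'.
        .header(
            header::CONTENT_SECURITY_POLICY,
            "default-src 'none'; script-src 'unsafe-inline' https://cdn.jsdelivr.net",
        )
        .body(body)
        .unwrap()
}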

After merging, we should probably adjust rust-log-analyzer to link to this new endpoint instead of to GitHub, so people can use it directly.

cc @Mark-Simulacrum

@Urgau Urgau requested a review from Kobzol June 13, 2025 17:50
@Kobzol (Contributor) left a comment

This is a really wonderful idea ❤️ And I can't wait to use it! I can take care of the RLA integration once/if this is merged.

Ok(res)
}

async fn process_logs(
Contributor

Could we add some cache for this in Context? Maybe an LRU hashmap or something like that? The logs can be large, and we probably don't want to spam the GH API when the page is refreshed or a couple of people visit the link when a job fails.
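As a sketch of the idea only (using the lru crate; the key shape, capacity of 20, and wiring into Context are made up):

use std::num::NonZeroUsize;

use lru::LruCache;

// Hypothetical cache shape: keyed by (owner, repo, log id), storing the raw
// log text. In triagebot this would live in Context behind a Mutex.
type LogsCache = LruCache<(String, String, String), String>;

fn new_logs_cache() -> LogsCache {
    // Capacity is a guess; once full, the least recently used log is evicted.
    LruCache::new(NonZeroUsize::new(20).unwrap())
}

fn get_or_fetch(
    cache: &mut LogsCache,
    key: (String, String, String),
    fetch: impl FnOnce() -> String,
) -> String {
    // Serve from cache when possible, otherwise fetch and remember the result.
    if let Some(logs) = cache.get(&key) {
        return logs.clone();
    }
    let logs = fetch();
    cache.put(key, logs.clone());
    logs
}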

Member Author

I thought of it, but given the size of some of the logs (some of them reach 3-4 MB each), I'm worried we'll have a high cache-miss rate due to the limited number of logs we'll be able to cache.

I also don't know how much we can safely cache; I don't know how much RAM the triagebot machine has left. Or maybe we'll want to have the cache on disk?

@Kobzol (Contributor) Jun 14, 2025

I wouldn't go for a disk cache. I think that even storing a Vec of ~10-100 last results would be enough. Logs are usually accessed only for a brief period of time after CI fails, when a bunch of people may get notified and they go check the logs. After a few days, people usually don't care about the log anymore. So remembering the last N logs to have a history of at least a few hours would help to avoid needless API requests, IMO.

Not sure about RAM size, but storing e.g. 20 results should hopefully be fine. @Mark-Simulacrum should know the machine specs, I think.

Member Author

Okay, I added a small 50 MB in-RAM cache for the raw logs.
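For reference, a byte-budget cache along these lines can be sketched with just std (the struct and names below are illustrative, not the code in this PR):

use std::collections::VecDeque;

// Minimal sketch of a byte-budget cache: newest entries at the back, oldest
// evicted first once the total size of the stored logs exceeds the budget.
struct BoundedLogsCache {
    max_bytes: usize,
    used_bytes: usize,
    entries: VecDeque<(String, String)>, // (log id, raw log text)
}

impl BoundedLogsCache {
    fn new(max_bytes: usize) -> Self {
        Self { max_bytes, used_bytes: 0, entries: VecDeque::new() }
    }

    fn get(&self, log_id: &str) -> Option<&str> {
        self.entries
            .iter()
            .find(|(id, _)| id == log_id)
            .map(|(_, logs)| logs.as_str())
    }

    fn insert(&mut self, log_id: String, logs: String) {
        self.used_bytes += logs.len();
        self.entries.push_back((log_id, logs));
        // Drop the oldest entries until we are back under the budget.
        while self.used_bytes > self.max_bytes {
            match self.entries.pop_front() {
                Some((_, evicted)) => self.used_bytes -= evicted.len(),
                None => break,
            }
        }
    }
}

Constructing it as BoundedLogsCache::new(50 * 1024 * 1024) would match the 50 MB budget mentioned above.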

Member

We currently allocate 1/4 of a CPU and 0.5 GB of memory: https://github.com/rust-lang/simpleinfra/blob/bcba51d544106fba945c904bf213d64a7bb7a473/terraform/shared/services/triagebot/main.tf#L47-L48

Looks like most of the memory is currently unused:

[screenshot: memory usage graph]

I think a small cache isn't bad, but I'd probably suggest that if we want this in the long run it should be backed by S3 (with, say, a 15-day TTL) -- that's pretty cheap, avoids worrying about how much space it uses, and S3 rate limiting is basically impossible to hit. If we did that we could also serve the logs from a CDN rather than through triagebot, which also seems nice.

@Kobzol (Contributor) commented Jun 13, 2025

We could also just host the ansi_up file ourselves through triagebot; it's relatively small (~14 KiB).
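A rough sketch of that option (the vendored file path and route wiring are hypothetical; the http crate is assumed):

use http::{header, Response, StatusCode};

// Hypothetical handler: serve a vendored copy of ansi_up from the triagebot
// binary itself instead of a CDN. ~14 KiB embedded via include_str! is
// negligible.
const ANSI_UP_JS: &str = include_str!("gha_logs/ansi_up.min.js");

fn serve_ansi_up() -> Response<String> {
    Response::builder()
        .status(StatusCode::OK)
        .header(header::CONTENT_TYPE, "text/javascript; charset=utf-8")
        // The vendored file never changes for a given deploy, so let
        // browsers cache it.
        .header(header::CACHE_CONTROL, "public, max-age=86400")
        .body(ANSI_UP_JS.to_string())
        .unwrap()
}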

@Urgau Urgau force-pushed the gha_logs_ansi_up branch 4 times, most recently from 32638f3 to 3242c4f on June 14, 2025 15:11
@Urgau Urgau force-pushed the gha_logs_ansi_up branch from 3242c4f to a85fafb on June 14, 2025 18:22
@Kobzol (Contributor) left a comment

Impl looks good, but I'd like to hear from Mark or someone else from t-infra about potential security concerns.

@Urgau Urgau force-pushed the gha_logs_ansi_up branch from a85fafb to 68996b0 on June 14, 2025 22:16

tracing::info!("gha_logs: cache miss for {log_uuid}");
let logs = ctx
    .github
    .raw_job_logs(
Member

So, I'm wondering if we actually have to make the request from triagebot. Could the page we serve use JS to request directly from GitHub, using the user's own GitHub credentials? The plain logs appear to be served with CORS: * -- though I'm not sure if the redirect to them is.

I seem to recall we had even built (or found?) such a page at some point...

Member Author

How would we get the user's GitHub credentials? By using a GitHub App or OAuth app for triagebot?

Contributor

Doing GH login in triagebot is not trivial and it has been rejected before for the back-office review work, so I don't suppose that's what Mark meant (?). I guess what we could do is just fetch the plain logs URL from the user's browser.

However, it doesn't seem to work? I tried this:

<html>
    <head>
    </head>
    <body>
        <div id="body"></div>
        <script>
            async function load() {
                const data = await fetch("https://github.com/rust-lang/rust/commit/d04c7466850b10b56c28211ccb880ae34915d431/checks/44186934959/logs");
                // fetch() resolves to a Response; the body has to be read as text first
                document.getElementById("body").innerHTML = await data.text();
            }
            load();
        </script>
    </body>
</html>

and got this:
[screenshot of the resulting error]

Contributor

You can get a CORS: * URL by visiting the original logs URL, but it is specific to each user and temporary (it lasts something like 5 minutes, IIRC). And you can only get it after you visit the original URL in your browser, not via CORS. So I think that we'll have to download it through the API.

Member Author

I see. Just tried with the API, and it leads to a 403 about being unauthenticated:

{
  "message": "Must have admin rights to Repository.",
  "documentation_url": "https://docs.github.com/rest/actions/workflow-jobs#download-job-logs-for-a-workflow-run",
  "status": "403"
}

Member

Ok, yeah, sounds like the final link is available - in theory, that means we could still leave the download to users, right? I.e., on the triagebot side we cache, with a ~1 minute TTL, the resolution of the (stable) API URL to the short-lived plaintext logs link, and then have the user's browser download that and render it as needed. That way we don't have triagebot in the critical path for the big object, and the cache entries are much smaller, etc.

That said, I personally suspect that we might plausibly want to keep the rendering on the server; I'm not sure how well browsers will cope with megabytes of HTML plus JavaScript rendering all the ANSI codes... it might be better to strip them on the server side.
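If server-side stripping ever becomes necessary, a minimal sketch (handling only SGR color sequences, using the regex crate) could be:

use regex::Regex;

// Strip ANSI SGR color sequences (ESC [ ... m) from the raw logs before
// serving them. Real logs can also contain other escape sequences (cursor
// movement, OSC titles, ...) that this does not handle.
fn strip_ansi_colors(logs: &str) -> String {
    // \x1b is ESC; SGR sequences look like "\x1b[1;31m".
    let re = Regex::new(r"\x1b\[[0-9;]*m").unwrap();
    re.replace_all(logs, "").into_owned()
}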

Member

So let's not change anything here but keep it in mind if we run into issues.

Contributor

I don't think there's an API endpoint for figuring out the temporary URL though.

@Urgau (Member Author) Jun 17, 2025

> That said, I personally suspect that we might plausibly want to keep the rendering on the server; I'm not sure how well browsers will cope with megabytes of HTML plus JavaScript rendering all the ANSI codes... it might be better to strip them on the server side.

I have looked at our longest CI jobs (around 5-6 MB of raw logs) and Firefox renders them surprisingly well (in less than 1 s). The longest part is downloading the raw logs, not rendering them.

> I don't think there's an API endpoint for figuring out the temporary URL though.

It's the same API, but instead of following the redirect we need to look at the Location: header of the GitHub API response rather than following it to its destination.
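A rough sketch of that approach with reqwest (the endpoint is the documented job-logs API; token handling and error cases are simplified placeholders):

use reqwest::{header::LOCATION, redirect::Policy, Client};

// Call the job-logs API with redirects disabled and read the short-lived
// plaintext URL out of the Location header instead of downloading the body.
async fn temporary_logs_url(
    owner: &str,
    repo: &str,
    job_id: u64,
    token: &str,
) -> reqwest::Result<Option<String>> {
    let client = Client::builder().redirect(Policy::none()).build()?;
    let url = format!("https://api.github.com/repos/{owner}/{repo}/actions/jobs/{job_id}/logs");
    let resp = client
        .get(url)
        .header("Authorization", format!("Bearer {token}"))
        .header("User-Agent", "triagebot-sketch")
        .send()
        .await?;
    // GitHub answers the authenticated call with a 302 to a temporary,
    // pre-signed download URL.
    Ok(resp
        .headers()
        .get(LOCATION)
        .and_then(|v| v.to_str().ok())
        .map(str::to_owned))
}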

@Urgau Urgau force-pushed the gha_logs_ansi_up branch from 68996b0 to 93efa64 on June 16, 2025 18:40
/**
* Bundled by jsDelivr using Rollup v2.79.2 and Terser v5.39.0.
* Original file: /npm/[email protected]/ansi_up.js
*/
Member

What's the license on this? Can we get it copy/pasted into this comment?

Member Author

Oh yeah, I forgot about the license. It's the MIT license. Included it.

@Urgau Urgau force-pushed the gha_logs_ansi_up branch from 93efa64 to 0201040 on June 17, 2025 05:39