Handling EventBlocks and roundtripping #99

Open
TeofilC opened this issue Jun 30, 2023 · 2 comments

@TeofilC
Collaborator

TeofilC commented Jun 30, 2023

The eventlog is structured as a list of blocks of events.

A block has a capability number that specifies the capability of upcoming events, and some information about when the block was written.

Currently we erase block events when reading the eventlog. This leads to two issues:

  • When writing the eventlog back out, we have to recreate the blocks, but we can't do so properly because information has been lost, which leads to roundtripping failures. I believe fixing this would be a big step towards addressing "Testsuite write-merge failing" (#14).
  • We currently cannot figure out from the eventlog when the application is busy writing to the eventlog, even though this is exactly what the two timestamps on the block event tell us.

My proposal is to keep the block events during parsing and require their presence when writing out eventlogs. This introduces some new illegal states, i.e., an eventlog without block events could no longer be written to a file. How does this sound?
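To make the new illegal state concrete, here is a minimal sketch of the check a writer could perform. The `Item` type and `prepareForWriting` are hypothetical, simplified stand-ins invented for illustration; they are not the ghc-events types or API.

```haskell
import Data.Word (Word64)

type Timestamp = Word64

-- Hypothetical, simplified model of a parsed eventlog entry.
-- These are NOT the real ghc-events types.
data Item
  = BlockMarker { blockCap :: Int, blockStart :: Timestamp, blockEnd :: Timestamp }
  | PlainEvent  { eventTime :: Timestamp, eventName :: String }
  deriving (Show, Eq)

isBlockMarker :: Item -> Bool
isBlockMarker BlockMarker{} = True
isBlockMarker _             = False

-- | Refuse to hand an eventlog to the writer if it carries no block
-- markers at all (the "illegal state" described above). Note that this
-- simple version also rejects an empty list of items.
prepareForWriting :: [Item] -> Either String [Item]
prepareForWriting items
  | any isBlockMarker items = Right items
  | otherwise               = Left "eventlog contains no block events; refusing to write"
```

A stricter check (for example, that every event falls inside some block) would be possible too, but whether that holds for all real eventlogs is not something this sketch assumes.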

An alternative is to change the types to make these states unrepresentable but I don't think the breaking change from that would be worth it.

@Mikolaj
Member

Mikolaj commented Nov 4, 2023

That sounds good to me as a past Threadscope contributor, but we'd probably need feedback from current heavy eventlog users, e.g., @mpickering. Who would be the main consumer of the new feature?

@TeofilC
Collaborator Author

TeofilC commented Nov 4, 2023

I think the main consumer would be tools that want to figure out mutator time. Currently we expose information about GC pauses but don't expose information about event log flush pauses. This information could also be added to Threadscope for instance.
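As a rough illustration of how a tool might use retained block events, the sketch below sums, per capability, the gaps between the end of one block and the start of the next. It rests on a loud assumption that is not confirmed in this thread: that such a gap approximates the time spent flushing. The `Block` type and `flushGapsPerCap` are hypothetical names, not part of ghc-events.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)
import Data.Word (Word64)

type Timestamp = Word64

-- Hypothetical, simplified view of a block event: its capability and
-- its two timestamps. NOT the real ghc-events representation.
data Block = Block
  { blockCap   :: Int
  , blockStart :: Timestamp
  , blockEnd   :: Timestamp
  } deriving (Show, Eq)

-- | For each capability, add up the gaps between consecutive blocks.
-- ASSUMPTION: the gap between one block's end and the next block's start
-- on the same capability approximates eventlog flush time.
-- Blocks must be given in file order; 'sortOn' is stable, so that order
-- is preserved within each capability.
flushGapsPerCap :: [Block] -> [(Int, Timestamp)]
flushGapsPerCap blocks =
  [ (blockCap b, totalGap grp)
  | grp@(b : _) <- groupBy ((==) `on` blockCap) (sortOn blockCap blocks)
  ]
  where
    totalGap bs =
      sum [ blockStart next - blockEnd prev
          | (prev, next) <- zip bs (drop 1 bs)
          , blockStart next >= blockEnd prev  -- guard against Word64 underflow
          ]
```

Subtracting these totals (together with GC pauses) from elapsed time would be one way to estimate mutator time, if the assumption above holds.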

There's also quite an old GHC ticket asking for this: https://gitlab.haskell.org/ghc/ghc/-/issues/11950. Surprisingly, this feature was already implemented in the eventlog when that ticket was opened (!), but it was just not exposed by ghc-events.

The other appeal is that it makes it a bit easier to process event logs in a streaming way. For instance, currently the API doesn't expose a way to write an eventlog without sorting all the events (in order to create dummy eventblocks). I recently ran into this when trying to filter out a small time range from a very large eventlog. It would also make it a bit easier to process events in order without sorting the eventlog, though it could be done without this too.
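The time-range use case could look roughly like the sketch below once blocks are kept: blocks are processed lazily in the order they arrive, and no global sort is needed. The `Block` type (with its events inlined as timestamp/label pairs) and `filterRange` are hypothetical, invented for illustration only.

```haskell
import Data.Word (Word64)

type Timestamp = Word64

-- Hypothetical, simplified block carrying its own events.
-- NOT the real ghc-events representation.
data Block = Block
  { blockCap    :: Int
  , blockStart  :: Timestamp
  , blockEnd    :: Timestamp
  , blockEvents :: [(Timestamp, String)]
  } deriving (Show, Eq)

-- | Keep only events within [lo, hi], one block at a time.
-- Blocks that do not overlap the range, or end up empty, are dropped.
-- The block's own timestamps are left untouched; whether they should be
-- adjusted after filtering is left open here.
filterRange :: Timestamp -> Timestamp -> [Block] -> [Block]
filterRange lo hi blocks =
  [ b { blockEvents = kept }
  | b <- blocks
  , blockEnd b >= lo, blockStart b <= hi
  , let kept = [ e | e@(t, _) <- blockEvents b, t >= lo, t <= hi ]
  , not (null kept)
  ]
```

Because the result is produced lazily, a consumer can start writing filtered blocks before the whole input has been read.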
