Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add local development hot-reloading for experiment reports #862

Open
myanvoos opened this issue Mar 13, 2025 · 9 comments
Open

Add local development hot-reloading for experiment reports #862

myanvoos opened this issue Mar 13, 2025 · 9 comments

Comments

@myanvoos
Copy link
Contributor

myanvoos commented Mar 13, 2025

To my understanding, right now while an experiment is running locally with ./run-all-experiments.py, the results go in the results directory (or as specified by the user) whenever they're available. The results-report directory is only updated after we manually run python -m report.web -r results -o results-report. This means that currently there's no automatic refresh/update of the report while the experiment is still running locally.

It might be good to add a functionality that watches the results directory for any new finished experiment, then automatically updates results-report (or equivalent) and hot-reloads the HTTP server accordingly. So you can run the local HTTP server alongside experiments. Perhaps this can be triggered with a --watch flag on ./run_all_experiments.py or on python -m report.web -r results -o results-report.

Discussion much appreciated :) I'll have an experimental PR for demonstration up later this week.

@myanvoos
Copy link
Contributor Author

A small note: There's a upload_report.sh script that runs as a subprocess of docker_run.sh that basically does a similar thing, but these scripts don't seem to be triggered in a basic local workflow like

./run_all_experiments.py \
    --model=<model-name> \
    --benchmarks-directory='./benchmark-sets/comparison'

@Karanjot786
Copy link

Hi @myanvoos, I'm very interested in working on this feature. I agree that having automatic hot-reloading of the experiment reports would greatly improve the local development workflow, especially when running experiments with ./run_all_experiments.py. My plan is to add a --watch flag (or similar) that watches the results and report directories and automatically updates the results-report along with hot-reloading the HTTP server.

Could you please assign this issue to me so I can start working on it? Thanks!

@Vinay152003
Copy link

Hi everyone,

I've submitted PR #869 which implements the hot-reloading functionality for experiment reports. This PR introduces a new --watch flag to run_all_experiments.py that uses Python's watchdog library to monitor the results directory. When changes are detected, it automatically triggers an update of the results-report directory by running the report generation command.

Thank you for the discussion and feedback so far. I'm looking forward to your review and am happy to make further adjustments if needed!

@myanvoos
Copy link
Contributor Author

myanvoos commented Mar 14, 2025

Hi @Vinay152003 and @Karanjot786 ,

I think before working on this issue we should discuss with @DonggeLiu and @DavidKorczynski about whether this is worth doing or not first. I opened the issue for discussion only as I am not a maintainer of this repo and this is just an idea of mine. We should consider how the idea fits into the broader structure of the project before jumping ahead to implement it since it may lead to headaches for maintainers later on. Especially since as mentioned there's the upload_report.sh script that basically does the same thing but runs from docker_run.sh only.

@Karanjot786
Copy link

Hi @myanvoos,

I completely agree with your points. That’s exactly why I’m reaching out—to open a discussion with @DonggeLiu and @DavidKorczynski about how this idea might fit into the broader project structure. I want to ensure that if we move forward, it aligns well with the maintainers' vision and minimizes potential headaches.

@DonggeLiu
Copy link
Collaborator

DonggeLiu commented Mar 15, 2025

Thanks @myanvoos for proposing this, @Vinay152003 for the PR, and @Karanjot786 for the discussion.
My thoughts:

  1. This is indeed a low priority task, given we run most of our experiments online, which automatically run report.web every few minutes.
  2. Would it be too often If we update the report every time the results directory changes? The report.web takes sometime to complete in large experiments.
  3. If we are going to add this feature, I would prefer to separate it from run-all-experiments.py. E.g., Another script that we can manually start and keeps running in parallel. run-all-experiments.py already takes a lot of command line args and let's avoid overcomplicates it.

@Karanjot786
Copy link

Hi @DonggeLiu,

Thanks for the feedback. I agree that a separate, manually-triggered script for hot-reloading makes sense, especially to avoid overloading run-all-experiments.py. We could also consider throttling updates to manage load during large experiments.
Looking forward to your experimental PR!

@myanvoos
Copy link
Contributor Author

myanvoos commented Mar 16, 2025

Thanks @myanvoos for proposing this, @Vinay152003 for the PR, and @Karanjot786 for the discussion. My thoughts:

1. This is indeed a low priority task, given we run most of our experiments online, which automatically [run `report.web` every few minutes](https://github.com/google/oss-fuzz-gen/blob/main/report/upload_report.sh#L105).

2. Would it be too often If we update the report every time the results directory changes? The `report.web` takes sometime to complete in large experiments.

3. If we are going to add this feature, I would prefer to separate it from `run-all-experiments.py`. E.g., Another script that we can manually start and keeps running  in parallel. `run-all-experiments.py` already takes a lot of command line args and let's avoid overcomplicates it.

A separate process sounds good since this task is mainly targeting local development workflows and should be optional. I reckon we can have something like run-report-server.py that has the following command line args:

./run-report-server.py \ 
		-r <results-dir> \ 
		-o <output-dir> \ 
		[-p <port>] \ 
		[-i <interval-seconds>] \ 
		[-w | --watch-fs] \
		[-t | --watch-template]

where

# Generate reports once and serve them, no hot-reloading (default)
./run-report-server.py -r <results-dir> -o <output-dir> 

# Generate reports with time-based reloading (every 10 minutes to match with the cloud script) 
./run-report-server.py -r <results-dir> -o <output-dir> -i 600 

# Generate reports with filesystem watching (reload on changes) 
./run-report-server.py -r <results-dir> -o <output-dir> -w 

# Generate reports with both time and filesystem-based reloading 
./run-report-server.py -r <results-dir> -o <output-dir> -i 600 -w

# Generate reports with both time and filesystem-based reloading
# This watches the report/ folder too for development purposes
./run-report-server.py -r <results-dir> -o <output-dir> -i 600 -w -t

Edit: Or we can extend the existing command line argument parser in web.py instead

@DonggeLiu
Copy link
Collaborator

Yep, extending web.py sounds good.

DonggeLiu added a commit that referenced this issue Mar 24, 2025
Related discussion #862 

This PR extends the command line parser in
[`web.py`](https://github.com/google/oss-fuzz-gen/blob/main/report/web.py)
to take in some additional inputs as outlined
[here](#862 (comment)).
It also adds an optional server-side hot-reloading functionality with
the `watchdog` library.

Happy to iterate upon this @DonggeLiu, and whenever you're ready

---------

Co-authored-by: Dongge Liu <donggeliu@google.com>
DonggeLiu added a commit that referenced this issue Mar 24, 2025
Related discussion #862 

This PR extends the command line parser in
[`web.py`](https://github.com/google/oss-fuzz-gen/blob/main/report/web.py)
to take in some additional inputs as outlined
[here](#862 (comment)).
It also adds an optional server-side hot-reloading functionality with
the `watchdog` library.

Happy to iterate upon this @DonggeLiu, and whenever you're ready

---------

Co-authored-by: Dongge Liu <donggeliu@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants