Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

small fixes in input for calculate_indices() rgb() #1277

Merged
merged 6 commits into from
Jan 19, 2025
Merged

small fixes in input for calculate_indices() rgb() #1277

merged 6 commits into from
Jan 19, 2025

Conversation

GL-S
Copy link
Collaborator

@GL-S GL-S commented Jan 16, 2025

Proposed changes

added line to prevent warnings from dask.
Changed collection name in calculate_indices(), and reset band names in rgb() to default to avoid errors

Closes issues (optional)

Checklist

(Replace [ ] with [x] to check off)

  • Notebook created using the DEA-notebooks template
  • Clear all outputs, run notebook from start to finish, and save the notebook in the state where all cells have been sequentially evaluated
  • Test notebook on both the NCI and DEA Sandbox (flag if not working as part of PR and ask for help to solve if needed)

added line to prevent warnings from dask.
Changed collection name in calculate_indices(), and reset band names in rgb() to default to avoid errors
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@@ -60,7 +60,7 @@
"These options will run a single cell at a time.\n",
"\n",
"To automatically run all cells in a notebook, navigate to the \"Run\" tab of the menu bar at the top of JupyterLab and select \"Run All Cells\" (or the option that best suits your needs).\n",
"When a cell is run, the cell's content is executed.\n",
"When a cell is run,the cell's content is executed.\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GL-S - might need to revert this change - guessing it was the one testing git access

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are correct, I will

GL-S added 3 commits January 16, 2025 16:00
revert last test commit
This reverts commit 7d8d503.
This reverts commit adddf58.
Copy link

review-notebook-app bot commented Jan 16, 2025

View / edit / reply to this conversation on ReviewNB

robbibt commented on 2025-01-16T05:19:56Z
----------------------------------------------------------------

Line #19.    dask.config.set({'distributed.worker.daemon': False}); # Prevent some warnings from being printed

Can you elaborate on this - e.g. what warnings? We could potentially move this to the dea_tools.dask module itself if it seems like a good idea, but I'm not exactly sure what this code does (just want to make sure we don't accidently break anything by adding it)


Copy link

review-notebook-app bot commented Jan 16, 2025

View / edit / reply to this conversation on ReviewNB

robbibt commented on 2025-01-16T05:19:57Z
----------------------------------------------------------------

Line #7.        ds = dc.load(product='ga_ls8cls9c_gm_cyear_3', 

Yay, great fix - we must have missed this last year


Copy link

review-notebook-app bot commented Jan 16, 2025

View / edit / reply to this conversation on ReviewNB

robbibt commented on 2025-01-16T05:19:57Z
----------------------------------------------------------------

Line #7.            # default band names: ['nbart_red', 'nbart_green', 'nbart_blue']

I reckon we can probably remove this line if this func now works directly on bands with those names


Copy link

review-notebook-app bot commented Jan 16, 2025

View / edit / reply to this conversation on ReviewNB

robbibt commented on 2025-01-16T05:19:58Z
----------------------------------------------------------------

Line #6.        # default band names: ['nbart_red', 'nbart_green', 'nbart_blue']

Same here; I'd probably remove this line just to keep things simple


@robbibt
Copy link
Member

robbibt commented Jan 16, 2025

This looks great, thankls @GL-S! I left a few minor comments on the notebook... for the Dask thing, my main priority is just making sure the new line doesn't accidently cause any other downstream issues. If we're sure it doesn't, then we could consider moving it from this notebook to the dea_tools.dask module instead to hide it from external users?

@GL-S
Copy link
Collaborator Author

GL-S commented Jan 16, 2025

Thank you for the comments @robbibt .
The warnings appeared when running the model for classifying the satellite images, there is a lot of printing which doesn't look great.
I added the line about the dask's warnings because it was a solution suggested by the warning itself (screenshot added below). I am actually not sure what it does in details, it should have an influence on how the processes run and exit. It's probably safer not to add it to dea_tools.dask yet.
...maybe a filterwarnings in dea_tools.dask is a better solution? It doesn't work when added to the 4_Classify_satellite_data.ipynb

warnings dask

Regarding the comments about the band names, I get your point. I will remove those commented lines

@robbibt
Copy link
Member

robbibt commented Jan 16, 2025

Thanks @GL-S! Yeah, those warnings look terrible, it would definitely be nice to hide them away.

@emmaai Do you know what these Dask warnings in the screenshot above mean, and whether adding dask.config.set({'distributed.worker.daemon': False}) to our dea_tools.dask script would be a safe move?

@emmaai
Copy link
Contributor

emmaai commented Jan 16, 2025

The warning seems to suggest there was a conflict between config worker as daemon and number of jobs = 1. Intuitively I don’t know why it can’t be both. The suggested solution is setting worker process as non-daemon. I don’t think it’s a good idea in general. Also as it runs with scikit-learn, I suspected the warnings actually suggested something different and more serious. Scikit-learn is multi-threading, it’s tricky to run with dask. I didn’t read the notebook. If it couldn’t be understood and resolved with reasonable thought, I’d suggest to keep it until further investigation.

@GL-S
Copy link
Collaborator Author

GL-S commented Jan 17, 2025

Hi all, I have submitted another commit.
I removed the dask.config.set({'distributed.worker.daemon': False}) to be safe, as Emma mentioned that it might not be a good idea. I will follow up on this when the notebooks COP resumes, as James suggested.
I have also removed the commented lines about the default band names.
Effectively, the changes from the version in develop are the updated product and collection names in the function feature_layers. The warnings are still there, but at least the notebook runs with no errors.

Copy link
Member

@robbibt robbibt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks heaps @GL-S, I think this is a good approach for now. Any chance you could create quick Github issue in this repo describing the warnings issue, that we can refer to later to work out what's going wrong?

Other than that, I think this is ready to "Squash and merge" into the develop branch 🙂

@GL-S
Copy link
Collaborator Author

GL-S commented Jan 17, 2025

Sure, I will create an issue now

@GL-S GL-S merged commit 11fcb73 into develop Jan 19, 2025
1 check passed
@GL-S GL-S deleted the GL branch January 19, 2025 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants