Landsat Workflow Automation #7
base: main
Conversation
@tcnichol Hi Todd, maybe you could help here. Do I need to use your files? It would be ideal if you could just put the bash commands in here:

```bash
# step 1: Google Earth Engine to Google Cloud Storage.

# step 2: GCS to HPC.

# step 3: run HPC jobs.
generate_slurm_for_site.py ???

# step 4a: upload results to Clowder.
bash ~/codeflare_utils/landsat_workflow/landsattrend/import_export/upload_region_output.sh https://pdg.clowderframework.org/ 981ab4c8-7d22-418d-93a2-b47019c2f583 ALASKA /scratch/bbou/toddn/landsat-delta/landsattrend/process 649232e2e4b00aa1838f0fc2
echo "Completed Step 4a: 'upload_region_output.sh'"

# step 4b: upload results to Clowder.
bash ~/codeflare_utils/landsat_workflow/landsattrend/import_export/upload_region.sh https://pdg.clowderframework.org/ 981ab4c8-7d22-418d-93a2-b47019c2f583 ALASKA /scratch/bbou/toddn/landsat-delta/landsattrend/process 649232e2e4b00aa1838f0fc2
echo "Completed Step 4b: 'upload_input_regions'"
```
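A minimal driver sketch tying the steps above into one script that stops at the first failure. The `run_step` helper and the `echo` stand-ins are my assumptions, not code from the repo; the real helper invocations shown above would replace the stand-ins.

```shell
#!/usr/bin/env bash
# Hypothetical pipeline driver: run each step in order, stop on first failure.
set -u

run_step() {
  # run_step <label> <command...>: run the command and report the outcome.
  local label=$1
  shift
  if "$@"; then
    echo "Completed ${label}"
  else
    echo "FAILED ${label}" >&2
    return 1
  fi
}

# `echo` stand-ins below; substitute the real bash commands from above.
run_step "step 2: GCS to HPC" echo "download region" &&
run_step "step 3: run HPC jobs" echo "submit slurm jobs" &&
run_step "step 4a: upload results" echo "upload to Clowder"
```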
For exporting to the cloud, there are Python scripts in another repo: one exports a single zone, and another exports everything.
Kastan, here is what I think are the answers. I have added some new files that should handle all regions and sites rather than just one at a time, which should be useful for automating.
For the export scripts, `region` comes from: ALASKA, CANADA, EURASIA1, EURASIA2, EURASIA3, TEST. The export-everything script takes no arguments, but exports all the regions to the bucket. TODO for the future: add parameters for start and end year, since that is what will change over time.

For downloading from the bucket:
- https://github.com/initze/landsattrend/blob/dev4Clowder_Ingmar_deployed_delta/import_export/cloud_download_all_regions.py `$download_directory`: downloads all regions.
- https://github.com/initze/landsattrend/blob/dev4Clowder_Ingmar_deployed_delta/import_export/cloud_download_region.py `$region $download_directory`: this one also takes the region as an argument.
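A small sketch of how the region argument could be validated against the list above before calling the download helper. The `valid_region` function and the placeholder download directory are my assumptions; the actual helper call is left commented out.

```shell
# Hypothetical guard: check the region argument against the allowed list
# before invoking cloud_download_region.py (real call commented out).
valid_region() {
  case " ALASKA CANADA EURASIA1 EURASIA2 EURASIA3 TEST " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

region="${1:-ALASKA}"
download_directory="${2:-/tmp/landsat_download}"   # placeholder path

if valid_region "$region"; then
  echo "downloading region: $region"
  # python cloud_download_region.py "$region" "$download_directory"
else
  echo "unknown region: $region" >&2
  exit 1
fi
```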
A new Python script will generate the SLURM files for all sites.
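A sketch of what generating SLURM files for all sites might look like: one batch file per site, filled from a template. The site names, resource requests, and processing command here are placeholders, not taken from the repo.

```shell
# Hypothetical generate-slurm-for-all-sites step: write one batch file per site.
outdir="$(mktemp -d)"
count=0
for site in site_01 site_02; do          # placeholder site IDs
  cat > "${outdir}/process_${site}.slurm" <<EOF
#!/bin/bash
#SBATCH --job-name=landsattrend_${site}
#SBATCH --nodes=1
#SBATCH --time=24:00:00
python process_site.py ${site}
EOF
  count=$((count + 1))
done
echo "generated ${count} slurm files in ${outdir}"
```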
- https://github.com/initze/landsattrend/blob/dev4Clowder_Ingmar_deployed_delta/import_export/upload_data.py `$url $key $landsat_space_id $data_dir`: uploads all the input data.
- https://github.com/initze/landsattrend/blob/dev4Clowder_Ingmar_deployed_delta/import_export/upload_process.py `$url $key $landsat_space_id $process_dir`: uploads all the results (the contents of `process`). This script makes assumptions about the folder structure under `process`; I should probably make it more generic.
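Since both upload helpers take the same four positional arguments, a wrapper could fail fast on a missing or empty one before kicking off a long upload. This preflight check is my sketch, not part of the repo.

```shell
# Hypothetical preflight: verify that all four upload arguments
# (url, key, space id, directory) are present and non-empty.
check_upload_args() {
  if [ "$#" -ne 4 ]; then
    echo "usage: <url> <key> <landsat_space_id> <dir>" >&2
    return 1
  fi
  local a
  for a in "$@"; do
    if [ -z "$a" ]; then
      echo "empty argument" >&2
      return 1
    fi
  done
}

if check_upload_args "https://pdg.clowderframework.org/" "API_KEY" "SPACE_ID" "/tmp/process"; then
  echo "args ok"
  # python upload_data.py    <url> <key> <landsat_space_id> <data_dir>
  # python upload_process.py <url> <key> <landsat_space_id> <process_dir>
fi
```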
See the Landsat repo for the corresponding PR enabling these changes: initze/landsattrend#17
Landsat full pipeline to automate:

1. Google Earth Engine to Google Cloud Storage (helper.sh?)
   a. Can take days; not great observability. Gives a list of files, but not all files show up in the storage bucket because irrelevant ones are filtered out (like all-water images).
2. GCS to HPC (helper.sh?)
3. Run HPC jobs
   a. Generate SLURM files (generate_slurm_for_site.py?)
   b. Run the SLURM files
   c. Port forward the Ray dashboard from the running nodes (from SLURM) to the user's laptop
4. Upload to Clowder
   a. Input data to Clowder (helper.sh?)
   b. ✅ Results to Clowder: upload_region_output.sh
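The Ray-dashboard port forward mentioned above might look like the following. The hostnames are placeholders and 8265 is Ray's default dashboard port; this is a sketch, not the repo's actual helper.

```shell
# Forward the Ray dashboard port from an HPC compute node to the laptop.
# "user", "login.hpc.example.edu", and "compute-node-01" are placeholders.
ssh -N -L 8265:compute-node-01:8265 user@login.hpc.example.edu
# Then open http://localhost:8265 in a local browser.
```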
@tcnichol can you help me identify the right helper.sh/py files for each step?