[New Feature]: Modularize the EMIT workflow #302
@LucaCinquini - Is there a GitHub repo that holds the Dockerfile for the container (
The Docker image is on Docker Hub: https://hub.docker.com/r/godwinshen/emit-ghg
@LucaCinquini I did not write a Dockerfile myself. I simply ran the app pack generation software "locally" and then pushed the image to Docker Hub.
@GodwinShen - I am not familiar with how to use the app pack generation software. I'm just guessing, but maybe you ran this: https://github.com/unity-sds/unity-app-generator on the
@nikki-t Yes, I used unity-app-generator on my fork of the emit-ghg repo (https://github.com/GodwinShen/emit-ghg.git), following the "manual" method in this tutorial: https://unity-sds.gitbook.io/docs/mdps-overview/tutorials/the-development-environment/packaging-an-algorithm
@brianlee731 and I were able to run the EMIT workflow using the modular DAG. This did require a few extra tasks:
Example logs
stage_in
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] Stage in download directory: /data/stage_in/granules
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] total 5577656
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 108709152 Feb 6 20:54 EMIT_L1B_OBS_001_20230620T084426_2317106_011.nc
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 1852557979 Feb 6 20:54 EMIT_L1B_RAD_001_20230620T084426_2317106_011.nc
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 48049971 Feb 6 20:55 EMIT_L2A_MASK_001_20230620T084426_2317106_011.nc
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 1851092306 Feb 6 20:55 EMIT_L2A_RFLUNCERT_001_20230620T084426_2317106_011.nc
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 1851092294 Feb 6 20:54 EMIT_L2A_RFL_001_20230620T084426_2317106_011.nc
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 2375 Feb 6 20:55 G2721220118-LPCLOUD.stac.json
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 2631 Feb 6 20:55 G2721699381-LPCLOUD.stac.json
[2025-02-06, 20:55:53 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 519 Feb 6 20:55 catalog.json
process
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] + ls -l /data/process/l4h7c7z9
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] total 8820812
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 374 Feb 6 21:03 catalog.json
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 7153047462 Feb 5 18:20 dataset_ch4_full.hdf5
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 6359040 Feb 6 21:02 emit20230620T084426_ch4_mf
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 302 Feb 6 21:02 emit20230620T084426_ch4_mf.hdr
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 1673 Feb 6 21:03 emit20230620T084426_ch4_mf.json
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 18310464 Feb 6 21:03 emit20230620T084426_ch4_mf_ort
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 569 Feb 6 21:03 emit20230620T084426_ch4_mf_ort.aux.xml
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 537 Feb 6 21:03 emit20230620T084426_ch4_mf_ort.hdr
[2025-02-06, 21:03:52 UTC] {pod_manager.py:471} INFO - [base] -rw-r--r-- 1 root root 10425 Feb 6 21:02 emit20230620T084426_ch4_target
stage_out
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] + successful_features='{
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "type": "FeatureCollection",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "features": [
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "type": "Feature",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "stac_version": "1.0.0",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "id": "urn:nasa:unity:emit:dev:emit_ghg_test___1:emit20230620T084426_ch4_mf",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "properties": {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "datetime": "2025-02-06T21:03:07.720944Z",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "start_datetime": "2025-02-06T21:03:07.720944+00:00",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "end_datetime": "2025-02-06T21:03:07.720968+00:00",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "created": "2025-02-06T21:03:07.720973+00:00",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "updated": "2025-02-06T21:03:07.721271Z"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] },
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "geometry": null,
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "links": [
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "rel": "root",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "href": "./catalog.json",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "type": "application/json"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] },
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "rel": "parent",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "href": "./catalog.json",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "type": "application/json"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] }
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] ],
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "assets": {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "emit20230620T084426_ch4_mf": {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "href": "s3://unity-bucket/urn:nasa:unity:emit:dev:emit_ghg_test___1/urn:nasa:unity:emit:dev:emit_ghg_test___1:emit20230620T084426_ch4_mf/emit20230620T084426_ch4_mf",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "title": "ENVI file",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "description": "",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "roles": [
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "data"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] ]
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] },
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "emit20230620T084426_ch4_mf.hdr": {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "href": "s3://unity-bucket/urn:nasa:unity:emit:dev:emit_ghg_test___1/urn:nasa:unity:emit:dev:emit_ghg_test___1:emit20230620T084426_ch4_mf/emit20230620T084426_ch4_mf.hdr",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "title": "ENVI_hdr file",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "description": "",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "roles": [
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "data"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] ]
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] },
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "emit20230620T084426_ch4_mf_ort": {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "href": "s3://unity-bucket/urn:nasa:unity:emit:dev:emit_ghg_test___1/urn:nasa:unity:emit:dev:emit_ghg_test___1:emit20230620T084426_ch4_mf/emit20230620T084426_ch4_mf_ort",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "title": "ENVI file",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "description": "",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "roles": [
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "data"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] ]
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] },
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "emit20230620T084426_ch4_mf_ort.hdr": {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "href": "s3://unity-bucket/urn:nasa:unity:emit:dev:emit_ghg_test___1/urn:nasa:unity:emit:dev:emit_ghg_test___1:emit20230620T084426_ch4_mf/emit20230620T084426_ch4_mf_ort.hdr",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "title": "ENVI_hdr file",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "description": "",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "roles": [
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "data"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] ]
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] },
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "emit20230620T084426_ch4_mf.json": {
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "href": "s3://unity-bucket/urn:nasa:unity:emit:dev:emit_ghg_test___1/urn:nasa:unity:emit:dev:emit_ghg_test___1:emit20230620T084426_ch4_mf/emit20230620T084426_ch4_mf.json",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "title": "json file",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "description": "",
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "roles": [
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "metadata"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] ]
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] }
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] },
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "stac_extensions": [],
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "collection": "urn:nasa:unity:emit:dev:emit_ghg_test___1"
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] }
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] ]
[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] }
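For anyone who wants to check the stage_out result without reading the raw log, here is a minimal sketch that strips the Airflow log prefix from lines like the ones above and recovers the logged `successful_features` JSON. The function names (`strip_prefix`, `extract_feature_collection`, `asset_hrefs`) are hypothetical helpers written for this comment, not part of Unity or Airflow, and the regex only assumes the prefix format visible in these logs.

```python
import json
import re

# Prefix format as it appears in the logs above, e.g.
# "[2025-02-06, 21:04:02 UTC] {pod_manager.py:471} INFO - [base] "
LOG_PREFIX = re.compile(r"^\[[^\]]+\] \{[^}]+\} [A-Z]+ - \[base\] ?")
MARKER = "successful_features='"


def strip_prefix(line):
    """Remove the timestamp / source / level prefix from one log line."""
    return LOG_PREFIX.sub("", line.rstrip("\n"))


def extract_feature_collection(log_lines):
    """Parse the JSON object logged after "successful_features='".

    Raises ValueError if the marker or a valid JSON object is not found.
    """
    payload = []
    capturing = False
    for raw in log_lines:
        text = strip_prefix(raw)
        if not capturing:
            idx = text.find(MARKER)
            if idx == -1:
                continue
            capturing = True
            text = text[idx + len(MARKER):]
        payload.append(text)
    joined = "\n".join(payload).lstrip()
    # raw_decode parses the first JSON value and ignores anything after it,
    # such as a closing shell quote.
    obj, _ = json.JSONDecoder().raw_decode(joined)
    return obj


def asset_hrefs(feature_collection):
    """Map each asset name to its href across all features."""
    return {
        name: asset["href"]
        for feature in feature_collection["features"]
        for name, asset in feature["assets"].items()
    }
```

Run against the stage_out log above, `asset_hrefs(extract_feature_collection(lines))` should give one `s3://` href per uploaded file, which makes it easy to compare against the files listed by the process step.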
This is great, thanks so much to you and Brian. Let's review and demo on Monday.
The old version of the EMIT workflow is here:
http://awslbdockstorestack-lb-1429770210.us-west-2.elb.amazonaws.com:9998/api/ga4gh/trs/v2/tools/%23workflow%2Fdockstore.org%2FGodwinShen%2Femit-ghg/versions/9/plain-CWL/descriptor/workflow.cwl
and the parameter files for executing in unity-venue-dev and unity-venue-test are:
https://raw.githubusercontent.com/GodwinShen/emit-ghg/refs/heads/main/test/emit-ghg-dev.json
https://raw.githubusercontent.com/GodwinShen/emit-ghg/refs/heads/main/test/emit-ghg-test.json
This task involves updating the EMIT workflow to follow the new stage-in / process / stage-out design. The Docker image that executes processing (godwinshen/emit-ghg:bc61e769) will probably have to be updated as well.
Please work with @ngachung and @brianlee731 for help with the Data Services functionality and overall EMIT design.
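To make the stage-in / process / stage-out split concrete, here is a rough sketch in plain Python. It is only an illustration of the hand-off pattern (each stage reads the previous stage's `catalog.json` and writes its own), not the actual Unity DAG or Data Services code. Every function name, the simplified catalog layout, and the `example-bucket` name are assumptions, and the download and upload steps are stubbed out.

```python
import json
import tempfile
from pathlib import Path


def write_catalog(directory, file_names):
    """Write a simplified, STAC-like catalog.json listing files in a directory."""
    catalog = {
        "type": "Catalog",
        "id": directory.name,
        "links": [{"rel": "item", "href": f"./{name}"} for name in file_names],
    }
    path = directory / "catalog.json"
    path.write_text(json.dumps(catalog, indent=2))
    return path


def stage_in(granule_names, download_dir):
    """Stub: a real stage-in task would download each granule here."""
    download_dir.mkdir(parents=True, exist_ok=True)
    for name in granule_names:
        (download_dir / name).write_bytes(b"")  # stand-in for downloaded data
    return write_catalog(download_dir, granule_names)


def process(input_catalog, output_dir):
    """Stub: read the input catalog and write one placeholder product."""
    output_dir.mkdir(parents=True, exist_ok=True)
    links = json.loads(input_catalog.read_text())["links"]
    inputs = [link["href"] for link in links]
    product = f"product_from_{len(inputs)}_inputs.txt"
    (output_dir / product).write_text("\n".join(inputs))
    return write_catalog(output_dir, [product])


def stage_out(output_catalog, bucket):
    """Stub: return the S3 URIs an upload step would write, without uploading."""
    links = json.loads(output_catalog.read_text())["links"]
    return [f"s3://{bucket}/{Path(link['href']).name}" for link in links]


def run_pipeline(granule_names, bucket="example-bucket"):
    """Chain the three stages, passing only catalog paths between them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        in_catalog = stage_in(granule_names, root / "stage_in")
        out_catalog = process(in_catalog, root / "process")
        return stage_out(out_catalog, bucket)
```

The point of the sketch is that the process step only depends on a directory plus its catalog, so the container image can be swapped without touching the stage-in or stage-out tasks.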