Define Runtime Metrics #204

Open
brianlee731 opened this issue Sep 10, 2024 · 2 comments
Labels: documentation (Improvements or additions to documentation), enhancement (New feature or request), Feature (Feature label used in Unity Project), PSE

Comments

@brianlee731
Contributor

brianlee731 commented Sep 10, 2024

Not in any particular order:

Define the data and metrics we want to capture

  • Stage in, stage out, processing
  • CPU, memory usage, network I/O
  • Job inputs (STAC + params)
  • Job outputs
  • CWL run logs

Storage options for captured metrics

  • S3? Upload as a part of the outputs?
  • Kibana/EKS
  • Needs to be "forever"

Document how to capture these logs

Are there near-term 'wins' for this? Can we upload the CWL logs to S3 after a job, keyed on the Airflow job/OGC process ID?
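
To make the list above concrete, here is a rough sketch of what a per-job record and an S3 key layout could look like. Everything in it is an assumption for discussion: the `JobMetrics` field names, the `job-metrics` prefix, and the `log_key` helper are hypothetical and not taken from the existing codebase. It uses only the standard library and does not perform the upload itself.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class JobMetrics:
    """Hypothetical per-job record; field names are illustrative only."""

    job_id: str  # assumed: the Airflow job / OGC process ID as a string
    stage_in_seconds: float
    processing_seconds: float
    stage_out_seconds: float
    cpu_seconds: float
    peak_memory_mb: float
    network_bytes: int
    inputs: dict = field(default_factory=dict)   # e.g. STAC reference + params
    outputs: list = field(default_factory=list)  # e.g. output locations


def log_key(job_id: str, filename: str, prefix: str = "job-metrics") -> str:
    """Build an object key that groups all artifacts of one job together."""
    return f"{prefix}/{job_id}/{filename}"


def to_json(metrics: JobMetrics) -> str:
    """Serialize the record so it can be stored next to the CWL log."""
    return json.dumps(asdict(metrics), sort_keys=True)


record = JobMetrics(
    job_id="example-job-1",
    stage_in_seconds=12.5,
    processing_seconds=340.0,
    stage_out_seconds=8.0,
    cpu_seconds=310.2,
    peak_memory_mb=2048.0,
    network_bytes=5_000_000,
    inputs={"stac": "example-catalog.json", "params": {"threshold": 0.5}},
    outputs=["example-output.tif"],
)

print(log_key(record.job_id, "cwl.log"))
print(log_key(record.job_id, "metrics.json"))
print(to_json(record))
```

Keying everything on the job ID would let the CWL log and the metrics JSON be found from the same ID a user already has. Whether S3 is the right long-term store is still the open question above.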

Original Note:

  1. Add runtime metrics to job output (e.g. the cwl logs for running algorithm vs 'job')
  2. Job details/metrics should be stored for a long time.
  3. Document what metrics we're capturing and how users can access them
brianlee731 converted this from a draft issue Sep 10, 2024
brianlee731 added the documentation, enhancement, U-SPS, and Feature labels Sep 10, 2024
mike-gangl moved this from Backlog to Design Phase in Unity Project Board Sep 17, 2024
mike-gangl added the PSE label and removed the U-SPS label Sep 19, 2024
@mike-gangl
Contributor

How do these individual job/process metrics relate to #213?

@rtapella
Collaborator

What's the objective behind these measurements? What are we trying to monitor or learn? (E.g., some may just be performance/load tracking, cost tracking, etc., and some may be related to user success.)

Projects
Status: Design Phase
Development

No branches or pull requests

3 participants