First draft of methodology for modeling generative AI systems #221
Conversation
@@ -0,0 +1,12 @@
```
where is this snippet used?
It's not, but I figured we'd want it when we do the model later. I wasn't sure if we'd do an AI variant of calculations.mdx, so it's really just here as a placeholder.
| --------- | -------------- |
| Total reserved time | 118 days |
| Reservation start time | January 2022 (?) |
| GPU hours for final model | 1,082,990 |
Are GPU hours supposed to be the hours the GPUs were used during the intermediate and final training?
In this case, with a cluster of 384 GPUs, the total GPU hours work out to 117.5 cluster days (1,082,990 / 384 / 24), which is the same as the total reserved time.
I would have expected the total reserved time in days to be more than what the GPU hours for the final model alone imply.
Right, that's where we backfill the intermediate runs down below when we normalize. If you have ideas on how to clarify this, please let me know.
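
To make the arithmetic in the comment above concrete, here is a minimal sketch of the cluster-days check. The 384-GPU cluster size and the 1,082,990 GPU-hour figure come from the table above; nothing else is assumed.

```python
# Reviewer's sanity check: convert total GPU-hours into whole-cluster days.
total_gpu_hours = 1_082_990   # GPU hours for the final model (from the table above)
cluster_size = 384            # GPUs in the reserved cluster
hours_per_day = 24

cluster_days = total_gpu_hours / cluster_size / hours_per_day
print(f"{cluster_days:.1f} cluster days")  # ~117.5, essentially the full 118 reserved days
```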
Embodied emissions = (cluster embodied emissions per hour) x (training time)

Usage emissions = (usage energy per GPU-hour) x (total GPU hours) x (average grid intensity during training)
If, instead of total GPU hours, we had the GPU hours utilized per hour of wall-clock time, we could potentially leverage the actual grid mix during each of those hours. I'm thinking of cases where training could be paused and resumed so that it only runs during times when the marginal grid intensity is low.
Yep, agreed, though I was thinking this would be easiest to represent as a lower average grid intensity?
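
As a rough sketch of the two options discussed in this thread, the snippet below compares the draft's average-intensity formula with an hourly-weighted variant. The energy-per-GPU-hour value and the hourly log are illustrative assumptions, not figures from the draft.

```python
# Usage-emissions formula from the diff, in two variants:
#  (1) total GPU hours x average grid intensity, as written in the draft, and
#  (2) a per-hour breakdown weighted by the actual intensity in each hour.
usage_energy_per_gpu_hour = 0.4  # kWh per GPU-hour (illustrative assumption)

# Hypothetical hourly log: (GPU hours run in that hour, grid intensity in gCO2e/kWh)
hourly_log = [
    (384, 450),  # high-intensity hour
    (384, 300),
    (0, 520),    # training paused during a dirty-grid hour
    (384, 120),  # low-intensity hour
]

total_gpu_hours = sum(gpu_hours for gpu_hours, _ in hourly_log)
average_intensity = sum(intensity for _, intensity in hourly_log) / len(hourly_log)

# Variant 1: the draft's formula with an average grid intensity
usage_emissions_avg = usage_energy_per_gpu_hour * total_gpu_hours * average_intensity

# Variant 2: weight each hour's energy by that hour's intensity
usage_emissions_hourly = sum(
    usage_energy_per_gpu_hour * gpu_hours * intensity
    for gpu_hours, intensity in hourly_log
)

print(f"average-intensity estimate: {usage_emissions_avg:,.0f} gCO2e")
print(f"hourly-weighted estimate:   {usage_emissions_hourly:,.0f} gCO2e")
```

Pausing during dirty-grid hours is exactly the case where the two estimates diverge, which is one way to represent it as "a lower average grid intensity": the hourly-weighted result implies an effective average below the simple mean.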
LR updating phase description. Testing for update process going forward
Remove extra word