Enable optimisations with Chain
#2004
Comments
Speaking of Lux's documentation, it looks absolutely beautiful - any chance Flux would consider using a different theme?
We could transition to Pollen
As far as optimization goes, I don't think the Lux optimizations do what you're proposing. Instead, they recursively go through the […] EDIT: Okay, I see in Lux that […] That being said, it wouldn't be too difficult to build the kind of optimization you're talking about using […]
I'll try and benchmark to see if there's a difference. But among other things, it makes porting weights from other libraries easier despite offering a little more functionality in lieu of pre-trained weights if the user wants
Just to answer a few points raised here:
On this point, see #2005. I still think making […]
Another optimisation that caught my eye was the flattening of nested Chains. Would that be something that would maybe help with TTFG and the backward pass times?
Not really. The short templated chains mean we can often use the compiled gradient code across a model.
In long `Chain`s in Metalhead, it is often the case that there are layers that can be reduced to `identity` - `Dropout(p = 0)` is a frequent occurrence, along with some other similar regularisation layers (`DropBlock`, `DropPath`). Currently, according to Lux's documentation, there is an option to enable and disable optimisations that can remove these and make the model a little cleaner to go through. Is there a chance something similar can be implemented for Flux?