Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider making the :expand option the default in @layer #2531

Closed
CarloLucibello opened this issue Nov 24, 2024 · 7 comments · Fixed by #2532
Closed

Consider making the :expand option the default in @layer #2531

CarloLucibello opened this issue Nov 24, 2024 · 7 comments · Fixed by #2532
Milestone

Comments

@CarloLucibello
Copy link
Member

It doesn't seem to be really needed, we could automatically detect if the layer contains only leaf types or is a nested container

@CarloLucibello CarloLucibello added this to the v0.15 milestone Nov 24, 2024
@mcabbott
Copy link
Member

No, we had this automatic detection, and it didn't work.

@CarloLucibello
Copy link
Member Author

what is exactly that is not working? I'm testing now with making everything :expand and I don't see any problem

@mcabbott
Copy link
Member

Every layer whose implementation contains other layers, or things which could plausibly be layers, but whose interface does not.

@CarloLucibello
Copy link
Member Author

Ok, I see that we want:

julia> LayerNorm(10)
LayerNorm(10)       # 20 parameters

julia> MultiHeadAttention(64 => 1024 => 1024, nheads = 8)
MultiHeadAttention(64 => 1024 => 1024; nheads=8)  # 1_245_184 parameters

Maybe we should change the default then? I feel that 90% cases one would do with expand, since it cover both the layer with plain arrays and the container type.
I don't really like that Flux.@layer :expand Model would come out so often in code and since the meaning is not obvious one has to look in the documentation.

@mcabbott
Copy link
Member

IDK, maybe the other default would have been better, but I'm against unnecessary churn.

My guess was that most people making layers are wrapping some parameter arrays, not building container layers. But I have no data.

The present default means that if you don't ever think about :expand, you get too-compact printing? Maybe I thought that was better than accidentally printing 100 pages of parameters. Although now I fixed that problem & it should print summary(x).

@mcabbott mcabbott changed the title remove the :expand option from layer Consider making the :expand option the default in @layer Nov 24, 2024
@CarloLucibello
Copy link
Member Author

I think most people don't make layers, they make blocks and models.
In pytorch the default is something like :expand.
Since it only affects the printing, moving to expand as a default, and having another keyword (:noexpand or :compact?) for the layer print shouldn't be considered breaking I guess. If so, we can do it also after v0.15 although I would be in favor of doing it now.

@CarloLucibello CarloLucibello removed this from the v0.15 milestone Nov 25, 2024
@mcabbott
Copy link
Member

Yea like I said, no data on which is more common. Certainly Flux's source has mostly compact ones.

Having another option (but and not breaking existing @layer :expand Layer code) and then changing the default seems OK. But seems polite to do on a breaking change, not just at some random time.

@CarloLucibello CarloLucibello added this to the v0.15 milestone Nov 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants