add GAT layer #3
Question: Is there a way to do this for input graphs with variable size (number of nodes)?
It seems that they do so in this paper, CODE HERE... I had trouble figuring this out (they also have the extra complication of this clustering thing), but this seems to be a way to do it. My best summary of the procedure:

NOTATION:
- attention heads: h

LAYER PROPERTIES:
- a_l, a_r (splitting the concatenated vector from eq. 3 in Velickovic into two separate parts): [1 x h x m]

LAYER INPUTS:
- graph with n nodes

LAYER ACTIONS (my commentary in italics; see the sketch below):
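Something like this minimal PyTorch sketch might be what the layer actions boil down to. The names (`GATLayer`, `a_l`, `a_r`) and all shapes are my guesses based on the summary above, not code from that repo or this one. The key trick for variable-size graphs is working off an edge list instead of a dense [n x n] adjacency, so nothing is shaped by a fixed n:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """One multi-head GAT layer over an edge list, so n can vary per graph.
    Assumes edge_index already includes self-loops if you want each node
    attending to itself."""

    def __init__(self, in_dim: int, out_dim: int, heads: int):
        super().__init__()
        self.heads, self.out_dim = heads, out_dim
        self.W = nn.Linear(in_dim, heads * out_dim, bias=False)
        # a_l / a_r: the two halves of the concatenated attention vector
        # from Velickovic eq. 3, each of shape [1, heads, out_dim].
        self.a_l = nn.Parameter(torch.randn(1, heads, out_dim) * 0.1)
        self.a_r = nn.Parameter(torch.randn(1, heads, out_dim) * 0.1)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # x: [n, in_dim] node features; edge_index: [2, e] (source, destination)
        src, dst = edge_index
        n = x.size(0)
        h = self.W(x).view(n, self.heads, self.out_dim)        # [n, heads, m]
        # a^T [h_i || h_j] == a_l^T h_i + a_r^T h_j, so we can score
        # each node once per half instead of materializing all pairs.
        score_l = (h * self.a_l).sum(dim=-1)                   # [n, heads]
        score_r = (h * self.a_r).sum(dim=-1)                   # [n, heads]
        e = F.leaky_relu(score_l[dst] + score_r[src], 0.2)     # [e, heads]
        # Softmax over each destination node's incoming edges.
        e = e - e.max()                                        # shift for numerical stability
        num = torch.exp(e)
        denom = torch.zeros(n, self.heads, device=x.device).index_add_(0, dst, num)
        alpha = num / (denom[dst] + 1e-16)                     # [e, heads]
        # Weighted message passing: h_i' = sum_j alpha_ij * W h_j.
        out = torch.zeros_like(h).index_add_(0, dst, alpha.unsqueeze(-1) * h[src])
        return out.reshape(n, self.heads * self.out_dim)
```

Since every intermediate tensor is indexed off `edge_index`, the same layer handles a 5-node graph and a 5000-node graph with no padding or masking.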
What I wonder: does this end up being different from a convolution? If so, how? Since the attention is trainable per-head and per-output-feature, what are we really doing? I think the message passing should somehow encode structure, since a node with more neighbors will have a larger sum of inputs, but how would the attention tensors ever "learn" about that in order to weight things appropriately?
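For reference, here's the contrast as I understand it, writing the standard GCN aggregation (Kipf & Welling) next to GAT's (Velickovic eqs. 3-4). Nothing repo-specific here:

```latex
% GCN: weights fixed by graph structure alone (degrees d_i, d_j)
h_i' = \sigma\Big( \sum_{j \in \mathcal{N}(i)} \tfrac{1}{\sqrt{d_i d_j}}\, W h_j \Big)

% GAT: weights are learned functions of the features themselves
e_{ij} = \mathrm{LeakyReLU}\big( a^\top [\, W h_i \,\|\, W h_j \,] \big), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})}

h_i' = \sigma\Big( \sum_{j \in \mathcal{N}(i)} \alpha_{ij}\, W h_j \Big)
```

On the degree question: since the softmax makes the alpha_ij sum to 1 over each neighborhood, I think the raw neighbor count largely cancels out of the sum, which would explain why it's hard to see how the attention parameters could "learn" degree; that information is mostly normalized away.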
Summary/my takeaways from a chat with Shaojie about this just now:
Bump. Now that we've made some pretty significant changes w.r.t. how we do graph building and featurization, it's probably a good time to discuss how we could go about this?
I would be very psyched for this to be implemented. Realistically, I won't personally have the bandwidth in the super near future (i.e., the next ~4-6 weeks), but I'm certainly happy to discuss and/or sketch it out in more detail for whoever ends up taking a crack at it, be that future me, current/future you, or someone else entirely! :)
Here is another paper: "Crystal graph attention networks for the prediction of stable materials" (doi:10.1126/sciadv.abi7948). I believe the novelty here is "replacing the precise bond distances with embeddings of graph distances", but correct me if I'm wrong.
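If I'm reading that right, the embedding trick could look something like the sketch below: a learned lookup table keyed on integer hop distance, instead of featurizing the continuous bond length. Everything here (`GraphDistanceEmbedding`, `max_hops`, `dim`) is a hypothetical name of mine, not the paper's actual code:

```python
import torch
import torch.nn as nn

class GraphDistanceEmbedding(nn.Module):
    """Learned vector per integer graph (hop) distance, replacing
    continuous bond-distance features. Hypothetical sketch of the idea
    in doi:10.1126/sciadv.abi7948, not their implementation."""

    def __init__(self, max_hops: int = 8, dim: int = 32):
        super().__init__()
        # One row per hop distance 0..max_hops; longer paths get clipped.
        self.table = nn.Embedding(max_hops + 1, dim)

    def forward(self, hop_counts: torch.Tensor) -> torch.Tensor:
        # hop_counts: [e] integer shortest-path distances, one per edge
        clipped = hop_counts.clamp(max=self.table.num_embeddings - 1)
        return self.table(clipped)   # [e, dim] edge features
```

One appealing property, if that reading is right: the features become invariant to small geometric perturbations, since only the graph topology enters.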
I'd like to implement a graph attention mechanism a la this paper.