Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OP for 4D tensor #1

Merged
merged 35 commits into from
Nov 26, 2024
Merged

OP for 4D tensor #1

merged 35 commits into from
Nov 26, 2024

Conversation

Nathaniel-github
Copy link
Collaborator

No description provided.

@Nathaniel-github
Copy link
Collaborator Author

Main questions:

-Is the reshaping going to be an issue (should we do it manually/iteratively)?
-How should we construct tests for this method and has Andrew already written something similar?

@Nathaniel-github Nathaniel-github marked this pull request as draft October 23, 2024 21:11
@Nathaniel-github
Copy link
Collaborator Author

@aarmey just discussed this code with Andrew yesterday and was wondering what your input is on the questions above/ the code overall

Copy link
Member

@aarmey aarmey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial requests. Please also remove the overall Pf2 code for now. Let's just focus on implementing and testing the OP step.

cellcommunicationpf2/cc-pf2.py Outdated Show resolved Hide resolved
cellcommunicationpf2/cc-pf2.py Outdated Show resolved Hide resolved
cellcommunicationpf2/cc-pf2.py Outdated Show resolved Hide resolved
cellcommunicationpf2/cc-pf2.py Outdated Show resolved Hide resolved
Copy link
Member

@aarmey aarmey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good. Besides the little stuff here, can you write a test for project_data? It should just randomly generate a dataset and factors, then try the function. This will help ensure we have all the sizes right.

cellcommunicationpf2/cc-pf2.py Outdated Show resolved Hide resolved
cellcommunicationpf2/cc-pf2.py Outdated Show resolved Hide resolved
cellcommunicationpf2/cc-pf2.py Outdated Show resolved Hide resolved
cellcommunicationpf2/cc-pf2.py Outdated Show resolved Hide resolved
@Nathaniel-github
Copy link
Collaborator Author

@aarmey I made the test and as we assumed it isn't able to solve for the projection matrix correctly. I was thinking about possible solutions/reasons and I am fairly certain reshaping and averaging the b^2 x X^2 matrix to get a single projection matrix isn't solving for a meaningful projection matrix (b x X). The alternatives I can think of would be:
-solving for 2 separate projection matrices that project the sender and receiver cells respectively to their eigen states
-doing an iterative solve by trying to minimize the error from performing the Q.t @ slice @ Q operation where Q is the proj matrix

I am not too certain about the viability of solving for separate projection matrices but the iterative solution would have the issue of not guaranteeing a global minimum in error. Not sure if you have any other ideas but I was going to draft up an implementation for each just to see what it might look like.

@aarmey aarmey marked this pull request as ready for review November 21, 2024 15:24
Copy link
Member

@aarmey aarmey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can merge these after these changes. I agree that it would be best to try a different approach.

cellcommunicationpf2/test.py Outdated Show resolved Hide resolved
cellcommunicationpf2/test.py Outdated Show resolved Hide resolved
cellcommunicationpf2/test.py Outdated Show resolved Hide resolved
cellcommunicationpf2/test.py Outdated Show resolved Hide resolved
.github/workflows/test.yml Outdated Show resolved Hide resolved
@Nathaniel-github
Copy link
Collaborator Author

@aarmey I wrote up the implementation using pymanopt but the test is still failing. There were some issues with pymanopt requiring the usage of autograd numpy and as a result I couldn't use slicing operations but I believe I was able to circumvent it correctly but if anything that is probably where there might be some logic incorrect. I did also notice the solving took an insane amount of time so I am not sure how viable it will be with actual scRNA-seq data given this was just smaller dummy data anyway.

Copy link
Member

@aarmey aarmey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nathaniel-github I tried to clean this up some because there was a lot of confusing extra cruft. This is now broken, but it should be clear how to fix the frank errors. Make whatever adjustments you see, then we can re-visit why the tests are failing.

@Nathaniel-github
Copy link
Collaborator Author

@aarmey Fixed those small errors and it seems like we are pretty close. It seems to be able to correctly solve for projection matrices about 1/10 of the time in the way the assert is written. However, it is able to solve for "a correct" projection matrix 100% of the time. When comparing the projection of the recreated tensor by the recreated projection matrix to the projectedX data it matches every time. However, when comparing the two projection matrices they differ more often than not. I wrote a second test to highlight this behavior. It is identical to the one that passes except instead of comparing tensors it compares the projection matrices.

TLDR I believe all this means is the way we construct this the projection matrix is NOT unique.

@aarmey
Copy link
Member

aarmey commented Nov 26, 2024

@Nathaniel-github does it fail 1/2 the time for each matrix? If so, it's probably because it allows reflections. A pure rotation matrix has a determinant of 1, never -1: https://en.wikipedia.org/wiki/Rotation_matrix

If so, you can fix this by taking the SVD decomposition of the projection matrix, and replacing the eigenvalues with all 1's.

S, V, D = np.linalg.svd(P)
P = S @ D

@Nathaniel-github
Copy link
Collaborator Author

Nathaniel-github commented Nov 26, 2024

@aarmey Seems like roughly 1/10.

Screenshot 2024-11-25 at 8 20 30 PM

Sorry ignore this I was running this to assert that all matrices passed.

@Nathaniel-github
Copy link
Collaborator Author

@aarmey Sorry ignore that last one I was counting that all the projections matrices were correct. Seems like that is the issue! Failing roughly 1/2:

Screenshot 2024-11-25 at 8 28 38 PM

@Nathaniel-github
Copy link
Collaborator Author

@aarmey Seems like it is correctly solving now. There is still the occasional fail though:

Screenshot 2024-11-25 at 9 01 24 PM

Copy link
Member

@aarmey aarmey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool! I think this is working well enough, so I'm going to merge.

@aarmey aarmey merged commit c4039ad into main Nov 26, 2024
1 check failed
@aarmey aarmey deleted the op_for_4d branch November 26, 2024 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants