feat: adds Globus GCS-sourced assets as a datasource #675

Draft: wants to merge 1 commit into master


@jbottigliero jbottigliero commented Dec 2, 2024

This pull request adds integration with Globus. Specifically, it adds the ability to source data via HTTPS from a Globus Connect Server instance, authenticated using Globus Auth.

  • The credential_provider is based on the #src/util/google_oauth2.ts implementation and other OAuth-like credential providers (e.g. middleauth and ngauth).

The Globus credential provider uses PKCE for the authorization flow, and I've added a few basic utilities to support it (see footnote 1).
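For readers unfamiliar with PKCE, the core of such utilities is small: generate a random code verifier, then derive the code challenge as the base64url-encoded SHA-256 hash of that verifier (the `S256` method from RFC 7636). The following is a minimal sketch of that idea, not the code in this PR. It assumes a Node.js runtime and uses `node:crypto` so it can run standalone; the function names are hypothetical, and a browser implementation would more likely use the Web Crypto API.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Base64url encoding without padding, as required by RFC 7636.
function base64UrlEncode(bytes: Buffer): string {
  return bytes
    .toString("base64")
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Hypothetical helper: 32 random bytes encode to a 43-character verifier,
// the minimum length RFC 7636 allows.
function generateCodeVerifier(byteLength = 32): string {
  return base64UrlEncode(randomBytes(byteLength));
}

// Hypothetical helper: S256 challenge = BASE64URL(SHA256(ASCII(verifier))).
function codeChallengeS256(verifier: string): string {
  return base64UrlEncode(createHash("sha256").update(verifier, "ascii").digest());
}

const verifier = generateCodeVerifier();
const challenge = codeChallengeS256(verifier);
console.log({ verifier, challenge });
```

The verifier stays with the client and is sent only when exchanging the authorization code for a token; the challenge is what goes into the initial authorization request.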

@ravescovi has successfully deployed an instance of Neuroglancer configured with Globus using the proposed changes – much of this implementation is based on his initial work integrating with the Neuroglancer codebase.

We're looking forward to discussing the implementation and seeing what needs to be addressed to get this into the mainline!

Footnotes

  1. These utilities could be replaced with an external library (e.g. pkce-challenge) if that is preferred. It might be worth noting that many of the added methods were pulled directly from the Globus SDK for JavaScript.

* own Client ID from Globus and substitute it in.
* @see https://docs.globus.org/api/auth/developer-guide/#developing-apps
*/
GLOBUS_CLIENT_ID: JSON.stringify("f3c5dd86-8c8e-4393-8f46-3bfa32bfcd73"),
Author

This default value is a Client ID that is managed by the Globus team.

It is something that could be distributed as part of the codebase or removed as a default and only referenced here as a comment.

Contributor

👍 from my side for at least not needing a fork to set a different value.

Author

@joshmoore – that was my intent here, but I am a bit unsure this method works as intended...

I thought something like npm run build -- --define GLOBUS_CLIENT_ID=example was the intended use, but it looks like the NEUROGLANCER_CLI environment flag needs to be disabled in order for the incoming --define properties to be merged.

Running npm run build -- --env NEUROGLANCER_CLI=false ... encounters an error due to the .strict() usage in build_tools/cli.ts – I don't want to derail the addition of this functionality, but just wanted to make sure I had a better grasp on the change and how it is used.
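As background on the mechanism under discussion: define-style substitution in bundlers such as webpack's DefinePlugin or esbuild replaces an identifier with a JavaScript expression at build time, which is why the default above is wrapped in `JSON.stringify`. The sketch below illustrates how `KEY=VALUE` overrides could be merged over defaults. It is not the logic in `build_tools/cli.ts`; the function names are hypothetical, and it assumes each override value is a plain string that should be quoted.

```typescript
// Define values are source-code fragments, so string constants must be quoted.
type Defines = Record<string, string>;

const DEFAULT_DEFINES: Defines = {
  GLOBUS_CLIENT_ID: JSON.stringify("f3c5dd86-8c8e-4393-8f46-3bfa32bfcd73"),
};

// Hypothetical parser for arguments of the form KEY=VALUE.
// Assumption: VALUE is treated as a plain string and JSON-quoted.
function parseDefineArgs(args: string[]): Defines {
  const result: Defines = {};
  for (const arg of args) {
    const eq = arg.indexOf("=");
    if (eq <= 0) {
      throw new Error(`Expected KEY=VALUE, got: ${arg}`);
    }
    result[arg.slice(0, eq)] = JSON.stringify(arg.slice(eq + 1));
  }
  return result;
}

// Overrides win over defaults; keys without an override keep their default.
function mergeDefines(defaults: Defines, overrides: Defines): Defines {
  return { ...defaults, ...overrides };
}

const merged = mergeDefines(
  DEFAULT_DEFINES,
  parseDefineArgs(["GLOBUS_CLIENT_ID=example"]),
);
console.log(merged);
```

With a merge like this, a deployment could substitute its own Client ID without forking, provided the build actually forwards the overrides.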

Collaborator

jbms commented Jan 31, 2025

Thanks, and sorry for the delay in responding. Part of the delay is that I had been working on a major refactor of file access in Neuroglancer.

Now that it has landed, it should make it easier to integrate additional data sources like this, but some refactoring of your PR will be needed.

In the README you mention that the user needs to enter a UUID --- can you describe a bit more about how Globus works and how this UUID is handled?

If the UUID is necessary to access the server, shouldn't it be included in the datasource URL somehow so that when sharing a link to the Neuroglancer state, another user doesn't have to enter it?

Collaborator

jbms commented Jan 31, 2025

Also, can you clarify how the client ids work and how the default client id you have provided will work?

As far as I can gather, users host their own Globus server, but they rely on the central Globus authentication server?

What are the allowed origins for the default client id? Is the list of allowed origins global or specific to a given Globus server instance? Similarly, is the authentication token that is received valid for all Globus servers or just a single instance (based on the way the scopes work, I guess the answer is just a single server)? Or is that irrelevant because users are local to a single server instance anyway?

The scope doesn't seem to say anything about what permissions are granted? Does that mean that all permissions (i.e. read and write) are granted?
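To make the scope question concrete, here is a rough sketch of how an authorization request tied to a single collection might be built. Everything in it is an assumption rather than a description of this PR: the authorize endpoint, the `https://auth.globus.org/scopes/{collection UUID}/https` scope format, the placeholder UUID, the redirect URI, and the function names are illustrative and should be checked against the Globus Auth documentation.

```typescript
// Assumed Globus Auth authorization endpoint (verify against Globus docs).
const GLOBUS_AUTHORIZE_URL = "https://auth.globus.org/v2/oauth2/authorize";

// Assumed scope format for HTTPS access to one collection, keyed by its UUID.
function httpsScopeForCollection(collectionId: string): string {
  return `https://auth.globus.org/scopes/${collectionId}/https`;
}

interface AuthorizeParams {
  clientId: string;
  redirectUri: string;
  collectionId: string;
  codeChallenge: string; // PKCE S256 challenge
  state: string; // opaque value echoed back to guard against CSRF
}

// Builds a standard OAuth 2.0 authorization-code request with PKCE.
function buildAuthorizeUrl(p: AuthorizeParams): string {
  const url = new URL(GLOBUS_AUTHORIZE_URL);
  url.search = new URLSearchParams({
    client_id: p.clientId,
    redirect_uri: p.redirectUri,
    response_type: "code",
    scope: httpsScopeForCollection(p.collectionId),
    state: p.state,
    code_challenge: p.codeChallenge,
    code_challenge_method: "S256",
  }).toString();
  return url.toString();
}

console.log(
  buildAuthorizeUrl({
    clientId: "f3c5dd86-8c8e-4393-8f46-3bfa32bfcd73",
    redirectUri: "https://example.org/callback", // placeholder
    collectionId: "00000000-0000-0000-0000-000000000000", // placeholder UUID
    codeChallenge: "placeholder-challenge",
    state: "placeholder-state",
  }),
);
```

If the scope really is keyed by collection UUID like this, a token would be limited to one collection, but the scope string itself would not distinguish read from write access.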

A general issue that comes up is that a user may wish to view certain datasets in Neuroglancer, and therefore must necessarily grant a given Neuroglancer instance read access to that particular dataset. However, it is often not possible to limit permissions narrowly like that, and instead the user is forced to either grant no permission, or grant very broad permissions, e.g. read access to everything. This may not be an issue if Neuroglancer can be trusted with full access, e.g. because the person administering the Neuroglancer instance is also administering the Globus server, but often it is an issue.

This issue is exactly why Neuroglancer does not provide the option to access GCS via regular google oauth2 login, because it would require users to grant Neuroglancer full read access to all of their GCS resources, which would not normally be a good idea unless they have created a separate Google account with access limited to those datasets they wish to view in Neuroglancer. Instead, there is ngauth, which allows you to grant Neuroglancer access only to specific datasets.

Is it possible with Globus for a user to somehow say: I want to grant this specific Neuroglancer instance access to just these specific datasets? Furthermore, it would be nice if user A could grant that permission and then share a Neuroglancer link with user B, who also has access to that dataset. Since user A has already granted Neuroglancer access to the dataset, user B could then log in and access it automatically, without needing to grant any additional permissions.
