Skip to content

Conversation

julianstirling
Copy link
Contributor

This is very much experimental for comment. Docstrings haven't been updated as we go.

The bug #142 seems to relate to Blobs being cached somewhere and called again by FastAPI without the correct context.

My understanding of how blobs are meant to work

  1. the blob is created and not added to the blob data manager as it may only be used to pass data from one action to another.
  2. If the blob is called once the action invocation finishes, if the blob is passed out of the action then it given a href and added to the blob manager.

Step 2 is currently done on serialisation as a side effect using a context variable to get the server. This means that serialisation of the basemodel only works in certain contexts which upstream libraries will not expect.

This MR experiments with just saving to the blob manager when the invocation action finishes.

@julianstirling julianstirling force-pushed the blob-managed-at-invocation-end branch from b2610c7 to 26c1121 Compare July 16, 2025 23:44
@rwb27
Copy link
Collaborator

rwb27 commented Jul 17, 2025

I think adding the blob to the manager and getting an ID can absolutely be done whenever - holding a bunch of weak references keyed by UUIDs feels like something that's ok in a module level global or singleton, so there wouldn't even be any need to explicitly pass the blob manager.

I think the trickier part is going from ID to URL. We ought to be using fastapi.Request.url_for instead of hard coding URLs. Even worse, we should be using url_for from the current request - not the request when the Blob is created. If the request comes from the same client that started the action, the returned URL should be the same - so it's incorrect but doesn't matter.

90% of the time this will not matter because we can guess the URL correctly. But it does sometimes matter and I was hoping to follow convention. There are a few other places where we should really be using url_for and I was hoping to use a similar trick there.

I can't imagine this is three first project with this issue, but I did a bit of looking and have not yet found a neat solution.

@julianstirling
Copy link
Contributor Author

Well if we know the only way to get the blob is from Invocation.output then we can make the property Invocation.output(request) this way we explicitly require the request a bit like for Invocation.response. The only difference is Invocation.response has the request as optional and creates a relative path if not passed.

@rwb27
Copy link
Collaborator

rwb27 commented Jul 21, 2025

That's quite similar to what's currently done for Thing Descriptions, although it would need some custom recursive code to cope with outputs that contain Blobs within models, lists, or dictionaries, which I was hoping to avoid.

Thinking about this some more, I think all of these quirks could be solved if url_for was available at serialisation time. We could deal with adding Blobs to the blob manager when they're created, and keep the blob manager global - we'd just need url_for to convert the ID into a URL.

I think it would be fine to make url_for available in a context variable in every function that's responding to a request. That could be done using middleware, I am pretty sure, which would then remove the possibility of forgetting to add a dependency to a particular endpoint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants