Issue summary

Job chaining is not something that can be easily achieved by novice users. Job chaining, in its most basic form, is simply the ability to start one job in response to a change in state of a previous one. Think of it like catch and release. With respect to the Science APIs, the "release" of the job should be flexible and expanded to allow it to occur in response to a predefined event or API call.

What platform services are impacted?

To which tenant does this issue apply?

Community - https://api.agaveplatform.org

What version of the platform are you using?

2.2+

Steps to recreate the issue

N/A

Expected behavior
Job chaining could be accomplished in several different ways. Some basic approaches would be:
Add a dependencies array to the job request
This would allow the greatest degree of freedom, as one could potentially achieve all of the other approaches within this one: standard temporal, event, recurrence, etc. conditions could all be expressed within one or more dependency objects. Implementation time and complexity would likely be significantly greater than for the other approaches, and the distinction between this feature and an independent workflow service would be less clear.
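Since this is only a proposal, the exact shape is undecided. A minimal sketch of what a job request carrying a hypothetical dependencies array might look like (the dependencies field and its contents are illustrative, not an existing part of the jobs API):

```bash
# Hypothetical sketch only -- the "dependencies" array below is the proposed
# feature, not an existing field; field names are illustrative.
curl -sk -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     -X POST "https://api.agaveplatform.org/jobs/v2" \
     -d '{
       "name": "step-two",
       "appId": "my-analysis-app-1.0.0",
       "dependencies": [
         { "jobId": "1234567890123456789-242ac113-0001-007", "event": "FINISHED" }
       ]
     }'
```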
Add a hold parameter to the job request
This approach would add a single boolean hold field to the job request indicating that the job should be held indefinitely until it is manually released. The mechanism by which the job would be released is up for debate. Some ideas include:
Generate a release token and add it to the job resource. It might be useful to generate public and private tokens, though a postit could be created to enable public releases.
Add support for a "release" action to the existing actions available via a PUT request. This fits well semantically, but it adds complexity for consumers and does not integrate well with webhooks coming from third parties or the notifications API.
Add a jobs/v2/:id/release subresource. This is probably the easiest approach, but it breaks semantics and would otherwise indicate that multiple releases are possible. If so, the job would be more of a job template and would bring its own set of requirements and implications for provenance, sharing, and history.
Repost the job description. Resubmitting the job request without the hold: true value would allow users to optionally update their job request prior to submission. This could be valuable in the event that job details change based on the output of the previous job. Of course, if users are able to repost a job request, why do they need to put the job on hold in the first place?
Regardless of how this is implemented, it would also require a new job status, ON_HOLD, to be introduced.
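As a rough, hypothetical illustration of the hold-and-release flow (neither the hold field, the release action, nor the ON_HOLD status exists today):

```bash
# Hypothetical sketch only -- "hold", the "release" action, and ON_HOLD are
# proposed features, not existing behavior.

# Submit a job that would enter the proposed ON_HOLD status instead of running:
curl -sk -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     -X POST "https://api.agaveplatform.org/jobs/v2" \
     -d '{ "name": "step-two", "appId": "my-analysis-app-1.0.0", "hold": true }'

# ...later, release the held job via the proposed action on the PUT endpoint:
curl -sk -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     -X PUT "https://api.agaveplatform.org/jobs/v2/$JOB_ID" \
     -d '{ "action": "release" }'
```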
Add custom notifications
This approach would extend the notification object with additional fields to specify a notification type, a custom message, etc. This falls in line with the features provided by the templates API and would give us the context to allow for greater configuration of notification targets. Things like authentication, custom headers, request type, body fields, etc. could all be configurable.
This gives a significant return on the investment, but does not provide a simple, direct way to create one job from another. This may be a strong use case for advanced, event-driven orchestration.
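A hypothetical example of such an extended notification being used to chain jobs; the method, headers, and body fields are the proposed additions, not part of the current notification object:

```bash
# Hypothetical sketch only -- "method", "headers", and "body" are proposed
# extensions to the notification object used here to submit a follow-on job.
curl -sk -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     -X POST "https://api.agaveplatform.org/jobs/v2" \
     -d '{
       "name": "step-one",
       "appId": "my-analysis-app-1.0.0",
       "notifications": [
         {
           "event": "FINISHED",
           "url": "https://api.agaveplatform.org/jobs/v2",
           "method": "POST",
           "headers": { "Content-Type": "application/json" },
           "body": { "name": "step-two", "appId": "my-analysis-app-1.0.0" }
         }
       ]
     }'
```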
Add workflow API
This is a related, but separate, feature. Workflows involve complex logic and an understanding of the work being done. Managing workflows over time requires ongoing state persistence, management, and retry logic, as well as a separate DSL. There are many good workflow solutions available today, which should be evaluated and integrated rather than rolling our own.
Actual behavior
Not easily supported in the current implementation.
Some other things that might be more general solutions are:
Relays API
The primary purpose of the Relays API is to provide controlled, delayed delivery of authenticated platform requests. Individual relays will store the destination, headers, query parameters, and body content of the original request. A relay can then be released at a later time by making an authenticated POST or GET request. Once released, the relay will forward the original request, unchanged, to the original destination.
Relays cannot be updated.
Relays can, optionally, be used more than once.
Unused relays expire after 30 days.
Relays can be made public as unauthenticated URLs in exactly the same way files are.
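An entirely hypothetical sketch, since the Relays API does not exist; the endpoint and creation mechanism shown here are just one possible shape:

```bash
# Entirely hypothetical -- the Relays API does not exist yet. This shows one
# possible way to capture a job submission as a relay and release it later.

# Capture a job submission request as a relay instead of sending it now:
curl -sk -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     -X POST "https://api.agaveplatform.org/relays/v2?destination=/jobs/v2" \
     -d '{ "name": "step-two", "appId": "my-analysis-app-1.0.0" }'

# ...later, release the stored request so it is forwarded, unchanged, to /jobs/v2:
curl -sk -H "Authorization: Bearer $TOKEN" \
     -X POST "https://api.agaveplatform.org/relays/v2/$RELAY_ID/release"
```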
Extending the PostIts API
The PostIts API could be extended to allow storage of request data along with the destination, method, etc., thereby providing most of the functionality of the Relays API. The upside is that there would be significantly less onboarding than with a new API. The downside is that PostIts are public URLs by default, so there would not be any way to restrict their invocation without introducing optional authentication on individual tokens.
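For example, an extended PostIt request might store the job submission body alongside the target URL and method; the body field is the hypothetical extension, while the other fields mirror the existing PostIts request:

```bash
# Hypothetical sketch -- "body" is the proposed extension; url, method, and
# maxUses mirror the existing PostIts request fields.
curl -sk -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     -X POST "https://api.agaveplatform.org/postits/v2" \
     -d '{
       "url": "https://api.agaveplatform.org/jobs/v2",
       "method": "POST",
       "maxUses": 1,
       "body": { "name": "step-two", "appId": "my-analysis-app-1.0.0" }
     }'
```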
Using PostIts and Metadata as-is
For situations where your job is deterministic, you could use the postits and metadata services to handle the job chaining. Store your job request as a metadata object, generate a postit to that metadata object, and pass it in to your job as a parameter named metadata_postit.
Generate another postit (with a POST method) to the jobs API and pass that in to your job as a second parameter named submission_postit. In your wrapper template, you can then submit your saved job with a single line of bash along the lines of the sketch below.
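A minimal sketch of that one-liner, assuming jq is available on the execution system and that the saved job request needs to be unwrapped from the standard response envelope returned through the metadata postit:

```bash
# Fetch the saved job request through the metadata postit, pull the stored job
# definition out of the response envelope, and POST it to the jobs API through
# the submission postit. ${metadata_postit} and ${submission_postit} are the
# app parameters injected into the wrapper template; jq is assumed to be
# available on the execution host.
curl -sk "${metadata_postit}" | jq '.result.value' | curl -sk -X POST -H "Content-Type: application/json" -d @- "${submission_postit}"
```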