[PULP-214] Added docs about on-demand content streaming caveats #6101
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Added docs about on-demand content limitations and caveats. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Added docs about on-demand content limitations and caveats. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,14 @@ | ||
# On-Demand Downloading | ||
# On-Demand Download/Sync | ||
|
||
## Overview | ||
|
||
Pulp can sync content in a few modes: 'immediate', 'on_demand', and 'streamed'. Each provides a | ||
Pulp can sync content in a few modes: `immediate`, `on_demand`, and `streamed`. Each provides a | ||
different behavior on how and when Pulp acquires content. These are set as the `policy` attribute | ||
of the `Remote` performing the sync. Policy is an optional parameter and defaults to | ||
`immediate`. | ||
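As a minimal sketch of how the `policy` attribute could be set, the snippet below builds the body for a plugin's "Create Remote" request. The helper `build_remote_payload`, the `name` and `url` fields, and the example values are assumptions made for illustration; check the plugin's REST API documentation for the actual fields.

```python
import json

# The three sync modes named in this document.
SYNC_POLICIES = ("immediate", "on_demand", "streamed")


def build_remote_payload(name, url, policy="immediate"):
    """Build a request body for creating a Remote (hypothetical helper).

    `policy` is optional and falls back to `immediate`, mirroring the
    default described above.
    """
    if policy not in SYNC_POLICIES:
        raise ValueError(f"unknown policy {policy!r}; expected one of {SYNC_POLICIES}")
    return {"name": name, "url": url, "policy": policy}


# Example values are placeholders, not a real repository.
payload = build_remote_payload("nightly", "https://example.com/repo/", policy="on_demand")
print(json.dumps(payload, indent=2))
```

Validating the value on the client side only catches typos; whether a plugin accepts `on_demand` or `streamed` is still decided by the server.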
|
||
## Sync Modes | ||
|
||
### immediate | ||
|
||
When performing the sync, download all `Artifacts` now. Also download all metadata | ||
|
@@ -39,25 +41,83 @@ instance, syncing from a nightly repo would cause Pulp to store every nightly ev | |
is likely not valuable. Units created from this mode are | ||
`on-demand content units<on-demand content>`. | ||
|
||
## Does Plugin X Support 'on_demand' or 'streamed'? | ||
## Plugin support for on-demand/streamed | ||
|
||
If a plugin has not enabled the `on_demand` or `streamed` values for the `policy` attribute, | ||
setting either one returns an error. Check the plugin's documentation to confirm support. | ||
|
||
Examples of the "Create Remote" endpoints for some plugins that support these features: | ||
|
||
* [pulp-rpm](https://pulpproject.org/pulp_rpm/restapi/#tag/Remotes:-Rpm/operation/remotes_rpm_rpm_create) | ||
* [pulp-container](https://pulpproject.org/pulp_container/restapi/#tag/Remotes:-Container/operation/remotes_container_container_create) | ||
|
||
!!! note | ||
Want to add on-demand support to your plugin? See the | ||
[On-Demand Support](site:pulpcore/docs/dev/learn/other/on-demand-support/) | ||
documentation for more details on how to add on-demand support to a plugin. | ||
|
||
|
||
## Associating On-Demand Content with Additional Repository Versions | ||
## On-Demand Content and Repository Versions | ||
|
||
An `on-demand content unit` can be added to or removed from a `repository version` just like a normal unit. Note that when a client requests the content, Pulp downloads it through the original `Remote`, even if that content is | ||
made available in multiple places. | ||
|
||
!!! warning | ||
Deleting a `Remote` that was used in a sync with either the `on_demand` or `streamed` | ||
options can break published data. Specifically, clients who want to fetch content that a | ||
`Remote` was providing access to would begin to 404. Recreating a `Remote` and | ||
re-triggering a sync will cause these broken units to recover again. | ||
!!! warning "Deleting a Remote" | ||
Learn about the dangers of [deleting a Remote](#remote-deletion-and-content-sharing) in the context of on-demand content. | ||
|
||
## On-Demand/Streamed limitations | ||
|
||
On-demand/streamed content can be very useful, but it comes with some caveats. | ||
|
||
### External dependency and error handling | ||
|
||
The content might become unavailable or corrupted on the remote server. | ||
This makes it hard for Pulp to provide an accurate error message. | ||
|
||
Here are some scenarios involving remote failure: | ||
|
||
* Unreachable | ||
    * Given all remote sources for the content are unavailable or corrupted | ||
    * When the user requests that content through a distribution | ||
    * Then Pulp fails to deliver the content (with connection-closed errors* if the content is corrupted) and the content is effectively unreachable | ||
* Reachable after failure(s) | ||
    * Given there is more than one remote for the content and at least one of them is good | ||
    * When the user requests that content through a distribution | ||
    * Then some requests for the content might fail with connection-closed errors*, and future requests will try the next remotes, eventually reaching the good one. | ||
      Even though the content might be reachable, the failures can be confusing. | ||
|
||
I feel like this could be structured better. For instance, if all remote sources are "corrupted" then you'll get connection closed failures, but those aren't mentioned in the first example, only in the second example. I also have a mild preference for normal paragraphs here I think? Like:
Feel free to adjust or rewrite it differently. Making them bullet points is fine too |
||
!!! note "* Why do we close the connection?" | ||
The connection close happens because Pulp streams content directly from the remote. | ||
If the content is bad (and we can only know that after streaming everything) we prefer to close the connection over finalizing a bad response. | ||
|
||
Context: <https://github.com/pulp/pulpcore/issues/5012>. | ||
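The scenarios and the note above can be sketched as a small simulation. This is not Pulp's implementation: `stream_and_verify`, `serve_once`, the `failed` set, and the three result strings are hypothetical names chosen for illustration, and skipping a failed source on every later request is a simplifying assumption.

```python
import hashlib
from typing import Callable, Iterable, Iterator, List, Set, Tuple


class DigestMismatch(Exception):
    """The streamed bytes did not match the expected digest."""


def stream_and_verify(chunks: Iterable[bytes], expected_sha256: str) -> Iterator[bytes]:
    """Yield chunks as they arrive; the digest can only be checked after the last one."""
    hasher = hashlib.sha256()
    for chunk in chunks:
        hasher.update(chunk)
        yield chunk  # by now the chunk has already been handed to the client
    if hasher.hexdigest() != expected_sha256:
        raise DigestMismatch("content does not match the expected sha256")


# A source is a (name, fetch) pair; fetch() returns the chunks served by that remote.
Source = Tuple[str, Callable[[], Iterable[bytes]]]


def serve_once(sources: List[Source], expected_sha256: str, failed: Set[str]) -> str:
    """Handle one client request using the first source not yet marked as failed."""
    for name, fetch in sources:
        if name in failed:
            continue
        try:
            for _chunk in stream_and_verify(fetch(), expected_sha256):
                pass  # a real server would write the chunk to the client here
        except DigestMismatch:
            failed.add(name)  # later requests move on to the next source
            return "connection closed"
        return "ok"
    return "unreachable"


expected = hashlib.sha256(b"hello world").hexdigest()
sources: List[Source] = [
    ("corrupted", lambda: [b"hello ", b"w0rld"]),
    ("good", lambda: [b"hello ", b"world"]),
]
failed: Set[str] = set()
print(serve_once(sources, expected, failed))  # first request hits the corrupted source
print(serve_once(sources, expected, failed))  # next request reaches the good source
```

Because the chunks are forwarded before the digest is known, the only way to signal a bad download in this model is to end the response abnormally, which is the trade-off the note describes.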
|
||
### Remote deletion and content sharing | ||
|
||
Deleting a `Remote` that was used in a sync with either the `on_demand` or `streamed` | ||
options can break published data. | ||
|
||
Specifically, clients who try to fetch content that a `Remote` was providing access to would start receiving 404 errors. | ||
Recreating a `Remote` and re-triggering a sync will cause these broken units to recover. | ||
|
||
In the worst case, the Content is shared across multiple Repositories, and the Remote's removal | ||
can invalidate all those repositories at once. | ||
|
||
In either case, proceed with the deletion of a remote with great care. | ||
|
||
Context: <https://github.com/pulp/pulpcore/issues/1975>. | ||
|
||
### Implicit credential sharing within a domain | ||
|
||
Within the same domain, a request for on-demand content may use any available Remote associated with that content, | ||
regardless of which user created that Remote. | ||
|
||
An example: | ||
|
||
* Given User A and User B both synced the same on-demand content from their separate remotes (there are two different sources for the same content). | ||
* When User B requests the content | ||
* Then the credentials used for the download could potentially be User A's. | ||
|
||
If a user doesn't want their registered Remotes to be indirectly used by other users, they should use a separate domain. | ||
|
||
Context: <https://github.com/pulp/pulpcore/issues/3212>. |
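The example above can be modeled with a toy lookup. The `Remote` dataclass, its fields, and `candidate_remotes` are hypothetical and only illustrate the rule as stated here: sources are selected by content and domain, never by the requesting user.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Remote:
    name: str
    owner: str
    domain: str
    credentials: str


def candidate_remotes(
    content_sources: Dict[str, List[Remote]], content_id: str, domain: str
) -> List[Remote]:
    """Return every remote in `domain` that can provide `content_id`.

    The requesting user is deliberately not a parameter: ownership plays
    no role in the selection.
    """
    return [r for r in content_sources.get(content_id, []) if r.domain == domain]


remote_a = Remote("remote-a", owner="user-a", domain="default", credentials="token-a")
remote_b = Remote("remote-b", owner="user-b", domain="default", credentials="token-b")
shared = {"pkg-1.0": [remote_a, remote_b]}

# User B requests the content, yet User A's remote is among the candidates.
print([r.owner for r in candidate_remotes(shared, "pkg-1.0", "default")])

# If User A's remote lives in a separate domain, it is no longer eligible.
isolated_a = Remote("remote-a", owner="user-a", domain="team-a", credentials="token-a")
isolated = {"pkg-1.0": [isolated_a, remote_b]}
print([r.owner for r in candidate_remotes(isolated, "pkg-1.0", "default")])
```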
"Also download all metadata is likely not valuable" is nearly incomprehensible
"Also download all metadata" and "is likely not valuable" are 17 lines apart 😅