[QUERY] Storage blob requests are queued? #48140
Comments
Hi @vRune4. Thanks for reaching out and we regret that you're experiencing difficulties. The behavior that you're seeing is unrelated to the Blob Storage package. You're seeing this because you're attempting to return a deferred enumerable. To return the full set of data, you'd have to first materialize the set, such as by creating a list and then populating it by walking the enumerable. For example:

[HttpGet]
public async Task<ActionResult<IEnumerable<ASBFileInfo>>> GetAsync()
{
    // Materialize the async enumerable into a list before returning it.
    var returnList = new List<ASBFileInfo>();
    await foreach (var info in ReadMetaInfoAsync())
    {
        returnList.Add(info);
    }
    return Ok(returnList);
}
Hi @vRune4. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text "/unresolve" to remove the "issue-addressed" label and continue the conversation. |
/unresolve
Look at the screenshot I posted. Each request to the storage account starts reasonably soon, then lingers for a long time (> 100 ms). If I instead make sure I fire off my requests sequentially (my first example, which uses the same async enumerable return type), they finish in less than ten ms. With the code you posted:
StorageBlobLogs
| project TimeGenerated, AuthenticationType, ServerLatencyMs, DurationMs, ResponseBodySize
shows that from the storage account's point of view, these requests were typically handled in <20 ms. So the client adds hundreds of milliseconds of delay for no discernible reason. Maybe it is just really wonky telemetry. That still does not explain why calling GetProperties() 100 times sequentially finishes in a second, while trying to run in parallel takes closer to 4 seconds. (Oh, pulling back to five -- just five! -- blobs shows similar delays.)
Your parallel results all share a connection and have to compete for both network and other resources on your local host. If you're seeing worse performance in parallel, it indicates that you're using more concurrency than is helpful for your host. Because the telemetry that you're looking at is emitted by the client - you'll see the client's view of the world. If a task is in a paused state waiting on I/O and the thread pool takes time to schedule the async completion because there are no resources, that will still count towards the time the client sees for the request. In other words, that's your end-to-end execution time on the client and does not accurately reflect the duration of the actual service call. This is not related to the Azure SDK package in use - it does not internally do any queueing of requests. Each operation you call is immediately executed. The client has no insight nor influence over how resources are shared/allocated - it just requests them based on the patterns in your code. |
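One way to experiment with the "more concurrency than is helpful" point is to cap the number of in-flight calls and time each one on the client. The sketch below is only an illustration under stated assumptions: `FakeCallAsync` and `RunBoundedAsync` are hypothetical names, `Task.Delay` stands in for the service call, and nothing here is part of the Azure SDK or a confirmed explanation of the timings reported in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a lightweight service call such as GetPropertiesAsync().
// Task.Delay only simulates latency; it is not the SDK.
static async Task<int> FakeCallAsync(int id)
{
    await Task.Delay(10);
    return id;
}

// Runs the operation for every input with at most maxConcurrency calls in flight
// and records the client-side elapsed time of each call.
static async Task<(TResult Result, TimeSpan Elapsed)[]> RunBoundedAsync<TInput, TResult>(
    IEnumerable<TInput> inputs, int maxConcurrency, Func<TInput, Task<TResult>> operation)
{
    using var gate = new SemaphoreSlim(maxConcurrency);
    var tasks = inputs.Select(async input =>
    {
        await gate.WaitAsync();
        try
        {
            // The stopwatch starts after the gate is acquired, so time spent
            // waiting for a free slot is not counted in the measurement.
            var sw = Stopwatch.StartNew();
            var result = await operation(input);
            return (result, sw.Elapsed);
        }
        finally
        {
            gate.Release();
        }
    }).ToList();
    return await Task.WhenAll(tasks);
}

var results = await RunBoundedAsync(Enumerable.Range(0, 20), maxConcurrency: 4, FakeCallAsync);
Console.WriteLine(
    $"completed {results.Length} calls, slowest took {results.Max(r => r.Elapsed.TotalMilliseconds):F1} ms");
```

Varying `maxConcurrency` (for example 1, 4, 20) and comparing the per-call times gives a rough picture of where extra concurrency stops helping on a given host. The measured time is still the client's view and includes any delay before the continuation is scheduled.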
Hi @vRune4. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text "/unresolve" to remove the "issue-addressed" label and continue the conversation. |
I did another run. This time I just feed the task results into a List... The storage account log looks like this:
The operations in question are extremely lightweight. The payload fits within a network packet or two. Look again. Test this yourself. I can reduce the test to five blobs. That means five concurrent tasks. It shows the same pattern.
FIVE concurrent (trying to...) requests (rather than 100):
It takes almost as much time as firing off 100 (one hundred!) sequential requests: An ocean of difference. It isn't in my code. It isn't in the storage account. |
/unresolve |
@vRune4 : I'll pass this over to the owners of the package for further investigation, as you'd like. |
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage. |
We concur with everything @jsquire said. The clients in the Storage package function exactly the same as the clients in the other SDKs in this repo.
/unresolve
So no clients can handle concurrent requests to the storage account? Its performance is abysmal, unless my code carefully sequences all requests. @seanmcc-msft you really ought to read my comments. Just to be clear: "deferred enumerable" is irrelevant. I can put everything into a List instead and not defer anything. It does not change what I observe. |
Library name and version
Azure.Storage.Blobs 12.23.0
Query/Question
If I load meta information sequentially like this:
(BlobIds is a list of about a hundred blobs)
Application Insights shows this:
Each request takes <10ms which I consider to be quite good, but a colleague challenged me to do this in parallel.
After this change I get a worse result. At best it takes as long as the original approach, but it can easily take twice as long. Each individual request seemingly takes more than 100 ms:
It looks really strange to me. It does not look like each request is queued up internally, but I suspect that could be an artifact from the telemetry.
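As a rough illustration of the two patterns being compared (this is not the code from this issue), the sketch below contrasts a sequential loop with starting every call up front and awaiting them together. `FetchInfoAsync`, `LoadSequentiallyAsync`, and `LoadInParallelAsync` are hypothetical names, and `Task.Delay` is an assumed stand-in for the per-blob call, so the timings it prints say nothing about the storage client itself.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a per-blob metadata call; Task.Delay only simulates latency.
static async Task<string> FetchInfoAsync(string blobId)
{
    await Task.Delay(5);
    return $"info:{blobId}";
}

// Sequential: each call is awaited before the next one starts.
static async Task<List<string>> LoadSequentiallyAsync(IEnumerable<string> blobIds)
{
    var infos = new List<string>();
    foreach (var id in blobIds)
    {
        infos.Add(await FetchInfoAsync(id));
    }
    return infos;
}

// Parallel: all calls are started up front, then awaited together.
// Task.WhenAll returns the results in the same order as the input tasks.
static async Task<List<string>> LoadInParallelAsync(IEnumerable<string> blobIds)
{
    var tasks = blobIds.Select(id => FetchInfoAsync(id)).ToList();
    var infos = await Task.WhenAll(tasks);
    return infos.ToList();
}

var ids = Enumerable.Range(0, 100).Select(i => $"blob-{i}").ToList();

var timer = Stopwatch.StartNew();
var sequential = await LoadSequentiallyAsync(ids);
Console.WriteLine($"sequential: {timer.ElapsedMilliseconds} ms");

timer.Restart();
var parallel = await LoadInParallelAsync(ids);
Console.WriteLine($"parallel: {timer.ElapsedMilliseconds} ms");
```

Swapping the simulated call for a real one would make it possible to compare wall-clock totals for both patterns against the same set of blobs.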
The main reason I ask: I see this in our application. Simple requests like GetPropertiesAsync() take hundreds of milliseconds to complete. Several blobs are effectively fetched in parallel and there is a significant delay calling GetPropertiesAsync().
The blob client is created as a singleton:
Environment
Azure Container Apps.