[BUG] BlockBlobClient.StageBlockAsync fails while uploading the 11th block to the blob with "The specified blob or block content is invalid" #47370

Open
v-rohank opened this issue Nov 28, 2024 · 1 comment
Labels
Client, customer-reported, needs-team-attention, question, Service Attention, Storage

v-rohank commented Nov 28, 2024

Library name and version

Azure.Storage.Blobs 12.23.0

Describe the bug

We're encountering an issue with the Post Import Large File API, which is used to upload Power BI files larger than 1 GB into the service (we want to support files up to 10 GB). The process involves uploading the data to a temporary location in Azure Blob Storage. We currently use BlockBlobClient to upload the data to the temporary upload location, calling StageBlockAsync for each block and then CommitBlockListAsync. StageBlockAsync succeeds for the first 10 blocks, but on the 11th iteration it fails with the following error:

Azure.RequestFailedException: 'The specified blob or block content is invalid.

RequestId:b39a4d37-301e-003e-5882-40eac8000000

Time:2024-11-27T04:10:41.3824027Z

Status: 400 (The specified blob or block content is invalid.)

ErrorCode: InvalidBlobOrBlock

Expected behavior

Every StageBlockAsync call should succeed (the underlying Put Block operation returns 201 Created on success).

Actual behavior

StageBlockAsync succeeds 10 times, then fails on the 11th iteration with the following error: The specified blob or block content is invalid

Reproduction Steps

  1. Create a Power BI file larger than 1 GB.

  2. Create a temporary upload location with the Create Temporary Upload Location API (it returns a SAS URL), upload the data to that location using BlockBlobClient, and call the Post Import API to import the file into the service.
    See the code below:

             PowerBIClient client = new PowerBIClient(credentials);
             var importInfo = new ImportInfo();
    
             var temporaryLocation = client.Imports.CreateTemporaryUploadLocation();
    
    
             var blobclient = new BlockBlobClient(new Uri(temporaryLocation.Value.Url));
    
    
             string datasetDisplayName = "datasetDisplayName";
             string directoryPath = @"directory path";
             string fileName = "file name";
             // Combine directory path and file name
             string filePath = Path.Combine(directoryPath, fileName);
             var inputStream = new FileStream(filePath, FileMode.Open, FileAccess.Read);
    
             // Uploading blocks
             List<string> blockIds = new List<string>();
             // We also tried block sizes of 5 MB and 100 MB and hit the same failure on the 11th iteration.
             int blockSize = 4 * 1024 * 1024;
    
             byte[] buffer = new byte[blockSize];
             int bytesRead;
             int blockIndex = 0;
    
    
             while ((bytesRead = await inputStream.ReadAsync(buffer, 0, blockSize)) > 0)
             {
                 // Generate a unique, base64-encoded block ID
                 string blockId = Convert.ToBase64String(Encoding.UTF8.GetBytes($"block-{blockIndex}"));
                 blockIds.Add(blockId);
    
                 // Upload the block
                 using (MemoryStream blockData = new MemoryStream(buffer, 0, bytesRead))
                 {
                     try
                     {
                         // This call fails on the 11th iteration (blockIndex == 10).
                         await blobclient.StageBlockAsync(blockId, blockData);
                     }
                     catch (RequestFailedException ex)
                     {
                         throw;
                     }
    
                     Console.WriteLine($"Uploaded block {blockIndex + 1}");
                     blockIndex++;
                 }
             }
             Console.WriteLine("Committing block list...");
             // Commit the block list
             await blobclient.CommitBlockListAsync(blockIds);
    
             Console.WriteLine("Blob upload completed.");
    
    
             if (Path.GetExtension(datasetDisplayName) == string.Empty)
             {
                 datasetDisplayName = Path.GetFileNameWithoutExtension(datasetDisplayName) + ".pbix";
             }
    
             importInfo.FileUrl = temporaryLocation.Value.Url;
    
             var import = client.Imports.PostImport(datasetDisplayName, importInfo);
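
A likely cause, not confirmed by the maintainers in this thread: Blob Storage requires every block ID within a blob to have the same length before Base64 encoding. In the code above, `block-0` through `block-9` are 7 bytes long, while `block-10` is 8 bytes, which lines up with the failure on the 11th block regardless of block size. The sketch below zero-pads the index so all IDs have equal length. `BlockIdHelper` and `MakeBlockId` are hypothetical names introduced for illustration and are not part of Azure.Storage.Blobs.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the Azure SDK).
public static class BlockIdHelper
{
    // Zero-pads the index to six digits so that every block ID has the
    // same length before Base64 encoding ("block-000009", "block-000010").
    // Six digits is an assumption; it covers up to 999,999 blocks, well
    // above the block count needed for a 10 GB file with 4 MiB blocks.
    public static string MakeBlockId(int blockIndex)
    {
        string rawId = $"block-{blockIndex:D6}";
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(rawId));
    }
}
```

In the upload loop, the block ID line would then read `string blockId = BlockIdHelper.MakeBlockId(blockIndex);`. Blocks already staged under the shorter IDs may still be attached to the blob as uncommitted blocks, so retrying against a fresh temporary upload location is the safer test.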
    

Environment

No response

github-actions bot added the Client, customer-reported, needs-team-attention, question, Service Attention, and Storage labels on Nov 28, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.
