Right now, if we have a table in Parquet or Delta and want to transform it to Qbeast, we have two options:
A full rewrite of the table (done through the Spark API)
The ConvertToQbeast command
The first option can take a lot of time and resources, while the second is a simpler approach. It works as follows:
The user calls ConvertToQbeast with the columnsToIndex and cubeSize parameters.
If the format of the target table is Parquet: convert it to Delta first.
If the format of the target table is Delta: go to the next step.
Add the required information to the commit log, using the metadata specified in the first step.
The command does not rewrite any existing data by itself; it only indexes newly appended data.
All the files that previously existed in the log are treated as the staging area, and all of them (logically) belong to the root.
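The flow above can be sketched with a toy in-memory model. This is only an illustration of the steps as described here, not the actual Qbeast implementation: `TableLog`, its fields, and the `convert_to_qbeast` function are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TableLog:
    """Toy stand-in for a table's commit log (hypothetical, not the real log format)."""
    fmt: str                                      # "parquet" or "delta"
    files: list                                   # data files already in the table
    metadata: dict = field(default_factory=dict)  # table-level metadata
    staging: set = field(default_factory=set)     # files not yet indexed


def convert_to_qbeast(log, columns_to_index, cube_size):
    # A Parquet table is converted to Delta first; a Delta table goes straight on.
    if log.fmt == "parquet":
        log.fmt = "delta"
    elif log.fmt != "delta":
        raise ValueError(f"unsupported format: {log.fmt}")
    # Record the indexing metadata in the log. No data file is rewritten.
    log.metadata["qbeast"] = {
        "columnsToIndex": list(columns_to_index),
        "cubeSize": cube_size,
    }
    # Every pre-existing file becomes part of the staging area (logically the root).
    log.staging = set(log.files)
    return log


log = convert_to_qbeast(
    TableLog("parquet", ["part-0.parquet", "part-1.parquet"]),
    columns_to_index=["x", "y"],
    cube_size=5000,
)
```

After the call, the table is in Delta format, the metadata is recorded, and both original files sit in the staging set untouched.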
The idea is to find a way of rewriting a batch of those files without having to move part of the code out into a separate application.
It could be an addition to the ConvertToQbeast API, a new command called RewriteQbeastFiles, or even a parameter in the Optimize command. What do you think? @cugni @alexeiakimov @Jiaweihu08
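To make the discussion more concrete, here is a rough sketch of what rewriting one batch of staging files could look like, whichever entry point ends up exposing it. Everything here is an assumption for illustration: `rewrite_staging_batch` and its `batch_size` parameter are hypothetical, and the actual data rewrite is reduced to moving a path from one set to another.

```python
def rewrite_staging_batch(staging, indexed, batch_size):
    """Move up to `batch_size` files from the staging set into the indexed set.

    Hypothetical helper: a real command would rewrite the data of each file
    into indexed files and commit the change to the log.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Pick a deterministic batch so repeated runs are easy to reason about.
    batch = sorted(staging)[:batch_size]
    for path in batch:
        staging.discard(path)
        indexed.add(path)  # placeholder for the actual rewrite
    return batch


staging = {"a.parquet", "b.parquet", "c.parquet"}
indexed = set()

first = rewrite_staging_batch(staging, indexed, batch_size=2)
second = rewrite_staging_batch(staging, indexed, batch_size=2)
```

Calling it repeatedly drains the staging area a batch at a time, so the cost of a full rewrite can be spread over several runs.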