Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Extend Substrait's literal types to cover unsigned variants
Describe the solution you'd like
The literal types in the following two systems are not aligned:
One gap is that unsigned primitives are not part of Substrait's definition. In substrait-io/substrait#2 (comment) they were cataloged as "third party extension defined types". However, they are widely used in Arrow and DataFusion, so I'd like to discuss how they should be integrated here.
For types we can use a type extension, as I did in greptimedb: I occupy variation "1" of the integer types (I8, I16, I32, I64) and translate it to the corresponding unsigned type. I think we can do it the same way here.
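To make the variation-based mapping concrete, here is a minimal sketch. `IntType`, `SubstraitInt`, and both constants are hypothetical stand-ins for illustration, not the actual DataFusion, greptimedb, or Substrait definitions. The only thing taken from the description above is that variation "1" marks the unsigned counterpart; using 0 for the default (signed) variation is an assumption.

```rust
/// Hypothetical stand-in for an Arrow-style integer data type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IntType {
    Int8, Int16, Int32, Int64,
    UInt8, UInt16, UInt32, UInt64,
}

/// Hypothetical stand-in for the Substrait integer kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SubstraitInt { I8, I16, I32, I64 }

/// Assumed reference for the default (signed) variation.
const DEFAULT_VARIATION: u32 = 0;
/// Variation "1" is occupied to mean "unsigned", as described above.
const UNSIGNED_VARIATION: u32 = 1;

/// Producer side: an unsigned type becomes the signed kind of the same
/// width, tagged with the unsigned variation.
fn to_substrait(ty: IntType) -> (SubstraitInt, u32) {
    match ty {
        IntType::Int8 => (SubstraitInt::I8, DEFAULT_VARIATION),
        IntType::Int16 => (SubstraitInt::I16, DEFAULT_VARIATION),
        IntType::Int32 => (SubstraitInt::I32, DEFAULT_VARIATION),
        IntType::Int64 => (SubstraitInt::I64, DEFAULT_VARIATION),
        IntType::UInt8 => (SubstraitInt::I8, UNSIGNED_VARIATION),
        IntType::UInt16 => (SubstraitInt::I16, UNSIGNED_VARIATION),
        IntType::UInt32 => (SubstraitInt::I32, UNSIGNED_VARIATION),
        IntType::UInt64 => (SubstraitInt::I64, UNSIGNED_VARIATION),
    }
}

/// Consumer side: returns `None` for a variation this sketch does not know.
fn from_substrait(kind: SubstraitInt, variation: u32) -> Option<IntType> {
    match (kind, variation) {
        (SubstraitInt::I8, DEFAULT_VARIATION) => Some(IntType::Int8),
        (SubstraitInt::I16, DEFAULT_VARIATION) => Some(IntType::Int16),
        (SubstraitInt::I32, DEFAULT_VARIATION) => Some(IntType::Int32),
        (SubstraitInt::I64, DEFAULT_VARIATION) => Some(IntType::Int64),
        (SubstraitInt::I8, UNSIGNED_VARIATION) => Some(IntType::UInt8),
        (SubstraitInt::I16, UNSIGNED_VARIATION) => Some(IntType::UInt16),
        (SubstraitInt::I32, UNSIGNED_VARIATION) => Some(IntType::UInt32),
        (SubstraitInt::I64, UNSIGNED_VARIATION) => Some(IntType::UInt64),
        _ => None,
    }
}
```

With this shape, `to_substrait(IntType::UInt32)` yields `(SubstraitInt::I32, 1)`, and `from_substrait` maps that pair back to `IntType::UInt32`. Both sides have to agree on what variation "1" means for this to round-trip.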
For literals we can do it similarly, but one difference is that literals also carry the data values. I plan to transmute between the signed and unsigned values, but I'm not sure this is the best way to achieve it. As far as I can tell, duckdb doesn't cover this part either:
    case substrait::Expression_Literal::LiteralTypeCase::kI8:
        dval = Value::TINYINT(slit.i8());
        break;
    case substrait::Expression_Literal::LiteralTypeCase::kI32:
        dval = Value::INTEGER(slit.i32());
        break;
    case substrait::Expression_Literal::LiteralTypeCase::kI64:
        dval = Value::BIGINT(slit.i64());
        break;
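For discussion, the signed/unsigned transmutation for literals could look roughly like the sketch below. `Literal` and `Scalar` are hypothetical, simplified stand-ins (only the 8-bit and 64-bit cases are shown), not the real Substrait or DataFusion types, and the variation numbers follow the same assumption as for types: 0 for signed, "1" for unsigned. In Rust, an `as` cast between integers of the same width keeps the bit pattern, which is all the "transmute" needs.

```rust
/// Hypothetical, simplified literal: a signed payload plus a variation reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Literal {
    I8 { value: i8, variation: u32 },
    I64 { value: i64, variation: u32 },
}

/// Hypothetical stand-in for the engine-side scalar value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Scalar {
    Int8(i8),
    UInt8(u8),
    Int64(i64),
    UInt64(u64),
}

/// Assumed reference for the default (signed) variation.
const DEFAULT_VARIATION: u32 = 0;
/// Variation "1" marks a payload that must be read as unsigned.
const UNSIGNED_VARIATION: u32 = 1;

/// Producer side: unsigned values are stored in the signed field with
/// their bits unchanged (same-width `as` casts reinterpret, never clamp).
fn encode(s: Scalar) -> Literal {
    match s {
        Scalar::Int8(v) => Literal::I8 { value: v, variation: DEFAULT_VARIATION },
        Scalar::UInt8(v) => Literal::I8 { value: v as i8, variation: UNSIGNED_VARIATION },
        Scalar::Int64(v) => Literal::I64 { value: v, variation: DEFAULT_VARIATION },
        Scalar::UInt64(v) => Literal::I64 { value: v as i64, variation: UNSIGNED_VARIATION },
    }
}

/// Consumer side: the variation decides how the payload bits are read.
/// Unknown variations are rejected instead of guessed.
fn decode(l: Literal) -> Option<Scalar> {
    match l {
        Literal::I8 { value, variation: DEFAULT_VARIATION } => Some(Scalar::Int8(value)),
        Literal::I8 { value, variation: UNSIGNED_VARIATION } => Some(Scalar::UInt8(value as u8)),
        Literal::I64 { value, variation: DEFAULT_VARIATION } => Some(Scalar::Int64(value)),
        Literal::I64 { value, variation: UNSIGNED_VARIATION } => Some(Scalar::UInt64(value as u64)),
        _ => None,
    }
}
```

One consequence worth weighing: `encode(Scalar::UInt8(200))` stores the payload `-56`, so a consumer that ignores the variation would silently read a different number. The round trip is only lossless when both sides interpret variation "1" the same way.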
Describe alternatives you've considered
N/A
Additional context
This issue was originally posted in datafusion-contrib/datafusion-substrait#41