Substrait: Add support for unsigned primitive and literal #5169

Closed
waynexia opened this issue Feb 3, 2023 · 0 comments · Fixed by #5448
Labels
enhancement New feature or request substrait

waynexia commented Feb 3, 2023

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Extend Substrait's literal types to cover the unsigned variants.

Describe the solution you'd like

The literal types in the following two systems are not aligned:

One gap is that unsigned primitives are not part of Substrait's definition. In substrait-io/substrait#2 (comment) they were cataloged as "third party extension defined types", but they are widely used in Arrow and DataFusion. I'd like to discuss how they should be integrated here.

For types we can use a type extension. As I did in greptimedb, I use variation "1" of the signed types (I8, I16, I32, I64) and translate it to the corresponding unsigned type. I think we can do the same here.

    Kind::I8(desc) => substrait_kind!(desc, int8_datatype, uint8_datatype),
    Kind::I16(desc) => substrait_kind!(desc, int16_datatype, uint16_datatype),
    Kind::I32(desc) => substrait_kind!(desc, int32_datatype, uint32_datatype),
    Kind::I64(desc) => substrait_kind!(desc, int64_datatype, uint64_datatype),
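
To make the variation mapping above concrete, here is a minimal, self-contained sketch. It is not the greptimedb or DataFusion code: `Kind`, `IntType`, `to_int_type`, and the two variation constants are hypothetical names chosen for illustration. It only assumes, as described above, that variation 0 means the signed type and variation 1 means the unsigned one.

```rust
/// Hypothetical stand-in for the Substrait integer type kinds.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    I8,
    I16,
    I32,
    I64,
}

/// Hypothetical stand-in for the Arrow integer data types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum IntType {
    Int8,
    UInt8,
    Int16,
    UInt16,
    Int32,
    UInt32,
    Int64,
    UInt64,
}

/// Assumption: variation 0 is the default (signed) interpretation.
const DEFAULT_VARIATION: u32 = 0;
/// Assumption: variation 1 is reserved for the unsigned interpretation.
const UNSIGNED_VARIATION: u32 = 1;

/// Resolve a Substrait kind plus its type variation into a concrete
/// integer type, rejecting variations this sketch does not know about.
fn to_int_type(kind: Kind, variation: u32) -> Result<IntType, String> {
    match (kind, variation) {
        (Kind::I8, DEFAULT_VARIATION) => Ok(IntType::Int8),
        (Kind::I8, UNSIGNED_VARIATION) => Ok(IntType::UInt8),
        (Kind::I16, DEFAULT_VARIATION) => Ok(IntType::Int16),
        (Kind::I16, UNSIGNED_VARIATION) => Ok(IntType::UInt16),
        (Kind::I32, DEFAULT_VARIATION) => Ok(IntType::Int32),
        (Kind::I32, UNSIGNED_VARIATION) => Ok(IntType::UInt32),
        (Kind::I64, DEFAULT_VARIATION) => Ok(IntType::Int64),
        (Kind::I64, UNSIGNED_VARIATION) => Ok(IntType::UInt64),
        (k, v) => Err(format!("unsupported type variation {v} for {k:?}")),
    }
}
```

Returning an error for unknown variations, rather than silently falling back to the signed type, keeps a plan produced by another system from being misread.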

For literals we can do something similar, with one difference: literals also carry the data values. I plan to transmute between the signed and unsigned values, but I'm not sure this is the best way to do it. As far as I can tell from my investigation, duckdb doesn't cover this part either:

  case substrait::Expression_Literal::LiteralTypeCase::kI8:
    dval = Value::TINYINT(slit.i8());
    break;
  case substrait::Expression_Literal::LiteralTypeCase::kI32:
    dval = Value::INTEGER(slit.i32());
    break;
  case substrait::Expression_Literal::LiteralTypeCase::kI64:
    dval = Value::BIGINT(slit.i64());
    break;
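
As a rough illustration of the transmute idea for literal values, the sketch below reinterprets the bits of an unsigned value as the signed type of the same width, and back. The function names are hypothetical, and this is only one possible approach, not something either project is confirmed to do. It relies on Rust's `as` casts between same-width integer types preserving the bit pattern.

```rust
/// Store a u8 in a signed 8-bit slot by reinterpreting its bits.
fn encode_u8(value: u8) -> i8 {
    value as i8
}

/// Recover the original u8 from the signed slot.
fn decode_u8(stored: i8) -> u8 {
    stored as u8
}

/// Store a u64 in a signed 64-bit slot by reinterpreting its bits.
fn encode_u64(value: u64) -> i64 {
    value as i64
}

/// Recover the original u64 from the signed slot.
fn decode_u64(stored: i64) -> u64 {
    stored as u64
}
```

With this scheme, `encode_u8(200)` is stored as `-56` and `encode_u64(u64::MAX)` as `-1`, and decoding restores the original values. A consumer that ignores the type variation would read the signed number instead, which is the main risk of this approach.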

Describe alternatives you've considered
N/A

Additional context
This issue was originally posted in datafusion-contrib/datafusion-substrait#41
