feat: update operator to update a table (#734)
scgkiran authored Nov 12, 2024
1 parent 5b70acd commit adb1079
Showing 2 changed files with 52 additions and 0 deletions.
26 changes: 26 additions & 0 deletions proto/substrait/algebra.proto
@@ -531,6 +531,7 @@ message Rel {
    ReferenceRel reference = 21;
    WriteRel write = 19;
    DdlRel ddl = 20;
    UpdateRel update = 22;
    // Physical relations
    HashJoinRel hash_join = 13;
    MergeJoinRel merge_join = 14;
@@ -664,6 +665,31 @@ message WriteRel {
  }
}

// The operator that modifies the columns of a table
message UpdateRel {
  oneof update_type {
    NamedTable named_table = 1;
  }

  NamedStruct table_schema = 2; // The full schema of the named_table
  Expression condition = 3; // condition to be met for the update to be applied on a record

  // The list of transformations to apply to the columns of the named_table
  repeated TransformExpression transformations = 4;

  message TransformExpression {
    Expression transformation = 1; // the transformation to apply
    int32 column_target = 2; // index of the column to apply the transformation to
  }
}

// A base table. The list of string is used to represent namespacing (e.g., mydb.mytable).
// This assumes shared catalog between systems exchanging a message.
message NamedTable {
  repeated string names = 1;
  substrait.extensions.AdvancedExtension advanced_extension = 10;
}

// Hash joins and merge joins are a specialization of the general join where the join
// expression is an series of comparisons between fields that are ANDed together. The
// behavior of this comparison is flexible
26 changes: 26 additions & 0 deletions site/docs/relations/logical_relations.md
Original file line number Diff line number Diff line change
@@ -474,6 +474,32 @@ Write definition types are built by the community and added to the specification
| Format | Enumeration of available formats. Only current option is PARQUET. | Required |


## Update Operator

The update operator applies a set of column transformations on a named table and writes to a storage.

| Signature | Value |
| -------------------- |---------------------------------------|
| Inputs | 0 |
| Outputs | 1 |
| Property Maintenance | Output is number of modified records |

### Update Properties

| Property | Description | Required |
|------------------------|--------------------------------------------------------------------------------------|--------------------------------------------------|
| Update Type | Definition of which object we are operating on (e.g., a fully-qualified table name). | Required |
| Table Schema | The names and types of all the columns of the input table | Required |
| Update Condition | The condition that must be met for a record to be updated. | Required |
| Update Transformations | The set of column updates to be applied to the table. | Required |
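
As a rough illustration of how the four properties above map onto the `UpdateRel` fields, the following sketch builds a JSON-shaped payload in Python. It assumes protobuf's standard JSON mapping (lowerCamelCase field names). The table name, column names, and literal values are hypothetical. The `names`/`struct` layout of `NamedStruct` and the `literal` expression shapes are not part of this change and are written here as assumptions about the wider Substrait schema.

```python
import json

# Hypothetical payload, roughly "UPDATE mydb.mytable SET qty = 0 WHERE true".
update_rel = {
    # Update Type: the object being updated (oneof update_type -> named_table).
    "namedTable": {"names": ["mydb", "mytable"]},
    # Table Schema: names and types of all columns (NamedStruct layout is assumed).
    "tableSchema": {
        "names": ["id", "qty"],
        "struct": {
            "types": [
                {"i32": {"nullability": "NULLABILITY_REQUIRED"}},
                {"i32": {"nullability": "NULLABILITY_REQUIRED"}},
            ]
        },
    },
    # Update Condition: a literal `true` stands in for a real predicate.
    "condition": {"literal": {"boolean": True}},
    # Update Transformations: one entry per column to overwrite.
    "transformations": [
        {
            "transformation": {"literal": {"i32": 0}},
            "columnTarget": 1,  # assumed zero-based index, so this targets "qty"
        }
    ],
}

# UpdateRel is carried in the `update` field (tag 22) of Rel.
rel = {"update": update_rel}
encoded = json.dumps(rel, indent=2)
```

A real plan would normally use a field-reference or function expression for the condition and transformations; literals are used here only to keep the sketch small.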

=== "UpdateRel Message"

```proto
%%% proto.algebra.UpdateRel %%%
```
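
The semantics described above (filter records by a condition, overwrite the targeted columns, report the number of modified records) can be sketched with a small in-memory model. This is not Substrait code: Python callables stand in for `Expression`, and `apply_update` is a hypothetical helper. Two points are assumptions rather than something the change states: `column_target` is treated as a zero-based index, and every transformation is evaluated against the record's pre-update values, as SQL `UPDATE` does.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence

Row = List[Any]


@dataclass
class TransformExpression:
    """Mirrors UpdateRel.TransformExpression; a callable stands in for Expression."""
    transformation: Callable[[Sequence[Any]], Any]
    column_target: int  # assumed zero-based column index


def apply_update(
    rows: List[Row],
    condition: Callable[[Sequence[Any]], bool],
    transformations: List[TransformExpression],
) -> int:
    """Update matching rows in place and return how many were modified."""
    modified = 0
    for row in rows:
        if not condition(row):
            continue
        # Snapshot the record so each transformation sees pre-update values
        # (an assumption borrowed from SQL UPDATE semantics).
        original = tuple(row)
        for t in transformations:
            row[t.column_target] = t.transformation(original)
        modified += 1
    return modified


# Hypothetical table with columns (name, qty): double qty where qty >= 5.
table = [["a", 1], ["b", 5], ["c", 10]]
count = apply_update(
    table,
    condition=lambda r: r[1] >= 5,
    transformations=[TransformExpression(lambda r: r[1] * 2, column_target=1)],
)
```

The returned count corresponds to the single output listed in the signature table (the number of modified records).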


## DDL (Data Definition Language) Operator

The operator that defines modifications of a database schema (CREATE/DROP/ALTER for TABLE and VIEWS).
