[SPARK-52219][SQL] Schema level collation support for tables #50937

Closed

ilicmarkodb
Contributor

@ilicmarkodb commented on May 19, 2025

What changes were proposed in this pull request?

Added support for setting and altering a default collation at the schema level.
Tables that do not specify their own collation will now automatically inherit the schema-level default.

CREATE SCHEMA schema1 DEFAULT COLLATION UTF8_LCASE; CREATE TABLE t1 (c1 STRING);
The default collation for table t1 will be UTF8_LCASE, and the data type of c1 will be STRING COLLATE UTF8_LCASE.
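
To make the inheritance rule concrete, here is a minimal Python sketch of the resolution order. It is a toy model, not Spark's implementation. The function names (resolve_string_collation, string_type) are hypothetical, and the precedence (explicit column collation, then table default, then schema default, then UTF8_BINARY) is an assumption based on the description above.

```python
from typing import Optional

# Assumption: UTF8_BINARY is what a STRING column gets when nothing is specified.
SYSTEM_DEFAULT = "UTF8_BINARY"


def resolve_string_collation(
    column_collation: Optional[str] = None,
    table_default: Optional[str] = None,
    schema_default: Optional[str] = None,
) -> str:
    """Pick the collation for a STRING column (hypothetical helper).

    Assumed precedence, most specific first: column, table, schema, system.
    """
    for candidate in (column_collation, table_default, schema_default):
        if candidate is not None:
            return candidate
    return SYSTEM_DEFAULT


def string_type(collation: str) -> str:
    """Render the column type the way the PR description writes it."""
    if collation == SYSTEM_DEFAULT:
        return "STRING"
    return f"STRING COLLATE {collation}"


# Mirrors the example above: schema1 defaults to UTF8_LCASE,
# while t1 and c1 specify no collation of their own.
c1 = resolve_string_collation(schema_default="UTF8_LCASE")
print(string_type(c1))  # STRING COLLATE UTF8_LCASE
```

In this model a table-level default still wins over the schema-level one, which matches the statement that only tables without their own collation inherit from the schema.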

Follow-up PRs will add support for views and UDFs.

Why are the changes needed?

New feature.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Tests added to DefaultCollationTestSuite.scala.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions bot added the SQL label on May 19, 2025
@ilicmarkodb force-pushed the schema_level_collation branch 14 times, most recently from e3c2bac to c6387e2 (May 19, 2025 23:12)
@ilicmarkodb force-pushed the schema_level_collation branch 8 times, most recently from 5c5f6e5 to 3b2b537 (May 20, 2025 14:18)
@ilicmarkodb
Contributor Author

@cloud-fan can you please review?

@ilicmarkodb force-pushed the schema_level_collation branch 2 times, most recently from 1a66b0a to 22e1433 (May 21, 2025 10:00)
@ilicmarkodb force-pushed the schema_level_collation branch from 22e1433 to 9303ab3 (May 21, 2025 14:34)
@ilicmarkodb force-pushed the schema_level_collation branch from 9303ab3 to 9f7ae53 (May 21, 2025 14:37)
@ilicmarkodb requested a review from cloud-fan on May 21, 2025 14:38
@ilicmarkodb force-pushed the schema_level_collation branch from 9f7ae53 to b8c9e84 (May 21, 2025 22:24)
@ilicmarkodb force-pushed the schema_level_collation branch 6 times, most recently from e54edc7 to 9980b52 (May 22, 2025 14:19)
@ilicmarkodb force-pushed the schema_level_collation branch 2 times, most recently from 958255d to b41c819 (May 22, 2025 16:23)
@ilicmarkodb force-pushed the schema_level_collation branch 2 times, most recently from fbc65dc to c45ec0b (May 22, 2025 17:17)
@ilicmarkodb force-pushed the schema_level_collation branch from c45ec0b to fef0bd3 (May 23, 2025 07:58)
@ilicmarkodb requested a review from cloud-fan on May 23, 2025 11:38
@cloud-fan
Contributor

Thanks, merging to master!

@cloud-fan closed this in 858fbcc on May 23, 2025
ilicmarkodb added a commit to ilicmarkodb/spark that referenced this pull request May 23, 2025
Closes apache#50937 from ilicmarkodb/schema_level_collation.

Authored-by: ilicmarkodb <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
3 participants