Add option to disable writing of Parquet offset index #6797

Merged: 7 commits, Nov 27, 2024

@etseidl (Contributor) commented Nov 25, 2024

Which issue does this PR close?

Closes #6778.

Rationale for this change

Allow disabling offset indexes.

What changes are included in this PR?

Adds a global offset_index_disabled writer property. This was made global rather than per-column because the index is not very useful if not defined for all columns.
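
To illustrate what a global (rather than per-column) switch looks like, here is a minimal sketch. Only the property name `offset_index_disabled` comes from this PR; `WriterProps`, `WriterPropsBuilder`, the setter name, and `offset_indexes_to_write` are hypothetical stand-ins, not the parquet crate's real types.

```rust
// Hypothetical stand-ins for illustration only; these are NOT the
// parquet crate's actual types or method signatures.

/// Writer-wide settings. The flag is global: it applies to every column.
#[derive(Debug, Clone, PartialEq)]
pub struct WriterProps {
    offset_index_disabled: bool,
}

/// Builder with the flag defaulting to `false` (offset indexes are written).
#[derive(Debug, Default)]
pub struct WriterPropsBuilder {
    offset_index_disabled: bool,
}

impl WriterPropsBuilder {
    /// Assumed setter name, modelled on the property name from the PR.
    pub fn set_offset_index_disabled(mut self, value: bool) -> Self {
        self.offset_index_disabled = value;
        self
    }

    pub fn build(self) -> WriterProps {
        WriterProps {
            offset_index_disabled: self.offset_index_disabled,
        }
    }
}

impl WriterProps {
    pub fn builder() -> WriterPropsBuilder {
        WriterPropsBuilder::default()
    }

    pub fn offset_index_disabled(&self) -> bool {
        self.offset_index_disabled
    }
}

/// How many offset indexes a writer would emit for `num_columns` columns:
/// either one per column or none at all, never a partial set.
pub fn offset_indexes_to_write(props: &WriterProps, num_columns: usize) -> usize {
    if props.offset_index_disabled() {
        0
    } else {
        num_columns
    }
}
```

Because the flag lives on the writer rather than on a column, a file either carries an offset index for every column or for none, which matches the rationale given above.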

Are there any user-facing changes?

Yes, adds a new property.

@github-actions bot added the "parquet" label (Changes to the parquet crate) on Nov 25, 2024
}
}

/// Disable writing of offset indexes.

Contributor

What is the motivation for building this into the builder, as opposed to simply wrapping the OffsetIndexBuilder in Option?

Contributor Author

Analogy with column index builder? Lack of imagination? 🤷 Changed.

Contributor Author

Should column_index_builder get similar treatment?

Contributor

Possibly, although probably not worth making a breaking change over if it is public

Contributor Author

I never realized that the builders were public. Guess it's not worth the refactor to make column_index_builder an Option too.
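
The Option-based approach suggested in this thread can be sketched roughly as follows. This is a simplified model, not the code merged in the PR: `PageLocation`, `OffsetIndexBuilder`, and `ColumnWriter` here are hypothetical stand-ins with invented fields and methods, kept only detailed enough to show the pattern.

```rust
// Hypothetical stand-ins; the real parquet types differ.

/// Location of one data page within the file.
#[derive(Debug, Clone, PartialEq)]
pub struct PageLocation {
    pub offset: i64,
    pub compressed_size: i32,
    pub first_row_index: i64,
}

/// Accumulates page locations for a single column chunk.
/// Note that it knows nothing about being enabled or disabled.
#[derive(Debug, Default)]
pub struct OffsetIndexBuilder {
    locations: Vec<PageLocation>,
    next_row: i64,
}

impl OffsetIndexBuilder {
    pub fn append(&mut self, offset: i64, compressed_size: i32, num_rows: i64) {
        self.locations.push(PageLocation {
            offset,
            compressed_size,
            first_row_index: self.next_row,
        });
        self.next_row += num_rows;
    }

    pub fn build(self) -> Vec<PageLocation> {
        self.locations
    }
}

/// The writer holds `Option<OffsetIndexBuilder>`; `None` means "disabled".
pub struct ColumnWriter {
    offset_index_builder: Option<OffsetIndexBuilder>,
}

impl ColumnWriter {
    pub fn new(offset_index_disabled: bool) -> Self {
        let offset_index_builder = if offset_index_disabled {
            None
        } else {
            Some(OffsetIndexBuilder::default())
        };
        Self { offset_index_builder }
    }

    /// Records the page location only when a builder is present.
    pub fn write_page(&mut self, offset: i64, compressed_size: i32, num_rows: i64) {
        if let Some(builder) = self.offset_index_builder.as_mut() {
            builder.append(offset, compressed_size, num_rows);
        }
    }

    /// Returns the finished index, or `None` when it was disabled.
    pub fn close(self) -> Option<Vec<PageLocation>> {
        self.offset_index_builder.map(OffsetIndexBuilder::build)
    }
}
```

The design point is that the "disabled" state lives in the type of the field that owns the builder, so the builder itself needs no extra flag and every use site has to handle the `None` case explicitly.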

parquet/src/file/properties.rs (outdated, resolved)
Co-authored-by: Raphael Taylor-Davies <[email protected]>
@tustvold merged commit 93e6749 into apache:main on Nov 27, 2024
16 checks passed
@etseidl deleted the disable_offset_index branch on November 27, 2024 19:45
Successfully merging this pull request may close these issues.

Allow disabling the writing of Parquet Offset Index
2 participants