Simplify ParquetRecordBatchReader::next control logic #7512

Merged: 1 commit merged into apache:main on May 16, 2025

Conversation

@alamb alamb commented May 15, 2025

Which issue does this PR close?

Rationale for this change

As part of #7456, we will likely need to change the control logic in ParquetRecordBatchReader.

However, the current code is cumbersome to follow because every Err has to be mapped into a Some(Err(..)) to conform to the Iterator interface.

What changes are included in this PR?

Move the control logic into a separate function that returns Result<Option<..>>, so that the ? operator can be used to make the control flow clearer.
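To illustrate the pattern outside of the parquet crate, here is a minimal sketch under stated assumptions: `ChunkReader`, `ReadError`, and the parsing step are hypothetical stand-ins, not the actual reader code. The inner function returns `Result<Option<..>>` so that `?` can be used, and `Iterator::next` only flips the result with `transpose()`.

```rust
/// Hypothetical error type standing in for the reader's real error.
#[derive(Debug, PartialEq)]
pub struct ReadError(pub String);

/// Hypothetical reader that yields integers parsed from string chunks.
pub struct ChunkReader {
    chunks: Vec<&'static str>,
    pos: usize,
}

impl ChunkReader {
    pub fn new(chunks: Vec<&'static str>) -> Self {
        Self { chunks, pos: 0 }
    }

    /// All control logic lives here. Returning `Result<Option<..>>`
    /// (rather than `Option<Result<..>>`) lets fallible steps use `?`.
    fn next_inner(&mut self) -> Result<Option<i64>, ReadError> {
        // End of input is a normal, non-error outcome: Ok(None).
        let Some(&chunk) = self.chunks.get(self.pos) else {
            return Ok(None);
        };
        self.pos += 1;

        // A failing step is propagated with `?` instead of being
        // wrapped by hand as `return Some(Err(..))`.
        let value = chunk
            .parse::<i64>()
            .map_err(|e| ReadError(format!("bad chunk {chunk:?}: {e}")))?;

        Ok(Some(value))
    }
}

impl Iterator for ChunkReader {
    type Item = Result<i64, ReadError>;

    fn next(&mut self) -> Option<Self::Item> {
        // Result<Option<T>, E> -> Option<Result<T, E>>
        self.next_inner().transpose()
    }
}
```

With this shape, a caller iterating over `ChunkReader::new(vec!["1", "x", "3"])` sees an `Ok`, then an `Err` for the unparsable chunk, then another `Ok`, and finally `None`.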

Are there any user-facing changes?

@github-actions github-actions bot added the parquet Changes to the parquet crate label May 15, 2025
@alamb alamb changed the title Simplify ParquetRecordBatchReader control logic Simplify ParquetRecordBatchReader::next control logic May 15, 2025
@@ -800,24 +800,33 @@ impl Iterator for ParquetRecordBatchReader {
    type Item = Result<RecordBatch, ArrowError>;

    fn next(&mut self) -> Option<Self::Item> {
        self.next_inner()
Contributor Author

Basically, this PR consolidates the conversion to ParquetError and the transposing into a single location, rather than inlining them in several places in next().
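A rough sketch of what "a single location" can look like, assuming two hypothetical error types (`InnerError`, `OuterError`) and a toy `Countdown` iterator; none of these names come from the PR, and the direction of the conversion here is only illustrative. The error conversion and the `transpose()` both happen once, in `next()`:

```rust
/// Hypothetical internal error used by the control logic.
#[derive(Debug, PartialEq)]
pub enum InnerError {
    General(String),
}

/// Hypothetical public error exposed through the iterator.
#[derive(Debug, PartialEq)]
pub enum OuterError {
    External(String),
}

impl From<InnerError> for OuterError {
    fn from(e: InnerError) -> Self {
        match e {
            InnerError::General(msg) => OuterError::External(msg),
        }
    }
}

/// Toy iterator that counts down and (by assumption) fails on the value 1.
pub struct Countdown {
    remaining: u32,
}

impl Countdown {
    pub fn new(start: u32) -> Self {
        Self { remaining: start }
    }

    /// Control logic only deals with the internal error type.
    fn step(&mut self) -> Result<Option<u32>, InnerError> {
        if self.remaining == 0 {
            return Ok(None);
        }
        self.remaining -= 1;
        if self.remaining == 1 {
            return Err(InnerError::General("hit 1".to_string()));
        }
        Ok(Some(self.remaining))
    }
}

impl Iterator for Countdown {
    type Item = Result<u32, OuterError>;

    fn next(&mut self) -> Option<Self::Item> {
        // The only place where errors are converted and the
        // Result/Option nesting is flipped.
        self.step().map_err(OuterError::from).transpose()
    }
}
```

Keeping the conversion at this one boundary means any new branch added to `step` needs no error-wrapping of its own.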

"Struct array reader should return struct array".to_string(),
)
});
let array = self.array_reader.consume_batch()?;
Contributor Author

I think this is a pretty good example of how the logic gets simpler

@alamb alamb force-pushed the alamb/simplify_batch_reader branch from 674748a to 7121c54 Compare May 15, 2025 17:18
@alamb alamb marked this pull request as ready for review May 15, 2025 19:25
Contributor

@zhuqi-lucas zhuqi-lucas left a comment

LGTM, thank you @alamb. It makes sense that we throw the error; it's clear.

Contributor Author

alamb commented May 16, 2025

Thank you for the review @zhuqi-lucas

@crepererum crepererum merged commit 1a5999a into apache:main May 16, 2025
16 checks passed
alamb added a commit to alamb/arrow-rs that referenced this pull request May 16, 2025
3 participants