Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HBASE-29039 Optimize read performance for accumulated delete markers on the same row or cell #6557

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

EungsopYoo
Copy link
Contributor

No description provided.

@EungsopYoo EungsopYoo changed the title Optimize read performance for accumulated delete markers on the same row or cell HBASE-29039 Optimize read performance for accumulated delete markers on the same row or cell Dec 23, 2024
@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+1 💚 mvninstall 3m 51s master passed
+1 💚 compile 3m 33s master passed
+1 💚 checkstyle 0m 39s master passed
+1 💚 spotbugs 1m 50s master passed
+1 💚 spotless 0m 47s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 39s the patch passed
+1 💚 compile 3m 30s the patch passed
+1 💚 javac 3m 30s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 38s /results-checkstyle-hbase-server.txt hbase-server: The patch generated 5 new + 8 unchanged - 0 fixed = 13 total (was 8)
+1 💚 spotbugs 1m 45s the patch passed
+1 💚 hadoopcheck 15m 22s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 spotless 1m 8s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 17s The patch does not generate ASF License warnings.
46m 37s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6557/6/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #6557
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 9b584a4c6b5b 5.4.0-200-generic #220-Ubuntu SMP Fri Sep 27 13:19:16 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 80044c2
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 83 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6557/6/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
-0 ⚠️ yetus 0m 2s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 4m 2s master passed
+1 💚 compile 1m 8s master passed
+1 💚 javadoc 0m 34s master passed
+1 💚 shadedjars 6m 7s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 37s the patch passed
+1 💚 compile 1m 7s the patch passed
+1 💚 javac 1m 7s the patch passed
+1 💚 javadoc 0m 30s the patch passed
+1 💚 shadedjars 5m 52s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 181m 4s hbase-server in the patch passed.
208m 40s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6557/6/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #6557
Optional Tests javac javadoc unit compile shadedjars
uname Linux 8aa279d51e08 5.4.0-200-generic #220-Ubuntu SMP Fri Sep 27 13:19:16 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 80044c2
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6557/6/testReport/
Max. process+thread count 5216 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6557/6/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache9
Copy link
Contributor

Apache9 commented Dec 29, 2024

I checked the code, we do have logic to seek to next row or column when we hit a delte family cell.

But the problem is that, seems we will return earlier before we actually call this method here

The above code block

    if (PrivateCellUtil.isDelete(typeByte)) {
      boolean includeDeleteMarker =
        seePastDeleteMarkers ? tr.withinTimeRange(timestamp) : tr.withinOrAfterTimeRange(timestamp);
      if (includeDeleteMarker) {
        this.deletes.add(cell);
      }
      return MatchCode.SKIP;
    }

Seems incorrect, we will always return MatchCode.SKIP if we get a delete maker...

I think why we do not find this before is that, usually there will be only one delete maker, so when we check the next cell, we will fall through and call the checkDeleted method so we will seek to next row or column.

Here the scenario is that we have bunch of delete makrer, then here we will see them all instead of seek to next row or column, since we will always go into the code block above and return MatchCode.SKIP.

I think we should try to optimize the logic of the above code block.

Thanks.

@virajjasani
Copy link
Contributor

FYI @kadirozde

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants