domain-driven appends #227

Merged
merged 8 commits into from
Nov 27, 2023

Jiaweihu08 (Member) commented Oct 31, 2023

Description

The PR modifies the way appends are done through the following changes:

  1. Use the existing index during partition-level domain estimation.
    When a cube c already exists as an inner cube, an element e should be accepted if e.weight < c.existingWeight and count < groupCubeSize.
    When a cube c already exists as a leaf cube, its WeightAndCount should start with its existing group size and add elements until it's full.
  2. Compute mergedCubeDomains through a cube-wise sum of the existing domain and the append domains.
  3. Compute cube weights from the mergedCubeDomains using Wc = Wpc + dcs / domain.
  4. Remove the weight merge that was previously done through harmonic means. The weights estimated in step 3 are final.
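
The steps above can be sketched in a few lines of Scala. This is only an illustration under stated assumptions, not the code merged in this PR: `CubeKey` and every function name below are hypothetical, `Wpc` is read as the weight of the parent cube, `dcs` as the desired cube size, and the root cube is assumed to have a parent weight of 0.

```scala
object DomainDrivenAppendSketch {

  // Hypothetical cube identifier: the path from the root, with "" as the root.
  final case class CubeKey(path: String) {
    def depth: Int = path.length
    def parent: Option[CubeKey] =
      if (path.isEmpty) None else Some(CubeKey(path.dropRight(1)))
  }

  // Step 1, existing inner cube: accept only elements lighter than the cube's
  // existing weight, and only while the group still has room.
  def acceptsIntoInnerCube(
      elementWeight: Double,
      existingWeight: Double,
      count: Int,
      groupCubeSize: Int): Boolean =
    elementWeight < existingWeight && count < groupCubeSize

  // Step 1, existing leaf cube: the counter starts at the existing group size,
  // so this is how many more elements fit before the group is full.
  def remainingLeafCapacity(existingGroupSize: Int, groupCubeSize: Int): Int =
    math.max(0, groupCubeSize - existingGroupSize)

  // Step 2: cube-wise sum of the existing domains and the append domains.
  def mergeDomains(
      existing: Map[CubeKey, Double],
      appended: Map[CubeKey, Double]): Map[CubeKey, Double] =
    (existing.keySet ++ appended.keySet).map { cube =>
      cube -> (existing.getOrElse(cube, 0.0) + appended.getOrElse(cube, 0.0))
    }.toMap

  // Step 3: Wc = Wpc + dcs / domain, visiting parents before children.
  // Assumption: a cube without a known parent weight (the root) uses Wpc = 0.
  def cubeWeights(
      mergedDomains: Map[CubeKey, Double],
      desiredCubeSize: Double): Map[CubeKey, Double] =
    mergedDomains.toSeq
      .sortBy { case (cube, _) => cube.depth }
      .foldLeft(Map.empty[CubeKey, Double]) { case (weights, (cube, domain)) =>
        val parentWeight = cube.parent.flatMap(weights.get).getOrElse(0.0)
        weights + (cube -> (parentWeight + desiredCubeSize / domain))
      }
}
```

For instance, with a merged domain of 200 for the root and 50 for one of its children, and a desired cube size of 50, this sketch gives the root a weight of 0.25 and the child a weight of 1.25. No harmonic-mean merge is applied afterwards, in line with step 4.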

Fixes #226
The same test from #226 shows that, with these changes, only 0.16% of the append blocks don't update the maxWeight of their corresponding cubes.

Checklist:

  • New feature / bug fix has been committed following the Contribution guide.
  • Add comments to the code (make it easier for the community!).
  • Change the documentation.
  • Add tests.
  • Your branch is updated to the main branch (dependent changes have been merged).
  • Further analysis of its impact on Replication.

@Jiaweihu08 Jiaweihu08 requested a review from cugni October 31, 2023 16:45

codecov bot commented Oct 31, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (2d39821) 90.86% compared to head (f2fc149) 92.07%.
Report is 21 commits behind head on main-1.0.0.

❗ Current head f2fc149 differs from pull request most recent head ed25729. Consider uploading reports for the commit ed25729 to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| ...io/qbeast/core/model/BroadcastedTableChanges.scala | 87.50% | 1 Missing ⚠️ |
| ...cala/io/qbeast/spark/index/OTreeDataAnalyzer.scala | 98.11% | 1 Missing ⚠️ |
Additional details and impacted files
@@              Coverage Diff               @@
##           main-1.0.0     #227      +/-   ##
==============================================
+ Coverage       90.86%   92.07%   +1.21%     
==============================================
  Files              93       88       -5     
  Lines            2386     2272     -114     
  Branches          178      176       -2     
==============================================
- Hits             2168     2092      -76     
+ Misses            218      180      -38     


@cugni cugni marked this pull request as ready for review November 27, 2023 14:42
@Jiaweihu08 Jiaweihu08 changed the base branch from main to main-1.0.0 November 27, 2023 14:51
@osopardo1 osopardo1 merged commit 36fcbf2 into Qbeast-io:main-1.0.0 Nov 27, 2023
This was referenced Mar 13, 2024
@Jiaweihu08 Jiaweihu08 deleted the dda branch April 5, 2024 08:44
Development

Successfully merging this pull request may close these issues.

Appends that don't update cube weights