pageserver: reduce default compaction_upper_limit to 20 (#10889)
## Problem

We've seen the previous default of 50 cause OOMs. Compacting many L0
layers at once now has limited benefit, since the cost is mostly linear
anyway. Production settings already override this limit to 20.

## Summary of changes

Reduce `DEFAULT_COMPACTION_UPPER_LIMIT` to 20.

Once released, let's remove the config overrides.
erikgrinaker authored Feb 19, 2025
1 parent 2d96134 commit 0453eaf
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions libs/pageserver_api/src/config.rs
@@ -544,10 +544,11 @@ pub mod tenant_conf_defaults {
     pub const DEFAULT_COMPACTION_PERIOD: &str = "20 s";
     pub const DEFAULT_COMPACTION_THRESHOLD: usize = 10;

-    // This value needs to be tuned to avoid OOM. We have 3/4 of the total CPU threads to do background works, that's 16*3/4=9 on
-    // most of our pageservers. Compaction ~50 layers requires about 2GB memory (could be reduced later by optimizing L0 hole
-    // calculation to avoid loading all keys into the memory). So with this config, we can get a maximum peak compaction usage of 18GB.
-    pub const DEFAULT_COMPACTION_UPPER_LIMIT: usize = 50;
+    // This value needs to be tuned to avoid OOM. We have 3/4*CPUs threads for L0 compaction, that's
+    // 3/4*16=9 on most of our pageservers. Compacting 20 layers requires about 1 GB memory (could
+    // be reduced later by optimizing L0 hole calculation to avoid loading all keys into memory). So
+    // with this config, we can get a maximum peak compaction usage of 9 GB.
+    pub const DEFAULT_COMPACTION_UPPER_LIMIT: usize = 20;
     pub const DEFAULT_COMPACTION_L0_FIRST: bool = false;
     pub const DEFAULT_COMPACTION_L0_SEMAPHORE: bool = true;
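
The peak-memory estimate in the code comment is a simple product: concurrent compaction threads times memory per compaction. The sketch below reproduces that arithmetic for the old and new limits. The function `peak_compaction_gb` is a hypothetical helper written for illustration and is not part of the pageserver code; the inputs (9 threads, about 2 GB per 50-layer compaction, about 1 GB per 20-layer compaction) are taken as given from the comment. Note that 3/4 of 16 evaluates to 12 rather than 9, so the thread count here follows the comment's stated figure, not the formula.

```rust
/// Hypothetical helper (not in the pageserver codebase): worst-case memory
/// in GB if every compaction thread runs one compaction at the same time.
fn peak_compaction_gb(threads: u64, gb_per_compaction: u64) -> u64 {
    threads * gb_per_compaction
}

fn main() {
    // Thread count as quoted in the code comment (assumed, not derived here).
    let threads = 9;

    // Old default: ~50 layers per compaction, about 2 GB each (per the comment).
    let old_peak = peak_compaction_gb(threads, 2);
    // New default: 20 layers per compaction, about 1 GB each (per the comment).
    let new_peak = peak_compaction_gb(threads, 1);

    // prints "old peak: 18 GB, new peak: 9 GB"
    println!("old peak: {old_peak} GB, new peak: {new_peak} GB");

    // For reference, 3/4 of 16 CPUs in integer arithmetic; prints "12".
    println!("{}", 16 * 3 / 4);
}
```

If the thread count were 12 instead of 9, the same product would give 12 GB for the new limit, so the estimate scales directly with whichever concurrency figure actually applies.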


1 comment on commit 0453eaf

@github-actions

7697 tests run: 7314 passed, 1 failed, 382 skipped (full report)


Failures on Postgres 16

# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_branch_creation_many[release-pg16-github-actions-selfhosted-random-1024]"
Flaky tests (1)

Postgres 17

Code coverage* (full report)

  • functions: 32.9% (8621 of 26193 functions)
  • lines: 48.8% (72716 of 148858 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
0453eaf at 2025-02-19T16:52:57.486Z :recycle:
