-
Notifications
You must be signed in to change notification settings - Fork 510
db: edit default heuristic determining levels eligible for compaction #5187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
db: edit default heuristic determining levels eligible for compaction #5187
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)
testdata/compaction_picker_concurrency
line 52 at r1 (raw file):
Here's a link to a discussion about this when the same change was propsed before:
If Lbase has a high score and we are compacting out of Lbase and therefore unable to compact into Lbase, the new heuristic results in an intra-L0 compactions. Intra-L0 compactions are subjectively quite useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also makes much more sense to me conceptually, but I wonder if any workloads will regress..
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @jbowens and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed (commit messages unreviewed), 1 unresolved discussion (waiting on @RaduBerinde)
testdata/compaction_picker_concurrency
line 52 at r1 (raw file):
Previously, RaduBerinde wrote…
Here's a link to a discussion about this when the same change was propsed before:
If Lbase has a high score and we are compacting out of Lbase and therefore unable to compact into Lbase, the new heuristic results in an intra-L0 compactions. Intra-L0 compactions are subjectively quite useless.
I'm ok with keeping compensatedScore
only for L0 to determine eligibility for compaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I wonder if any workloads will regress..
Should the changes be made behind a config setting so they can be reverted if a problem arises? Sort of messy, but good to have controlled rollout of changes like this.
Reviewable status: all files reviewed (commit messages unreviewed), 1 unresolved discussion (waiting on @RaduBerinde)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed (commit messages unreviewed), 1 unresolved discussion (waiting on @RaduBerinde and @sumeerbhola)
testdata/compaction_picker_concurrency
line 52 at r1 (raw file):
Previously, sumeerbhola wrote…
I'm ok with keeping
compensatedScore
only for L0 to determine eligibility for compaction.
Hrm, I'm not sure the compensated score is the right mechanism to prevent those intra L0 compactions in that scenario.
If Lbase is large (so compensated score is potentially < 1.0) and L0 has significant garbage (compensated fill factor > 1.0), and we can pick a L0->Lbase compaction, we should.
If we want to discourage intra-L0 compactions specifically, maybe we should adjust the heuristics around intra-L0 compactions specifically.
7ef6f66
to
6eec91f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @RaduBerinde and @sumeerbhola)
testdata/compaction_picker_concurrency
line 52 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Hrm, I'm not sure the compensated score is the right mechanism to prevent those intra L0 compactions in that scenario.
If Lbase is large (so compensated score is potentially < 1.0) and L0 has significant garbage (compensated fill factor > 1.0), and we can pick a L0->Lbase compaction, we should.
If we want to discourage intra-L0 compactions specifically, maybe we should adjust the heuristics around intra-L0 compactions specifically.
Ah maybe this is moot--we don't do any compensation for L0 today.
scores[0] = candidateLevelInfo{
outputLevel: p.baseLevel,
fillFactor: l0FillFactor,
compensatedFillFactor: l0FillFactor, // No compensation for L0.
}
I adjusted the logic to explicitly avoid compacting L0 if its score is <1.0 for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the changes be made behind a config setting so they can be reverted if a problem arises? Sort of messy, but good to have controlled rollout of changes like this.
+1 since these compaction heuristic changes are so hard to fully predict
Reviewed 2 of 2 files at r2.
Reviewable status: all files reviewed (commit messages unreviewed), 4 unresolved discussions (waiting on @RaduBerinde)
-- commits
line 17 at r2:
nit: this paragraph is slightly stale
compaction_picker.go
line 1041 at r2 (raw file):
} const compensatedFillFactorThreshold = 1.0 // The level requires compaction iff the compensatedFillFactor is >= 1.0.
nit: may require? "iff" isn't quite right, since L0 has an additional condition below.
compaction_picker.go
line 1058 at r2 (raw file):
// after dividing by the next level's fill factor) is less than 1.0. // This helps avoid over eager L0 compaction picking (and intra L0 // compactions) when LBase is large.
nit: consider elaborating on the "intra L0 compaction when Lbase is large". The concern specifically is that there are compactions happening out of Lbase to Lbase+1 because Lbase is large, and we are not able to find a key range for an L0 => Lbase compaction.
6eec91f
to
7d313dd
Compare
When deciding whether a level is even eligible for compaction picking, we consider the 'compensated fill factor': a statistic about the size of a level plus its garbage relative to the level's ideal size given the level multiplier and database size. Previously, we considered a level eligible to be the input level of a compaction if the compensated fill factor is >= 1.0 AND the compensated fill factor divided by the next level's fill factor is also > 1.0. We've preserved this behavior for historical consistency but have no clear justification for it. Experimentally, we've observed instances where the LSM is known to contain significant garbage (>14%) but no compactions are pursued because most of the data is in L6 and L6's fill factor is positive (see cockroachdb/cockroach#151633). This commit adjusts the default behavior to only require a compensated score >= 1.0 for L0. Other levels only require a 'compensated fill factor' >= 1.0 to be eligible for compaction. A new experimental UseDeprecatedCompensatedScore Option is introduced to allow opting out of the new behavior at runtime. This option is intended as an escape hatch if we observe unintended side effects in production. Otherwise, it'll ultimately be removed.
7d313dd
to
28f63bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Added an experimental option to allow reverting to the previous behavior
Reviewable status: 1 of 3 files reviewed, 1 unresolved discussion (waiting on @RaduBerinde and @sumeerbhola)
Previously, sumeerbhola wrote…
nit: this paragraph is slightly stale
done
compaction_picker.go
line 1041 at r2 (raw file):
Previously, sumeerbhola wrote…
nit: may require? "iff" isn't quite right, since L0 has an additional condition below.
done
compaction_picker.go
line 1058 at r2 (raw file):
Previously, sumeerbhola wrote…
nit: consider elaborating on the "intra L0 compaction when Lbase is large". The concern specifically is that there are compactions happening out of Lbase to Lbase+1 because Lbase is large, and we are not able to find a key range for an L0 => Lbase compaction.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @RaduBerinde)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @RaduBerinde and @sumeerbhola)
In cockroachdb/pebble#5187 we altered the default compaction picking logic that determines whether a level is elgible for a default compaction. We added a knob to allow the configuration of this value at runtime in case the change has some unforeseen impact on some workloads. This commit connects this knob to a new cluster setting. This cluster setting is just an escape hatch that we don't anticipate needing. Epic: none Release note: none
When deciding whether a level is even eligible for compaction picking, we
consider the 'compensated fill factor': a statistic about the size of a level
plus its garbage relative to the level's ideal size given the level multiplier
and database size. Previously, we considered a level eligible to be the input
level of a compaction if the compensated fill factor is >= 1.0 AND the
compensated fill factor divided by the next level's fill factor is also > 1.0.
We've preserved this behavior for historical consistency but have no clear
justification for it. Experimentally, we've observed instances where the LSM is
known to contain significant garbage (>14%) but no compactions are pursued
because most of the data is in L6 and L6's fill factor is positive (see
cockroachdb/cockroach#151633).
This commit adjusts the default behavior to only require a compensated score >=
1.0 for L0. Other levels only require a 'compensated fill factor' >= 1.0 to be
eligible for compaction.
A new experimental UseDeprecatedCompensatedScore Option is introduced to allow
opting out of the new behavior at runtime. This option is intended as an escape
hatch if we observe unintended side effects in production. Otherwise, it'll
ultimately be removed.