
Commit 4a3e571 (parent: 2d545bb)

refactor: simplify lsm compaction and diagnostics

32 files changed: +2036 −2050 lines

db.go

Lines changed: 1 addition & 1 deletion

@@ -691,7 +691,7 @@ func (db *DB) RunValueLogGC(discardRatio float64) error {
 	if discardRatio >= 1.0 || discardRatio <= 0.0 {
 		return utils.ErrInvalidRequest
 	}
-	heads := db.lsm.Diagnostics().ValueLogHead
+	heads := db.lsm.ValueLogHeadSnapshot()
 	if len(heads) == 0 {
 		db.RLock()
 		if len(db.vheads) > 0 {

db_test.go

Lines changed: 5 additions & 5 deletions

@@ -761,7 +761,7 @@ func TestRecoveryRemovesStaleValueLogSegment(t *testing.T) {
 	removed := os.IsNotExist(err)
 	require.True(t, removed, "expected stale value log file to be deleted on recovery")
 
-	status := db2.lsm.Diagnostics().ValueLogStatus
+	status := db2.valueLogStatusSnapshot()
 	meta, ok := status[manifest.ValueLogID{Bucket: 0, FileID: staleFID}]
 	if ok {
 		require.False(t, meta.Valid)
@@ -798,7 +798,7 @@ func TestRecoveryRemovesOrphanValueLogSegment(t *testing.T) {
 	require.False(t, headPtr.IsZero(), "expected value log head to be initialized")
 	headCopy := headPtr
 	require.NoError(t, db.lsm.LogValueLogHead(&headCopy))
-	before := db.lsm.Diagnostics().ValueLogStatus
+	before := db.valueLogStatusSnapshot()
 	beforeInfo := make(map[manifest.ValueLogID]bool, len(before))
 	for id, meta := range before {
 		beforeInfo[id] = meta.Valid
@@ -812,9 +812,9 @@ func TestRecoveryRemovesOrphanValueLogSegment(t *testing.T) {
 	db2 := openTestDB(t, opt)
 	defer func() { _ = db2.Close() }()
 
-	diag := db2.lsm.Diagnostics()
-	headMeta, hasHead := diag.ValueLogHead[0]
-	status := diag.ValueLogStatus
+	heads := db2.getHeads()
+	headMeta, hasHead := heads[0]
+	status := db2.valueLogStatusSnapshot()
 	statusInfo := make(map[manifest.ValueLogID]bool, len(status))
 	for id, meta := range status {
 		statusInfo[id] = meta.Valid

docs/architecture.md

Lines changed: 8 additions & 8 deletions

@@ -67,18 +67,18 @@ iterator scan, distributed read/write via Raft apply), see
 - `CURRENT` provides crash-safe pointer updates for storage-engine metadata. Region descriptors are no longer stored in the storage manifest.
 
 ### 2.4 LSM Compaction & Ingest Buffer
-- `compact.Manager` drives compaction cycles; `lsm.levelManager` supplies table metadata and executes the plan.
-- Planning is split: `compact.PlanFor*` selects table IDs + key ranges, then LSM resolves IDs back to tables and runs the merge.
-- `compact.State` guards overlapping key ranges and tracks in-flight table IDs.
-- Ingest shard selection is policy-driven in `compact` (`PickShardOrder` / `PickShardByBacklog`) while the ingest buffer remains in `lsm`.
+- `lsm.compaction` drives compaction cycles; `lsm.levelManager` supplies table metadata and executes the plan.
+- Planning is split inside `lsm`: `PlanFor*` selects table IDs + key ranges, then LSM resolves IDs back to tables and runs the merge.
+- `lsm.State` guards overlapping key ranges and tracks in-flight table IDs.
+- Ingest shard selection is policy-driven in `lsm` (`PickShardOrder` / `PickShardByBacklog`) while the ingest buffer remains in `lsm`.
 
 ```mermaid
 flowchart TD
-    Manager["compact.Manager"] --> LSM["lsm.levelManager"]
-    LSM -->|TableMeta snapshot| Planner["compact.PlanFor*"]
-    Planner --> Plan["compact.Plan (fid+range)"]
+    Manager["lsm.compaction"] --> LSM["lsm.levelManager"]
+    LSM -->|TableMeta snapshot| Planner["PlanFor*"]
+    Planner --> Plan["lsm.Plan (fid+range)"]
     Plan -->|resolvePlanLocked| Exec["LSM executor"]
-    Exec --> State["compact.State guard"]
+    Exec --> State["lsm.State guard"]
     Exec --> Build["subcompact/build SST"]
     Build --> Manifest["manifest edits"]
     L0["L0 tables"] -->|moveToIngest| Ingest["ingest buffer shards"]
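The snapshot → plan → resolve split described in the updated doc can be sketched as follows. This is an illustrative reading of the design, not the engine's code: the `TableMeta`/`Plan` struct fields and the `planOverlap` helper are simplified stand-ins for the real `lsm.TableMeta`, `lsm.Plan`, and `PlanFor*` family.

```go
package main

import (
	"bytes"
	"fmt"
)

// TableMeta is an immutable snapshot of one SSTable's identity and key range,
// in the spirit of the metadata the level manager hands to the planner.
type TableMeta struct {
	FID               uint64
	Smallest, Largest []byte
}

// Plan carries only table IDs plus the covered key range; the executor later
// resolves the IDs back to live table handles under its own lock.
type Plan struct {
	FIDs              []uint64
	Smallest, Largest []byte
}

// planOverlap selects every table whose range intersects [lo, hi].
// It is pure over the snapshot, so planning needs no LSM locks.
func planOverlap(snap []TableMeta, lo, hi []byte) Plan {
	p := Plan{Smallest: lo, Largest: hi}
	for _, t := range snap {
		if bytes.Compare(t.Largest, lo) < 0 || bytes.Compare(t.Smallest, hi) > 0 {
			continue // disjoint from the requested range
		}
		p.FIDs = append(p.FIDs, t.FID)
	}
	return p
}

func main() {
	snap := []TableMeta{
		{FID: 1, Smallest: []byte("a"), Largest: []byte("f")},
		{FID: 2, Smallest: []byte("g"), Largest: []byte("m")},
		{FID: 3, Smallest: []byte("n"), Largest: []byte("z")},
	}
	p := planOverlap(snap, []byte("e"), []byte("h"))
	fmt.Println(p.FIDs) // [1 2]: only tables 1 and 2 overlap [e, h]
}
```

Planning over value snapshots is what lets the refactor keep the planner in `lsm` without widening lock scopes: stale plans are caught later, when IDs are resolved back to tables.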

docs/compaction.md

Lines changed: 5 additions & 5 deletions

@@ -6,20 +6,20 @@
 
 ## 1. Overview
 
-Compactions are orchestrated by `compact.Manager` with `lsm.levelManager` implementing the executor hooks. Each level owns two lists of tables:
+Compactions are orchestrated by `lsm.compaction` with `lsm.levelManager` supplying scheduling input and executing the plan. Each level owns two lists of tables:
 
 - `tables` – the canonical sorted run for the level.
 - `ingest` – a staging buffer that temporarily holds SSTables moved from the level above when there is not yet enough work (or bandwidth) to do a full merge.
 
-The compaction manager periodically calls into the executor to build a list of `compact.Priority` entries. The priorities consider three signals:
+The compaction runtime periodically calls into the picker to build a list of `Priority` entries. The priorities consider three signals:
 
 1. **L0 table count** – loosely capped by `Options.NumLevelZeroTables`.
 2. **Level size vs target** – computed by `levelTargets()`, which dynamically adjusts the “base” level depending on total data volume.
 3. **Ingest buffer backlog** – if a level’s `ingest` shards have data, they receive elevated scores so staged tables are merged promptly.
 
 The highest adjusted score is processed first. L0 compactions can either move tables into the ingest buffer of the base level (cheap re‑parenting) or compact directly into a lower level when the overlap warrants it.
 
-Planning now happens via `compact.Plan`: LSM snapshots table metadata into `compact.TableMeta`, `compact.PlanFor*` selects table IDs + key ranges, and LSM resolves the plan back to `*table` before executing.
+Planning now happens via `Plan`: LSM snapshots table metadata into `TableMeta`, `PlanFor*` selects table IDs + key ranges, and LSM resolves the plan back to `*table` before executing.
 
 ---
 
@@ -41,9 +41,9 @@ Compaction tests (`lsm/compaction_test.go`) assert that after calling `moveToIng
 
 To prevent overlapping compactions:
 
-- `compact.State.CompareAndAdd` tracks the key range of each in-flight compaction per level.
+- `State.CompareAndAdd` tracks the key range of each in-flight compaction per level.
 - Attempts to register a compaction whose ranges intersect an existing one are rejected.
-- When a compaction finishes, `compact.State.Delete` removes the ranges and table IDs from the guard.
+- When a compaction finishes, `State.Delete` removes the ranges and table IDs from the guard.
 
 This mechanism is intentionally simple—just a mutex‐protected slice—yet effective in tests (`TestCompactStatusGuards`) that simulate back‑to‑back registrations on the same key range.
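The "mutex-protected slice" guard that the doc describes can be sketched as below. Names and semantics mirror the documented `State.CompareAndAdd` / `State.Delete` behavior, but this is a simplified illustration, not the actual `lsm.State` implementation.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// keyRange is a closed key interval claimed by an in-flight compaction.
type keyRange struct{ left, right []byte }

// overlaps reports whether two closed intervals intersect.
func (a keyRange) overlaps(b keyRange) bool {
	return bytes.Compare(a.left, b.right) <= 0 && bytes.Compare(b.left, a.right) <= 0
}

// state guards per-level key ranges: register if disjoint from every
// in-flight range, reject on intersection, remove when the work finishes.
type state struct {
	mu     sync.Mutex
	levels map[int][]keyRange
}

func newState() *state { return &state{levels: make(map[int][]keyRange)} }

// compareAndAdd registers kr on level unless it intersects an existing range.
func (s *state) compareAndAdd(level int, kr keyRange) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, cur := range s.levels[level] {
		if cur.overlaps(kr) {
			return false // another compaction already owns this range
		}
	}
	s.levels[level] = append(s.levels[level], kr)
	return true
}

// delete releases kr when its compaction finishes.
func (s *state) delete(level int, kr keyRange) {
	s.mu.Lock()
	defer s.mu.Unlock()
	rs := s.levels[level]
	for i, cur := range rs {
		if bytes.Equal(cur.left, kr.left) && bytes.Equal(cur.right, kr.right) {
			s.levels[level] = append(rs[:i], rs[i+1:]...)
			return
		}
	}
}

func main() {
	s := newState()
	kr := keyRange{[]byte("a"), []byte("m")}
	fmt.Println(s.compareAndAdd(1, kr))                                 // true: range is free
	fmt.Println(s.compareAndAdd(1, keyRange{[]byte("k"), []byte("z")})) // false: overlaps [a, m]
	s.delete(1, kr)
	fmt.Println(s.compareAndAdd(1, keyRange{[]byte("k"), []byte("z")})) // true once released
}
```

The linear scan is deliberately simple: the number of concurrent compactions per level is small, so a slice under one mutex beats a fancier interval structure.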

docs/errors.md

Lines changed: 1 addition & 1 deletion

@@ -46,7 +46,7 @@ Examples:
 
 - `kv/entry_codec.go`: `ErrBadChecksum`, `ErrPartialEntry`
 - `vfs/vfs.go`: `ErrRenameNoReplaceUnsupported`
-- `lsm/compact/errors.go`: compaction planner/runtime domain errors
+- `lsm/compaction.go`: compaction planner/runtime domain errors
 - `raftstore/peer/errors.go`: peer lifecycle/state errors
 - `pb/errorpb.proto`: region/store routing protobuf errors (`RegionError`,
   `StoreNotMatch`, `RegionNotFound`, `KeyNotInRegion`, ...)

docs/ingest_buffer.md

Lines changed: 3 additions & 3 deletions

@@ -16,11 +16,11 @@ flowchart LR
 - **Sharded by key prefix**: ingest tables are routed into fixed shards (top bits of the first byte). Sharding cuts cross-range overlap and enables safe parallel drain.
 - **Snapshot-friendly reads**: ingest tables are read under the level `RLock`, and iterators hold table refs so mmap-backed data stays valid without additional snapshots.
 - **Two ingest paths**:
-  - *Ingest-only compaction*: drain ingest → main level (or next level) with optional multi-shard parallelism guarded by `compact.State`.
+  - *Ingest-only compaction*: drain ingest → main level (or next level) with optional multi-shard parallelism guarded by `State`.
   - *Ingest-merge*: compact ingest tables back into ingest (stay in-place) to drop superseded versions before promoting, reducing downstream write amplification.
 - **IngestMode enum**: plans carry an `IngestMode` with `IngestNone`, `IngestDrain`, and `IngestKeep`. `IngestDrain` corresponds to ingest-only (drain into main tables), while `IngestKeep` corresponds to ingest-merge (compact within ingest).
 - **Adaptive scheduling**:
-  - Shard selection is driven by `compact.PickShardOrder` / `compact.PickShardByBacklog` using per-shard size, age, and density.
+  - Shard selection is driven by `PickShardOrder` / `PickShardByBacklog` using per-shard size, age, and density.
   - Shard parallelism scales with backlog score (based on shard size/target file size) bounded by `IngestShardParallelism`.
   - Batch size scales with shard backlog to drain faster under pressure.
   - Ingest-merge triggers when backlog score exceeds `IngestBacklogMergeScore` (default 2.0), with dynamic lowering under extreme backlog/age.
@@ -33,7 +33,7 @@ flowchart LR
 
 ## Benefits
 - **Lower write amplification**: bursty L0 SSTables land in ingest first; `IngestKeep`/ingest-merge prunes duplicates before full compaction.
-- **Reduced contention**: sharding + `compact.State` allow parallel ingest drain with minimal overlap.
+- **Reduced contention**: sharding + `State` allow parallel ingest drain with minimal overlap.
 - **Predictable reads**: ingest is part of the read snapshot, so moving tables in/out does not change read semantics.
 - **Tunable and observable**: knobs for parallelism and merge aggressiveness, with per-path metrics to guide tuning.
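The backlog heuristics in the doc can be read as two small formulas: a score of shard size over target file size, and a worker count derived from that score capped by the parallelism knob. The sketch below is an illustrative interpretation of those knobs (the 64 MiB target and the helper names are assumptions, not the engine's code).

```go
package main

import "fmt"

// backlogScore mirrors the documented heuristic: shard size relative to the
// target file size. Scores above IngestBacklogMergeScore (default 2.0)
// would trigger ingest-merge.
func backlogScore(shardBytes, targetFileSize int64) float64 {
	if targetFileSize <= 0 {
		return 0
	}
	return float64(shardBytes) / float64(targetFileSize)
}

// drainParallelism scales worker count with backlog, bounded by an
// IngestShardParallelism-style cap and never dropping below one worker.
func drainParallelism(score float64, maxParallel int) int {
	p := int(score)
	if p < 1 {
		p = 1
	}
	if p > maxParallel {
		p = maxParallel
	}
	return p
}

func main() {
	const target = int64(64 << 20) // assumed 64 MiB target file size
	score := backlogScore(3*target, target)
	fmt.Println(score)                      // 3
	fmt.Println(drainParallelism(score, 4)) // 3 workers for this shard
	fmt.Println(score > 2.0)                // true: exceeds the default merge threshold
}
```

Under this reading, a shard three times the target size gets three drain workers (within the cap) and also qualifies for ingest-merge, which matches the doc's "drain faster under pressure" behavior.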

docs/notes/2026-01-16-hotring-design.md

Lines changed: 1 addition & 1 deletion

@@ -60,7 +60,7 @@ graph TD
   * **Write path**: `TouchAndClamp` is only called when throttling (`WriteHotKeyLimit`) or burst detection is enabled.
 2. **Compute**: HotRing computes real-time QPS internally with a sliding-window algorithm.
 3. **Feedback**
-  * **Compaction scoring**: when choosing a level to compact, `lsm/compact` consults `HotRing.TopN`. A level holding many hot keys is compacted first (Hot Overlap Score), reducing read amplification on hot data.
+  * **Compaction scoring**: when choosing a level to compact, `lsm/picker.go` consults `HotRing.TopN`. A level holding many hot keys is compacted first (Hot Overlap Score), reducing read amplification on hot data.
   * **Cache prefetch**: the DB layer triggers prefetch based on the TopN results. HotRing does not drive the cache directly, but the hot-key list it provides is a key input to the prefetch policy.
   * **Write throttling**: for keys written at excessive frequency, `TouchAndClamp` applies throttling protection.

docs/testing.md

Lines changed: 1 addition & 1 deletion

@@ -60,7 +60,7 @@ NOKV_RUN_BENCHMARKS=1 YCSB_RECORDS=10000 YCSB_OPS=50000 YCSB_WARM_OPS=0 \
 | Module | Tests | Coverage Highlights | Gaps / Next Steps |
 | --- | --- | --- | --- |
 | WAL | `wal/manager_test.go` | Segment rotation, sync semantics, replay tolerance for truncation, directory bootstrap. | Add IO fault injection, concurrent append stress. |
-| LSM / Flush / Compaction | `lsm/lsm_test.go`, `lsm/compaction_test.go`, `lsm/compact/*_test.go`, `lsm/flush_runtime_test.go` | Memtable correctness, iterator merging, flush pipeline metrics, compaction scheduling. | Extend backpressure assertions, test cache hot/cold split. |
+| LSM / Flush / Compaction | `lsm/lsm_test.go`, `lsm/picker_test.go`, `lsm/planner_test.go`, `lsm/compaction_test.go`, `lsm/flush_runtime_test.go` | Memtable correctness, iterator merging, flush pipeline metrics, compaction scheduling. | Extend backpressure assertions, test cache hot/cold split. |
 | Manifest | `manifest/manager_test.go`, `lsm/manifest_test.go` | CURRENT swap safety, rewrite crash handling, vlog metadata persistence. | Simulate partial edit corruption, column family extensions. |
 | ValueLog | `vlog/manager_test.go`, `vlog/io_test.go`, `vlog_test.go` | ValuePtr encoding/decoding, GC rewrite/rewind, concurrent iterator safety. | Long-running GC, discard-ratio edge cases. |
 | Percolator / Distributed Txn | `percolator/*_test.go`, `raftstore/client/client_test.go`, `stats_test.go` | Prewrite/Commit/ResolveLock flows, 2PC retries, timestamp-driven MVCC behaviour, metrics accounting. | Mixed multi-region fuzzing with lock TTL and leader churn. |

lsm/compact/errors.go

Lines changed: 0 additions & 6 deletions
This file was deleted.
