fix: attempt to close all resources in DB.Close() even on error#65
Conversation
📝 WalkthroughWalkthroughDB.Close was changed to always attempt closing LSM, VLog, WAL, and dir lock, recording call order and aggregating any returned errors. A test-only hook container ( Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test / Caller
participant DB as DB.Close()
participant LSM as LSM
participant VLOG as VLog
participant WAL as WAL
participant LOCK as DirLock
participant AGG as ErrorAggregator
Test->>DB: call Close()
DB->>LSM: close (or test hook)
alt error?
LSM-->>AGG: add "lsm" error
end
DB->>VLOG: close (or test hook)
alt error?
VLOG-->>AGG: add "vlog" error
end
DB->>WAL: close (or test hook)
alt error?
WAL-->>AGG: add "wal" error
end
DB->>LOCK: release (or test hook)
alt error?
LOCK-->>AGG: add "dir lock" error
end
AGG-->>DB: joined error (nil or aggregated)
DB-->>Test: return aggregated error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts (beta)
No actionable comments were generated in the recent review. 🎉 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
db.go (1)
353-355:⚠️ Potential issue | 🟠 Major
stats.close()still short-circuits, bypassing error aggregation for downstream resources.If
db.stats.close()fails, the function returns immediately (line 354), and lsm/vlog/wal/dirLock are never closed. This contradicts the PR objective of best-effort cleanup for all resources. Consider collecting this error intoerrsas well.🐛 Proposed fix
- if err := db.stats.close(); err != nil { - return err - } + if err := db.stats.close(); err != nil { + errs = append(errs, fmt.Errorf("stats close: %w", err)) + }Note: you'd need to move
var errs []errorabove this block (before line 353).
🤖 Fix all issues with AI agents
In `@db_test.go`:
- Around line 996-1025: The test TestCloseWithErrors replaces real closers with
testCloseHooks and never releases the real resources opened by Open(opt),
leaking descriptors and the directory lock; fix by capturing the real resource
handles before overriding testCloseHooks and ensuring they are closed/released
(e.g., save db.lsm, db.vlog, db.wal, db.dirLock into locals and defer/explicitly
call their Close/Release after the assert or before setting testCloseHooks) so
the real LSM/vlog/WAL/dirLock are always closed even while you assert the hook
behavior.
In `@db.go`:
- Around line 422-428: The Close() method currently returns early when errs is
non-empty and never sets db.isClosed, breaking idempotence; fix it by ensuring
atomic.StoreUint32(&db.isClosed, 1) is always executed before any return from
Close() (e.g. set it at the start of Close() when shutdown begins or use a defer
that sets it regardless of errors), then continue to aggregate and return
stderrors.Join(errs...) so callers still receive the error list; reference
Close(), the errs slice, IsClosed(), and atomic.StoreUint32(&db.isClosed, 1)
when making the change.
🧹 Nitpick comments (1)
db.go (1)
111-117: Test hooks struct adds test-only plumbing to production code.An alternative is to define an interface for the closeable components (or use function fields on DB set only during tests via an
_test.gofile usinginternalaccess). That said, the current approach is a common pragmatic pattern in Go — consider refactoring in a follow-up if the hook surface grows.
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
This pull request fixes a resource leak issue where DB.Close() would short-circuit on the first error, preventing subsequent cleanup operations from running. The fix introduces aggregated error handling using errors.Join to ensure all close operations are attempted even when individual operations fail.
Changes:
- Modified
DB.Close()to collect and aggregate errors from lsm, vlog, wal, and dirLock close operations instead of returning immediately on first error - Added
testCloseHooksstruct to inject close errors in tests without affecting real resource cleanup - Added
TestCloseWithErrorsto verify that all close operations are attempted and errors are properly aggregated
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| db.go | Implemented error aggregation in Close() method, added testCloseHooks for testing, and record() helper method |
| db_test.go | Added TestCloseWithErrors to verify all close operations execute even when they fail and errors are aggregated |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func TestCloseWithErrors(t *testing.T) { | ||
| clearDir() | ||
| db := Open(opt) | ||
| lsmErr := errors.New("lsm close error") | ||
| vlogErr := errors.New("vlog close error") | ||
| walErr := errors.New("wal close error") | ||
| dirLockErr := errors.New("dir lock release error") | ||
| db.testCloseHooks = &testCloseHooks{ | ||
| lsmClose: func() error { return lsmErr }, | ||
| vlogClose: func() error { return vlogErr }, | ||
| walClose: func() error { return walErr }, | ||
| dirLockRelease: func() error { return dirLockErr }, | ||
| calls: []string{}, | ||
| } | ||
| err := db.Close() |
There was a problem hiding this comment.
The test opens a real database with Open(opt) which initializes real lsm, vlog, wal, and dirLock resources, but then immediately replaces their close methods with test hooks that only return errors. This means the actual resources (file handles, locks, etc.) may not be properly cleaned up when the test ends. Consider either: (1) creating a mock/stub DB instance instead of using Open(), or (2) ensuring the test hooks still call the real close methods after recording the call, or (3) adding explicit cleanup in a defer statement.
| type testCloseHooks struct { | ||
| lsmClose func() error | ||
| vlogClose func() error | ||
| walClose func() error | ||
| dirLockRelease func() error | ||
| calls []string | ||
| } |
There was a problem hiding this comment.
The testCloseHooks struct lacks a documentation comment explaining its purpose and usage. Consider adding a comment like: "testCloseHooks allows tests to intercept and simulate errors during DB.Close() operations without affecting real resource cleanup. Only for testing purposes." This would help other developers understand when and how to use this testing facility.
| err := db.Close() | ||
|
|
||
| var gotErrs []error | ||
| if err != nil { | ||
| if multiErr, ok := err.(interface{ Unwrap() []error }); ok { | ||
| gotErrs = multiErr.Unwrap() | ||
| } | ||
| } | ||
| require.Len(t, gotErrs, 4) | ||
| require.True(t, errors.Is(gotErrs[0], lsmErr)) | ||
| require.True(t, errors.Is(gotErrs[1], vlogErr)) | ||
| require.True(t, errors.Is(gotErrs[2], walErr)) | ||
| require.True(t, errors.Is(gotErrs[3], dirLockErr)) | ||
| expectCalls := []string{"lsm close", "vlog close", "wal close", "dir lock release"} | ||
| require.Equal(t, expectCalls, db.testCloseHooks.calls) | ||
| } |
There was a problem hiding this comment.
The test doesn't verify the state of db.IsClosed() after Close() returns errors. Given the bug where isClosed is only set on success (line 427), the test should assert that either: (1) db.IsClosed() returns true even when Close() fails (if that's the intended behavior), or (2) db.IsClosed() returns false when Close() fails (if that matches the current buggy behavior). This would help document the expected behavior and catch regressions.
db.go
Outdated
| if len(errs) > 0 { | ||
| return stderrors.Join(errs...) | ||
| } | ||
|
|
||
| atomic.StoreUint32(&db.isClosed, 1) |
There was a problem hiding this comment.
The isClosed flag is only set to 1 when there are no errors in the aggregated close operations. This means if any of lsm, vlog, wal, or dirLock close operations fail, the database will remain in a "not closed" state (isClosed will still be 0). This could lead to issues if Close() is called again or if other code checks IsClosed() to determine whether operations can proceed. The isClosed flag should be set regardless of whether errors occurred during cleanup, since the close attempt has been made and the database should not be used further.
| if len(errs) > 0 { | |
| return stderrors.Join(errs...) | |
| } | |
| atomic.StoreUint32(&db.isClosed, 1) | |
| atomic.StoreUint32(&db.isClosed, 1) | |
| if len(errs) > 0 { | |
| return stderrors.Join(errs...) | |
| } |
| var gotErrs []error | ||
| if err != nil { | ||
| if multiErr, ok := err.(interface{ Unwrap() []error }); ok { | ||
| gotErrs = multiErr.Unwrap() | ||
| } | ||
| } |
There was a problem hiding this comment.
The error checking logic has a flaw: if err is nil or if it doesn't implement the Unwrap() []error interface, gotErrs will be empty/nil but the test will still try to access gotErrs[0] through gotErrs[3] in the assertions below. This could lead to index out of bounds panics. Consider adding a check like 'require.NotNil(t, err)' before the type assertion, or failing the test if gotErrs is not populated as expected.
| type testCloseHooks struct { | ||
| lsmClose func() error | ||
| vlogClose func() error | ||
| walClose func() error | ||
| dirLockRelease func() error | ||
| calls []string | ||
| } |
There was a problem hiding this comment.
The testCloseHooks struct is not thread-safe. The 'calls' slice is accessed by the record() method without any synchronization. If Close() were ever called concurrently (though it shouldn't be in typical usage), this could lead to race conditions. Consider adding a mutex to protect the calls slice, or document that testCloseHooks is not thread-safe and should only be used in single-threaded test scenarios.
| if db.testCloseHooks != nil && db.testCloseHooks.lsmClose != nil { | ||
| db.testCloseHooks.record("lsm close") | ||
| if err := db.testCloseHooks.lsmClose(); err != nil { | ||
| errs = append(errs, fmt.Errorf("lsm close: %w", err)) | ||
| } | ||
| } else if err := db.lsm.Close(); err != nil { | ||
| errs = append(errs, fmt.Errorf("lsm close: %w", err)) | ||
| } | ||
| if err := db.vlog.close(); err != nil { | ||
| return err | ||
|
|
||
| if db.testCloseHooks != nil && db.testCloseHooks.vlogClose != nil { | ||
| db.testCloseHooks.record("vlog close") | ||
| if err := db.testCloseHooks.vlogClose(); err != nil { | ||
| errs = append(errs, fmt.Errorf("vlog close: %w", err)) | ||
| } | ||
| } else if err := db.vlog.close(); err != nil { | ||
| errs = append(errs, fmt.Errorf("vlog close: %w", err)) | ||
| } | ||
| if err := db.wal.Close(); err != nil { | ||
| return err | ||
|
|
||
| if db.testCloseHooks != nil && db.testCloseHooks.walClose != nil { | ||
| db.testCloseHooks.record("wal close") | ||
| if err := db.testCloseHooks.walClose(); err != nil { | ||
| errs = append(errs, fmt.Errorf("wal close: %w", err)) | ||
| } | ||
| } else if err := db.wal.Close(); err != nil { | ||
| errs = append(errs, fmt.Errorf("wal close: %w", err)) | ||
| } | ||
|
|
||
| if db.dirLock != nil { | ||
| if err := db.dirLock.Release(); err != nil { | ||
| return err | ||
| if db.testCloseHooks != nil && db.testCloseHooks.dirLockRelease != nil { | ||
| db.testCloseHooks.record("dir lock release") | ||
| if err := db.testCloseHooks.dirLockRelease(); err != nil { | ||
| errs = append(errs, fmt.Errorf("dir lock release: %w", err)) | ||
| } | ||
| } else if err := db.dirLock.Release(); err != nil { | ||
| errs = append(errs, fmt.Errorf("dir lock release: %w", err)) | ||
| } | ||
| db.dirLock = nil | ||
| } |
There was a problem hiding this comment.
The close hook implementation is verbose and repetitive. Each close operation requires 8-9 lines of nearly identical if-else logic. Consider refactoring to a helper method that handles the hook/real operation pattern, which would reduce code duplication and make the logic easier to maintain. For example: a method like 'closeWithHook(name string, hookFn func() error, realFn func() error) error' that returns the error to append.
|
@feichai0017 I've updated the code based on your suggestions. Please review again—thanks! |
|
LGTM,Thanks |
Previously, DB.Close() would return immediately on the first error (e.g. from lsm.Close()),
skipping cleanup of vlog, WAL, and dirLock. This could lead to resource leaks or stale locks.
Now, all close operations are attempted regardless of individual errors,
and any errors are aggregated using errors.Join.
Add test to verify that other components are closed even when one fails.
Fix #55
Summary by CodeRabbit
Bug Fixes
Tests