Skip to content

feat(memory): add cross-process locking and atomic writes for sqlite#3114

Open
yunasora wants to merge 3 commits into
docker:mainfrom
yunasora:feat/memory-atomic-writes
Open

feat(memory): add cross-process locking and atomic writes for sqlite#3114
yunasora wants to merge 3 commits into
docker:mainfrom
yunasora:feat/memory-atomic-writes

Conversation

@yunasora

Copy link
Copy Markdown

Fixes #3022

Implementation Details & Context

During the implementation, I aligned the changes with the existing codebase architecture and made a few adjustments based on the repository's current state:

  1. Database Configuration & Connection Pool:

    • The requested parameters (PRAGMA journal_mode=WAL, busy_timeout=5000, and SetMaxOpenConns(1)) are already centrally configured in sqliteutil/sqlite.go (which corresponds to the pkg/memory/database/sqlite/db.go mentioned in the checklist).
    • The memory SQLite DB utilizes this central entry point, so the concurrent state is natively guaranteed by the connection pool settings.
    • Separate read pools were not explicitly introduced because sqliteutil.OpenDB already enforces serialized writes via SetMaxOpenConns(1) while leaving read paths unlocked. WAL seamlessly handles the read/write concurrency.
  2. Cross-Platform File Lock:

    • Wrapped the entire write path (add/update/delete) inside the FileLock boundary across Linux, macOS, and Windows.
  3. Drift Guard Note:

    • Since the drift guard / generation mechanism is not yet present in the current codebase, the specific test case "drift guard fires correctly under contention" cannot be fully validated in isolation at this moment. However, because the write paths are now fully guarded by the FileLock boundary, the upcoming drift guard feature can be safely integrated into this perimeter later.

All cross-platform compilation checks and go test -race for memory/database modules have passed. Ready for review!

@yunasora yunasora requested a review from a team as a code owner June 15, 2026 02:14
@aheritier aheritier added area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) kind/feat PR adds a new feature (maps to feat:). Use on PRs only. area/agent For work that has to do with the general agent loop/agentic features of the app labels Jun 15, 2026
@aheritier aheritier added the area/runtime Runtime engine, agent loop execution, tool dispatch, loop detection label Jun 18, 2026
@aheritier aheritier marked this pull request as draft June 22, 2026 13:04
@aheritier aheritier added the status/needs-rebase PR has merge conflicts or is out of date with main label Jun 22, 2026
@yunasora yunasora force-pushed the feat/memory-atomic-writes branch from 5143d6a to 9a04a21 Compare June 24, 2026 15:26
@yunasora yunasora marked this pull request as ready for review June 24, 2026 16:29
@aheritier aheritier removed the status/needs-rebase PR has merge conflicts or is out of date with main label Jun 24, 2026
@aheritier aheritier marked this pull request as draft June 30, 2026 22:03
@aheritier aheritier added the status/needs-rebase PR has merge conflicts or is out of date with main label Jun 30, 2026
@aheritier

Copy link
Copy Markdown
Contributor

👋 This PR has merge conflicts with the base branch. Please rebase or merge the latest base branch and resolve them. I've moved it to draft and added status/needs-rebase; it'll be picked back up automatically once the conflicts are cleared.

@yunasora yunasora force-pushed the feat/memory-atomic-writes branch from 9a04a21 to 6d98b1c Compare July 2, 2026 08:15
@yunasora yunasora marked this pull request as ready for review July 2, 2026 08:17
@aheritier aheritier removed the status/needs-rebase PR has merge conflicts or is out of date with main label Jul 2, 2026

@Sayt-0 Sayt-0 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of the write-serialization work. The core locking is sound: go build, go vet, golangci-lint (0 issues) and go test -race ./pkg/memory/database/... all pass locally on darwin/arm64, and the fcntl close-drops-locks footgun is correctly avoided by acquiring the in-process processLock before opening the fd. A few items should be addressed before merge.

Area Finding Severity
js/wasm backup.go imports sqliteutil -> modernc.org/sqlite, which has no js/wasm build, so GOOS=js GOARCH=wasm go build ./pkg/memory/database/ fails on this branch (it succeeds on main). This also makes the newly added lock_js.go stub unreachable. high
Consistency The new lock uses fcntl record locks, diverging from the flock-based pkg/cache pattern the issue asks to follow, and duplicates that logic instead of sharing it. medium
Scope ExportSnapshot has no non-test caller and its consumer (the drift guard) is not in the tree yet. medium
Overlap WAL, busy_timeout=5000 and SetMaxOpenConns(1) are already set centrally in sqliteutil.OpenDB; the new WAL/busy-timeout test asserts pre-existing behavior. info

Note on impact: the shipped wasm binary (cmd/wasm) does not import pkg/memory/database, so the artifact itself is not broken, but the description's claim that all cross-platform compilation checks pass does not hold for js/wasm. Inline comments follow.

Comment thread pkg/memory/database/backup.go Outdated

atomic "github.com/natefinch/atomic"

"github.com/docker/docker-agent/pkg/sqliteutil"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import pulls modernc.org/sqlite into the database package, which has no js/wasm build. On main, GOOS=js GOARCH=wasm go build ./pkg/memory/database/ succeeds; on this branch it fails. As a side effect the newly added lock_js.go stub becomes unreachable, since the package it lives in can no longer compile for js/wasm. Suggested fix: either drop lock_js.go (it serves no purpose once the package cannot target js/wasm), or move the sqlite-dependent snapshot code behind a non-js build tag so the package builds for js/wasm again.


import "os"

func lockFileExclusive(_ *os.File) error {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This js/wasm stub is currently dead code: backup.go in the same package imports sqliteutil -> modernc.org/sqlite, which does not build for js/wasm, so the package cannot compile for this target regardless of these no-ops. Keep this file only if the sqlite dependency is gated out of js/wasm (see comment on backup.go); otherwise remove it.


// lockFileExclusive attempts to acquire an exclusive advisory lock without
// blocking. The retry loop in FileLock.Lock handles waiting and cancellation.
func lockFileExclusive(f *os.File) error {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linked issue asks to follow the existing pkg/cache/ lock pattern, which uses flock(2) (unix.Flock, tied to the open file description). This introduces a second, different mechanism: fcntl POSIX record locks (unix.FcntlFlock). fcntl locks carry the known footgun that closing any fd on the file drops all of the process's locks on it. It is correctly mitigated here by taking processLock before opening the fd, but a flock with LOCK_NB plus the existing retry loop would match pkg/cache, avoid the footgun, and allow the two implementations to be shared rather than duplicated. Please justify the divergence or align with the existing pattern.

Comment thread pkg/memory/database/backup.go Outdated
// renamed into place, so readers of finalPath see either the previous snapshot
// or the complete new snapshot. The source memory DB lock is held while the
// snapshot is created to serialize it with memory writes.
func ExportSnapshot(ctx context.Context, dbPath, finalPath string) error {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExportSnapshot has no non-test caller, and its stated purpose (the drift-guard .bak export) is not present in the tree yet (no drift/generation code under pkg/memory). This ships a public API plus ~100 lines of tests for a consumer that does not exist. Consider landing it together with the drift guard, or confirm it is intentionally pre-merged as scaffolding.

require.Error(t, err, "UpdateMemory should fail with canceled context")
}

func TestMemoryDatabaseUsesWALAndBusyTimeout(t *testing.T) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: this asserts WAL and busy_timeout=5000, which are configured centrally by sqliteutil.OpenDB and already hold on main. The test guards pre-existing behavior rather than anything introduced by this PR. Fine to keep as a regression guard, but it is not evidence of the new locking work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/agent For work that has to do with the general agent loop/agentic features of the app area/runtime Runtime engine, agent loop execution, tool dispatch, loop detection area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) kind/feat PR adds a new feature (maps to feat:). Use on PRs only.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(memory): Atomic writes and cross-process locking for memory DB access

3 participants