fix(store): preserve corrupt DB instead of deleting (#557) by yasku · Pull Request #620 · DeusData/codebase-memory-mcp

yasku · 2026-06-25T01:31:29Z

What does this PR do?

Fixes #557. The startup/query integrity check unlink()'d a project's .db (plus its -wal/-shm sidecars) on any anomaly, causing total data loss with no backup, no recovery path, and no visible warning. The trigger is frequently a false positive: a WAL leftover after a SIGKILL, a binary variant swap (standard ↔ ui), or a stale-lock break can all make a recoverable DB look corrupt.

This replaces the destructive unlink with a rename to <name>.db.corrupt.<timestamp> and emits a prominent stderr message, matching the issue's suggested fix. The user can now inspect, recover, or attach the file to a bug report.

Two call sites are fixed:

src/mcp/mcp.c resolve_store() — the reported path, hit on the first query against a project.
src/pipeline/pipeline.c try_incremental_or_delete_db() — only preserves on real corruption; a healthy DB rebuilt due to a mode change is still deleted, so .corrupt files don't accumulate on normal reindexes.

If rename fails (permissions, cross-device), it falls back to unlink so no orphan files are left behind.

Checklist

Every commit is signed off (git commit -s) — DCO verified via scripts/check-dco.sh
Tests pass locally (make -f Makefile.cbm test) — 5684 passed
Lint passes for changed files (clang-format clean; cppcheck clean on the three modified files)
New behavior is covered by a test (reproduce-first): resolve_store_preserves_corrupt_db_issue557 seeds a corrupt projects row, issues a query, and asserts the original .db is gone while a .corrupt sidecar remains.

Notes on the lint checklist item (transparency)

A few constraints worth making explicit so the maintainer isn't surprised:

make lint-ci does not pass cleanly locally, but only on pre-existing code I did not touch. My local cppcheck is 2.21.0, which is newer/stricter than CI's and flags pre-existing findings in src/cypher/cypher.c and src/cli/cli.c (e.g. knownConditionTrueFalse, compareValueOutOfTypeRangeError). Running cppcheck scoped to just my three changed files exits clean (0). I deliberately did not touch those files to keep this PR focused (per CONTRIBUTING's "avoid unrelated reformatting").
clang-format version skew. CI uses clang-format-20; I only have clang-format-22 locally. Both modified source files (mcp.c, pipeline.c) are whole-file clean under v22, and since the surrounding pre-existing code shows no diff under v22 either, v20 and v22 appear to agree on this file's style. If CI's v20 disagrees on any of my lines, I'm happy to reformat.
clang-tidy / magic numbers in the test. The new test uses raw strncmp(..., 19) / strncmp(..., 10) literals, matching the existing style in tests/ (e.g. test_cli.c, test_httpd.c, test_pipeline.c). This is fine because test sources are not in LINT_SRCS, and CI's lint step runs cppcheck + clang-format only (clang-tidy is enforced locally, not in CI).
Labels. I don't have triage/write access to this repo, so I couldn't apply any labels (e.g. bug, stability/performance) to match cbm v0.8.1 silently deletes project DBs on "corrupt" detection — data loss with no recovery #557. Please add whatever is appropriate.

The startup/query integrity check unlink()'d a project's .db (plus its -wal/-shm sidecars) on any anomaly, causing total data loss with no backup, no recovery path, and no visible warning. The trigger is often a false positive: a WAL leftover after a SIGKILL, a binary variant swap, or a stale-lock break can all make a recoverable DB look corrupt. Replace the destructive unlink with a rename to <name>.db.corrupt.<ts> and emit a prominent stderr message so the user can inspect, recover, or attach the file to a bug report. Two call sites are fixed: - mcp.c resolve_store(): the reported path, hit on first query. - pipeline.c try_incremental_or_delete_db(): only preserves on real corruption; a healthy DB rebuilt due to a mode change is still deleted so .corrupt files don't accumulate on normal reindexes. Add a regression test (resolve_store_preserves_corrupt_db_issue557) that seeds a corrupt projects row, issues a query, and asserts the original .db is gone while a .corrupt sidecar remains. Signed-off-by: yasku <aguss.cba@gmail.com>

yasku · 2026-06-25T02:15:14Z

Working on fixing this ☺️

CodeQL flagged a time-of-check/time-of-use race in try_incremental_or_delete_db(): the file at db_path was probed with stat() and then operated on by name (rename/unlink), so the underlying file could change between the two. Drop the redundant stat() existence probe and derive existence from a query-mode open (cbm_store_open_path_query, which never creates the file): a NULL result means absent/unreadable, so we return without touching the path. This removes the stat()->rename() dataflow CodeQL reports and matches resolve_store() in mcp.c, which already uses this pattern. Behavior is unchanged (absent -> skip, healthy -> reindex, corrupt -> preserve as .corrupt). Signed-off-by: yasku <aguss.cba@gmail.com>

jonlimx mentioned this pull request Jun 25, 2026

cbm v0.8.1 silently deletes project DBs on "corrupt" detection — data loss with no recovery #557

Open

github-advanced-security AI found potential problems Jun 25, 2026

View reviewed changes

Comment thread src/pipeline/pipeline.c Fixed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(store): preserve corrupt DB instead of deleting (#557)#620

fix(store): preserve corrupt DB instead of deleting (#557)#620
yasku wants to merge 2 commits into
DeusData:mainfrom
yasku:fix/557-preserve-corrupt-db

yasku commented Jun 25, 2026 •

edited

Loading

Uh oh!

Uh oh!

yasku commented Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

yasku commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Checklist

Notes on the lint checklist item (transparency)

Uh oh!

Uh oh!

yasku commented Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

yasku commented Jun 25, 2026 •

edited

Loading