perf(SDK-6463): prune excluded dirs from zip/md5 walks — minutes-long 'Creating tests.zip' stall on monorepos #1142

Open
Bhargavi-BS wants to merge 1 commit into master from sdk-6463-zip-md5-dir-pruning

SDK-6463 (follow-up) — minutes-long stall creating tests.zip on large monorepos

Customer (Sapiens, NX monorepo, ~120 exclude globs, home_directory: "./") reported very slow test-setup upload. Their log shows the time is NOT the upload (3s, small zip) — it's spent before it:

17:06:58 info: Creating tests.zip with files in ./          <- start
17:08:24 info: Uploading the tests to BrowserStack          <- ~86s later

plus a similar hidden cost earlier in the spec-md5 cache check.

Root cause (verified)

Three full-tree filesystem walks of the monorepo root, none of which prune excluded directories:

  1. Zip walk: archiver.glob(pattern, { ignore }) (bin/helpers/archiver.js). archiver delegates to readdir-glob, whose ignore option filters entries only after the walker has already readdir+lstat-ed them. It therefore descends into node_modules, .git, dist, … and visits every file just to discard it. readdir-glob's separate skip option is the pruning mechanism, and the CLI never used it.
  2. Spec-md5 walk: checkSpecsMd5 → hashUtil uses readdir-glob the same way, a second identical full-tree walk.
  3. Telemetry walk: utils.fetchFolderSize(node_modules) (runs.js) recursively stats the entire node_modules tree, blocking between archive and upload, only to report a folder size.
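
To illustrate the difference between filtering and pruning, here is a minimal sketch using only Node's standard library. The `walk` helper and its `ignore`/`skip` predicates are hypothetical stand-ins that model the behaviour described above. They are not readdir-glob's implementation, and the throwaway tree is made up for the demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical walker (not readdir-glob): counts every directory entry it touches.
// `ignore` only filters files out of the result; `skip` prevents descent.
function walk(root, { ignore = () => false, skip = () => false } = {}) {
  const files = [];
  let visited = 0;
  const stack = ['']; // relative directory paths still to read
  while (stack.length > 0) {
    const rel = stack.pop();
    const entries = fs.readdirSync(path.join(root, rel), { withFileTypes: true });
    for (const entry of entries) {
      const relPath = rel ? `${rel}/${entry.name}` : entry.name;
      visited += 1;
      if (entry.isDirectory()) {
        if (!skip(relPath)) stack.push(relPath); // pruned dirs are never read
      } else if (!ignore(relPath)) {
        files.push(relPath);
      }
    }
  }
  return { files: files.sort(), visited };
}

// Build a tiny throwaway tree: one spec file plus 50 fake packages.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'walk-demo-'));
fs.mkdirSync(path.join(root, 'cypress'));
fs.writeFileSync(path.join(root, 'cypress', 'a.spec.js'), '');
for (let i = 0; i < 50; i += 1) {
  const dir = path.join(root, 'node_modules', `pkg${i}`);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.js'), '');
}

// Models the glob 'node_modules/**': matches everything below node_modules.
const underNodeModules = (p) => p.startsWith('node_modules/');

const filtered = walk(root, { ignore: underNodeModules });
const pruned = walk(root, { ignore: underNodeModules, skip: underNodeModules });
fs.rmSync(root, { recursive: true, force: true });

console.log('ignore only :', filtered.visited, 'entries visited');
console.log('ignore+skip :', pruned.visited, 'entries visited');
console.log('same files  :', JSON.stringify(filtered.files) === JSON.stringify(pruned.files));
```

Both calls return the same file list. The pruned walk still reads node_modules once but never enters the package directories below it.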

On the customer's Windows machine (per-file stat + AV scanning) with a real NX monorepo, these walks add up to minutes per run.

Fix

  • utils.getDirectorySkipPatterns(ignoreFiles) — reuse every ignore pattern ending in /** as a readdir-glob skip pattern; wired into both the archive glob and the md5 walk.
    Safety proof: skip prunes a directory whose relative path matches the pattern. A directory matching <x>/** is inside <x>, so every one of its descendants also matches <x>/** and was going to be ignored anyway — pruning cannot change which files are included. (Pruning engages one level below each excluded root: e.g. node_modules/** skips every child of node_modules, paying a single readdir of node_modules itself.)
  • fetchFolderSize — cooperative 5s deadline; folder size is best-effort telemetry and must never stall a run.
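
The two changes above could look roughly like the sketch below. It is not the code in this PR. The body of `getDirectorySkipPatterns` is an assumption based only on the "ends in /**" rule stated above. `folderSizeWithDeadline` is a hypothetical name for a deadline-bounded size walk, and its return shape is also assumed.

```javascript
const fs = require('fs');
const path = require('path');

// Assumed implementation: keep only ignore globs of the form '<x>/**',
// since those are the ones that can double as directory skip patterns.
function getDirectorySkipPatterns(ignoreFiles) {
  if (!Array.isArray(ignoreFiles)) return [];
  return ignoreFiles.filter((p) => typeof p === 'string' && p.endsWith('/**'));
}

// Hypothetical deadline-bounded size walk. The deadline is cooperative:
// it is checked once per directory, so one huge flat directory can still
// overrun it slightly. On timeout it returns the partial total.
async function folderSizeWithDeadline(dir, deadlineMs = 5000) {
  const stopAt = Date.now() + deadlineMs;
  const pending = [dir];
  let bytes = 0;
  while (pending.length > 0) {
    if (Date.now() >= stopAt) return { bytes, complete: false };
    const current = pending.pop();
    let entries;
    try {
      entries = await fs.promises.readdir(current, { withFileTypes: true });
    } catch (err) {
      continue; // unreadable or vanished directory: telemetry is best-effort
    }
    for (const entry of entries) {
      const full = path.join(current, entry.name);
      if (entry.isDirectory()) {
        pending.push(full);
      } else {
        try {
          bytes += (await fs.promises.lstat(full)).size;
        } catch (err) {
          // file disappeared mid-walk: ignore it
        }
      }
    }
  }
  return { bytes, complete: true };
}

console.log(getDirectorySkipPatterns(['node_modules/**', '*.log', 'dist/**', 'a/**/b']));
// prints [ 'node_modules/**', 'dist/**' ]
```

Patterns such as `*.log` or `a/**/b` are left to the ignore option only, because a directory that matches them does not guarantee that all of its descendants are excluded.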

Verification (real CLI code, synthetic 144k-file monorepo, customer's exact 130 exclude patterns)

| Walk | Before | After | Output |
| --- | --- | --- | --- |
| archiveSpecs (tests.zip) | 7.5s | 0.1s | zip entry lists byte-identical |
| checkSpecsMd5 | 5.5s | 0.0s | md5 unchanged (aeceb90f…) → upload-cache keys unaffected, no forced re-uploads |

Direct readdir-glob measurement on the same tree: no-skip walk 3.04s / 140,883 entries visited vs skip 0.08s / 2,083 — the excluded trees are simply never entered.

Unit tests added for getDirectorySkipPatterns; suite: 675→677 passing, 0 new failures (16 pre-existing on master).

Scope

Separate from PR #1131 (config/a11y) and #1139 (a11y fail-guard): this addresses the customer's "test setup zip upload taking a lot of time" report independently.

🤖 Generated with Claude Code

…emetry size walk (SDK-6463)

On large monorepos the CLI stalled for minutes before uploading:
1. archiver.glob and the spec-md5 walk (hashUtil via checkUploaded) pass excludes as readdir-glob 'ignore', which filters entries only AFTER the walker has descended into and lstat'ed every file under node_modules/.git/dist/etc. The customer's log showed ~86s in 'Creating tests.zip' and a similar hidden cost in the md5 step.
2. utils.fetchFolderSize(node_modules) — telemetry only — recursively stats the entire node_modules between archiving and uploading.

Fix: reuse every ignore pattern ending in '/**' as readdir-glob's 'skip' option (getDirectorySkipPatterns), which prevents descending into matching directories. This is provably safe: a directory matching '<x>/**' means every descendant also matches, so pruning cannot change the archive contents or the md5. Verified on a synthetic 144k-file monorepo with the customer's exact 130 exclude patterns: zip 7.5s -> 0.1s and md5 5.5s -> 0.0s, with byte-identical zip entry lists and an UNCHANGED md5 (upload cache keys unaffected). Also adds a 5s cooperative deadline to the folder-size telemetry walk so it can never stall a run.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@Bhargavi-BS Bhargavi-BS requested a review from a team as a code owner July 5, 2026 20:50
@Bhargavi-BS Bhargavi-BS requested review from osho-20 and rahulpsq July 5, 2026 20:50