Perf: skip redundant full-image mask on StdShiftIntensity nonzero=False path #8975

Open
aymuos15 wants to merge 1 commit into Project-MONAI:dev from aymuos15:perf/stdshift-nonzero

aymuos15 (Contributor) commented Jul 2, 2026

Fixes #8972.

Description

On the default nonzero=False path, StdShiftIntensity._stdshift built an all-True boolean mask the size of the image and shifted through it (img[slices] = img[slices] + offset). This forced a full advanced-index gather and scatter, plus the mask allocation and an .any() scan, even though every voxel is selected. Because that is equivalent to shifting the image directly, the nonzero=False branch now computes img + factor * std(img). Output is bit-for-bit identical, and the nonzero=True path is untouched. RandStdShiftIntensity benefits too, since it also defaults to nonzero=False.

Measured across 2D/3D, single- and multi-channel, float32 and float64 inputs on both backends (best-of-3, CPU); output was verified equal in every configuration:

| shape | dtype | backend | current (ms) | proposed (ms) | speedup |
|---|---|---|---:|---:|---:|
| 1x256x256 | f32 | numpy | 0.123 | 0.048 | 2.58x |
| 1x256x256 | f32 | torch | 0.563 | 0.039 | 14.47x |
| 1x1024x1024 | f32 | numpy | 3.016 | 1.294 | 2.33x |
| 1x1024x1024 | f32 | torch | 7.040 | 0.274 | 25.66x |
| 1x1024x1024 | f64 | torch | 9.644 | 0.601 | 16.04x |
| 1x64x64x64 | f32 | numpy | 0.637 | 0.251 | 2.54x |
| 1x64x64x64 | f64 | torch | 2.646 | 0.095 | 27.84x |
| 1x128x128x128 | f32 | numpy | 7.019 | 2.914 | 2.41x |
| 1x128x128x128 | f32 | torch | 41.618 | 0.630 | 66.09x |
| 1x128x128x128 | f64 | torch | 44.534 | 1.825 | 24.40x |
| 4x96x96x96 | f32 | torch | 68.152 | 1.357 | 50.23x |
| 4x160x160x160 | f32 | numpy | 128.429 | 82.662 | 1.55x |
| 4x160x160x160 | f32 | torch | 342.641 | 16.727 | 20.48x |
| 4x160x160x160 | f64 | torch | 354.009 | 41.652 | 8.50x |

Across the full sweep, NumPy speedups range from 1.55x to 2.65x and torch speedups from 8.5x to 66x. The largest gains are on torch, where all-True boolean-mask indexing is especially costly.
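For anyone who wants to reproduce the comparison locally, here is a minimal NumPy-only sketch of the two patterns described above. The helper names (shift_masked, shift_direct, best_of) are hypothetical and are not taken from the MONAI source, and this is not the benchmark script used for the table. It compares outputs with np.allclose rather than asserting bit-equality, and absolute timings will differ by machine.

```python
import timeit

import numpy as np


def shift_masked(img: np.ndarray, factor: float) -> np.ndarray:
    """Shift through an all-True boolean mask (the pattern this PR removes)."""
    out = img.copy()  # copy so the caller's array is not modified in this sketch
    mask = np.ones(out.shape, dtype=bool)  # selects every voxel
    if mask.any():
        offset = factor * np.std(out[mask])  # boolean-mask gather
        out[mask] = out[mask] + offset  # second gather, then scatter
    return out


def shift_direct(img: np.ndarray, factor: float) -> np.ndarray:
    """Shift the whole image at once (the pattern this PR adopts)."""
    return img + factor * np.std(img)


def best_of(fn, img, factor, repeat=3):
    """Return the fastest of `repeat` single-call timings, in milliseconds."""
    return min(timeit.repeat(lambda: fn(img, factor), repeat=repeat, number=1)) * 1e3


rng = np.random.default_rng(0)
img = rng.random((1, 256, 256), dtype=np.float32)

same = np.allclose(shift_masked(img, 0.5), shift_direct(img, 0.5))
print(f"outputs match: {same}")
print(f"masked: {best_of(shift_masked, img, 0.5):.3f} ms")
print(f"direct: {best_of(shift_direct, img, 0.5):.3f} ms")
```

Note that shift_masked pays for an extra copy here so that both helpers leave their input untouched; that slightly inflates the masked timing relative to an in-place version.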

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).

coderabbitai Bot (Contributor) commented Jul 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e9dd9305-ba98-4634-b9a3-0e985fcb4e08

📥 Commits

Reviewing files that changed from the base of the PR and between f72c648 and 2e4fafe.

📒 Files selected for processing (1)
  • monai/transforms/intensity/array.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • monai/transforms/intensity/array.py

📝 Walkthrough

StdShiftIntensity._stdshift now computes std(img) directly on the default nonzero=False path and returns img + factor * std(img). The nonzero=True path still builds slices = img != 0, checks for any nonzero elements, and applies the shift to img[slices] in place.
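The two branches in the walkthrough can be sketched as a standalone function. This is an illustration under assumptions, not the code in monai/transforms/intensity/array.py: the function name stdshift and its signature are hypothetical, and it covers NumPy arrays only.

```python
import numpy as np


def stdshift(img: np.ndarray, factor: float, nonzero: bool = False) -> np.ndarray:
    """Hypothetical standalone version of the two paths described above."""
    if not nonzero:
        # Every voxel is selected, so no mask is needed: shift the whole image.
        return img + factor * np.std(img)
    # Masked path: only nonzero voxels contribute to the std and receive the shift.
    mask = img != 0
    if mask.any():
        img[mask] = img[mask] + factor * np.std(img[mask])
    return img


image = np.array([[0.0, 1.0, 3.0]])
print(stdshift(image.copy(), factor=2.0, nonzero=True))  # zeros stay at zero
print(stdshift(image.copy(), factor=2.0, nonzero=False))  # all voxels shift
```

In this sketch the default path returns a new array while the masked path writes into its argument, which is why the usage lines pass a copy. Whether the real method receives a fresh tensor from its caller is not shown in this thread.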

| File | Change Summary |
|---|---|
| monai/transforms/intensity/array.py | Simplified _stdshift for the default full-image path; kept the masked behavior for nonzero=True. |

Estimated code review effort: 1 (Trivial) | ~3 minutes

Related issues: #8972

Suggested labels: performance, transforms

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the performance optimization to StdShiftIntensity on the nonzero=False path. |
| Description check | ✅ Passed | The description matches the template well, includes the fix, a change summary, and the non-breaking change type. |
| Linked Issues check | ✅ Passed | The code change matches #8972 by removing the redundant full-image mask path while leaving nonzero=True behavior intact. |
| Out of Scope Changes check | ✅ Passed | The PR appears scoped to the requested StdShiftIntensity optimization and adds no unrelated changes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@aymuos15 aymuos15 force-pushed the perf/stdshift-nonzero branch from a9f6888 to f72c648 Compare July 2, 2026 21:15
On the default nonzero=False path the boolean mask is all-True, so
img[slices] just gathers and scatters the whole image. Shift the image
directly instead, avoiding the mask allocation, .any(), gather and
scatter. Output is unchanged.

Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
@aymuos15 aymuos15 force-pushed the perf/stdshift-nonzero branch from f72c648 to 2e4fafe Compare July 2, 2026 21:41

Development

Successfully merging this pull request may close these issues.

StdShiftIntensity does redundant full-image mask + gather + scatter on the default nonzero=False path