[ExecuTorch][WebGPU] Dynamic resize hooks for add and mul #20577
JulianCloudNTH wants to merge 4 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20577
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit 8ea5c50 with merge base db5f1ba:
NEW FAILURE - The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@claude review and check for any areas or opportunities for modularization
Claude finished @JulianCloudNTH's task in 2m 16s
Review: Dynamic resize hooks for
Stack from ghstack (oldest at bottom):
Make the elementwise add and mul ops serve any live shape from one graph.
Problem:
`aten.add.Tensor` and `aten.mul.Tensor` baked their element count + param UBO(s) + output shape at `build()` for the max shape. On a dynamic-shape graph at a smaller live shape they would over-dispatch and leave the output sized at the max.

Solution:
… `workgroup_count_x`, and sets the output `cur_dims`. Inert until an operand is resized.

Implementation:
- `add`: out follows the larger operand (robust when one input is a static residual and the other is the dynamic-S tensor); rewrites `AddParams`.
- `mul`: recomputes the broadcast output shape and rebuilds all three `TensorMeta` UBOs via `fill_tensor_meta_broadcast`.
- … `own_uniform_buffer` instead of releasing at build.
- … `resize_*_node` (recompute sizes + dispatch each execute).

Constraints: Behavior-neutral on static graphs (the hook fires only when an operand's live shape differs from the max). No kernel/WGSL/numerics change.
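For reviewers, the shape arithmetic a resize hook has to redo at each live shape can be sketched roughly as below. This is a standalone illustration, not the code in this PR: `Dims`, `numel`, `larger_operand`, `broadcast_dims`, and the free function `workgroup_count_x` are hypothetical names, "larger" is assumed to mean larger element count, and the workgroup size of 64 used in the usage note is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical shape type for this sketch (not the backend's own type).
using Dims = std::vector<int64_t>;

// Number of elements in a shape (1 for a rank-0 shape).
inline int64_t numel(const Dims& dims) {
  int64_t n = 1;
  for (int64_t d : dims) {
    n *= d;
  }
  return n;
}

// add-style rule: the output takes the shape of the operand with more
// elements (assumption: "larger" means larger element count).
inline Dims larger_operand(const Dims& a, const Dims& b) {
  return numel(a) >= numel(b) ? a : b;
}

// mul-style rule: NumPy-style broadcast, aligned from the trailing dimension.
inline Dims broadcast_dims(const Dims& a, const Dims& b) {
  const std::size_t rank = std::max(a.size(), b.size());
  Dims out(rank, 1);
  for (std::size_t i = 0; i < rank; ++i) {
    // Missing leading dimensions are treated as size 1.
    const int64_t da = i < a.size() ? a[a.size() - 1 - i] : 1;
    const int64_t db = i < b.size() ? b[b.size() - 1 - i] : 1;
    if (da != db && da != 1 && db != 1) {
      throw std::invalid_argument("shapes are not broadcastable");
    }
    out[rank - 1 - i] = (da == 1) ? db : da;
  }
  return out;
}

// Ceil-divide the live element count by the workgroup size so a smaller
// live shape dispatches fewer workgroups than the max shape would.
inline uint32_t workgroup_count_x(int64_t element_count, uint32_t workgroup_size) {
  assert(workgroup_size > 0);
  return static_cast<uint32_t>(
      (element_count + workgroup_size - 1) / workgroup_size);
}
```

With a max shape of `{1, 128, 64}` and a live shape of `{1, 32, 64}`, and assuming 64 threads per workgroup, the dispatch drops from `workgroup_count_x(8192, 64)` = 128 workgroups to `workgroup_count_x(2048, 64)` = 32, which is the over-dispatch the Problem paragraph describes.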
Co-authored-with: Claude Code.
@exported-using-ghexport
Differential Revision: D109906093