embedding_q4gsw: support is_linear_weight packing by JulianCloudNTH · Pull Request #20639 · pytorch/executorch

JulianCloudNTH · 2026-06-30T17:26:09Z

Summary:
The WebGPU `embedding_q4gsw` op failed to load (`Error::DelegateInvalidCompatibility`, error 48): `WebGPUGraph::build()` threw `is_linear_weight=true is unsupported` and `WebGPUBackend.cpp` returned error 48 at `load_forward`.

The `quantized_decomposed.embedding_4bit` and torchao fusions in `backends/vulkan/patterns/quantized_embedding.py` repack the embedding weight into the 4-bit linear-layer nibble convention (low nibble = even dim, high nibble = odd dim) and emit `et_vk.embedding_q4gsw.default(..., is_linear_weight=true)`. This happens always on OSS main, and for tied (embedding/LM-head-shared) weights via `_detect_tied_linear_weight`. The Vulkan runtime supports both packings via its compile-time `_linear_weight` shader variant; the WebGPU runtime rejected `true` outright, so any model whose embedding took that path failed to delegate.
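For illustration, the linear-layer nibble convention described above can be sketched in Python. This is a minimal sketch, not the code in `quantized_embedding.py`: the helper name `pack_linear_nibbles` is hypothetical, and values are treated as raw unsigned 4-bit codes (0..15), ignoring scales, grouping, and any zero-point offset.

```python
def pack_linear_nibbles(codes):
    """Pack 4-bit codes two per byte: even dim -> low nibble, odd dim -> high nibble.

    Hypothetical helper for illustration only; `codes` is a flat list of
    ints in 0..15 with even length.
    """
    if len(codes) % 2 != 0:
        raise ValueError("need an even number of 4-bit codes")
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("codes must fit in 4 bits")
    packed = bytearray()
    for i in range(0, len(codes), 2):
        even_dim, odd_dim = codes[i], codes[i + 1]
        # Linear-layer convention: the odd dim occupies the high nibble.
        packed.append((odd_dim << 4) | even_dim)
    return bytes(packed)
```

For example, `pack_linear_nibbles([3, 10])` yields the single byte `0xA3`.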

This teaches the WebGPU runtime both packings. The handler now forwards `is_linear_weight` to the shader through the uniform (the spare `_pad` field is repurposed; struct size unchanged), and `embedding_q4gsw.wgsl` selects the nibble with `use_high = is_even != is_linear_weight`. The `is_linear_weight=false` path is byte-identical to before.
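The nibble selection can be mirrored on the host as a reference. The following is a minimal Python sketch, assuming raw unsigned 4-bit codes and hypothetical helper names (`select_nibble`, `unpack_row`); only the `use_high` expression comes from the shader change. The layout for `is_linear_weight=false` (even dim in the high nibble) is inferred from that expression, not stated separately in this PR.

```python
def select_nibble(packed_byte, dim, is_linear_weight):
    """Return the 4-bit code for embedding dimension `dim` from its packed byte."""
    is_even = dim % 2 == 0
    # Same selection as the shader: boolean inequality acts as XOR.
    use_high = is_even != is_linear_weight
    if use_high:
        return (packed_byte >> 4) & 0xF
    return packed_byte & 0xF


def unpack_row(packed, is_linear_weight):
    """Unpack one row of packed bytes into a list of 4-bit codes (hypothetical helper)."""
    return [
        select_nibble(packed[dim // 2], dim, is_linear_weight)
        for dim in range(2 * len(packed))
    ]
```

With the byte `0xA3`, `unpack_row(bytes([0xA3]), True)` gives `[3, 10]` (even dim from the low nibble), while `unpack_row(bytes([0xA3]), False)` gives `[10, 3]`.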

Differential Revision: D110211746


pytorch-bot · 2026-06-30T17:26:14Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20639

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures, 4 Pending, 24 Unrelated Failures

As of commit a45d9ba with merge base e095c57:

NEW FAILURES - The following jobs have failed:

Build Presets / linux (linux, linux.arm64.2xlarge, executorch-ubuntu-22.04-gcc11-aarch64) / build (gh)
The process '/usr/bin/git' failed with exit code 128
Build Presets / linux (llm, linux.arm64.2xlarge, executorch-ubuntu-22.04-gcc11-aarch64) / build (gh)
The process '/usr/bin/git' failed with exit code 128
Build Presets / linux (pybind, linux.arm64.2xlarge, executorch-ubuntu-22.04-gcc11-aarch64) / build (gh)
pull / test-models-linux (linear, portable, linux.2xlarge) / linux-job (gh)
RuntimeError: Command docker exec -t 62168390117dc213bfd2d636e2672f6347f6f80412321e7c24eda93bddb7ebe4 /exec failed with exit code 1
pull / unittest / macos / macos-job (gh)
The process '/usr/bin/git' failed with exit code 128

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

pull / test-arm-backend-public-api-backward-compatibility / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-llama_runner_eager-linux / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / test-llama-runner-linux (fp32, xnnpack+custom+qe, linux.arm64.2xlarge, executorch-ubuntu-22.04-gc... / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-llama-runner-linux (fp32, xnnpack+custom+quantize_kv, linux.arm64.2xlarge, executorch-ubuntu... / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-llama-runner-linux (fp32, xnnpack+quantize_kv, linux.arm64.2xlarge, executorch-ubuntu-22.04-... / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-lora-multimethod-linux / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-mcu-cortex-m-backend / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-models-linux (emformer_join, xnnpack-quantization-delegation, linux.4xlarge.memory) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-models-linux (ic4, portable, linux.4xlarge.memory) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-models-linux (phi_4_mini, portable, linux.4xlarge.memory) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / test-models-linux-basic (mv3, portable, cmake, linux.arm64.2xlarge, executorch-ubuntu-22.04-gcc11... / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-models-linux-basic (vit, portable, cmake, linux.arm64.2xlarge, executorch-ubuntu-22.04-gcc11... / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / test-models-linux-basic (vit, xnnpack-quantization-delegation, cmake, linux.arm64.2xlarge, execut... / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / unittest-buck / linux / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / unittest-buck / macos / macos-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / unittest-editable / linux / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
pull / unittest-editable / macos / macos-job (gh) (detected as infra flaky with no log or failing log classifier)
Test ARM Backend / test-arm / test-backend-linux (arm_tosa_fp, models) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / test-lora-linux / linux-job (gh) (trunk failure)
The process '/usr/bin/git' failed with exit code 128
pull / test-models-linux (ic4, xnnpack-quantization-delegation, linux.4xlarge.memory) / linux-job (gh) (trunk failure)
pull / test-models-linux-basic (mv3, xnnpack-quantization-delegation, cmake, linux.arm64.2xlarge, execut... / linux-job (gh) (trunk failure)
The process '/usr/bin/git' failed with exit code 128
pull / unittest / linux / linux-job (gh) (trunk failure)
Test QNN Backend / test-qnn / test-backend-linux (qnn, models) / linux-job (gh) (trunk failure)
Test WebGPU Backend / test-webgpu / test-backend-linux (webgpu, operators) / linux-job (gh) (trunk failure)
The process '/usr/bin/git' failed with exit code 128

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-06-30T17:26:20Z

@JulianCloudNTH has exported this pull request. If you are a Meta employee, you can view the originating Diff in D110211746.

github-actions · 2026-06-30T17:27:23Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

meta-cla Bot added the CLA Signed label Jun 30, 2026

meta-codesync Bot added the meta-exported label Jun 30, 2026

meta-codesync Bot temporarily deployed to cadence June 30, 2026 17:26 Inactive

rascani approved these changes Jun 30, 2026


JulianCloudNTH merged commit 6d0b933 into pytorch:main Jun 30, 2026
164 of 198 checks passed

JulianCloudNTH deleted the export-D110211746 branch June 30, 2026 18:38

JulianCloudNTH merged 1 commit into pytorch:main from JulianCloudNTH:export-D110211746
