fix: fallback when backend rejects flash attention#1732

Merged
leejet merged 1 commit into master from fix/flash-attn-backend-fallback on Jul 2, 2026

Conversation

@leejet leejet commented Jul 2, 2026

Owner

Summary

  • Check backend support on the raw ggml_flash_attn_ext op before applying precision or KV rescale wrapping.
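The ordering described above can be sketched as follows. This is a minimal, self-contained illustration and not the code from this PR: `MockOp`, `SupportsOp`, `AttnPlan`, `plan_attention`, and the op names in the plan are all hypothetical stand-ins, and the backend query is modeled as a plain callback rather than a real ggml call. The only point it shows is that the support check runs on the bare flash attention op first, and the wrapping is added only after that check passes.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph op; not a ggml type.
struct MockOp {
    std::string kind;
};

// Hypothetical stand-in for a backend's "do you support this op?" query.
using SupportsOp = std::function<bool(const MockOp&)>;

// Hypothetical description of the attention subgraph that would be built.
struct AttnPlan {
    bool use_flash_attn;
    std::vector<std::string> ops;  // op kinds, in execution order
};

// Probe the raw flash attention op BEFORE any wrapping is applied.
// If the backend rejects it, fall back to the unfused attention path.
AttnPlan plan_attention(const SupportsOp& backend_supports, bool need_kv_rescale) {
    const MockOp raw_flash{"flash_attn_ext"};
    if (!backend_supports(raw_flash)) {
        // Fallback: plain matmul -> softmax -> matmul attention.
        return {false, {"mul_mat", "soft_max", "mul_mat"}};
    }

    // The raw op is accepted, so wrapping can now be layered around it.
    AttnPlan plan{true, {}};
    if (need_kv_rescale) {
        plan.ops.push_back("scale_kv");
    }
    plan.ops.push_back("flash_attn_ext");
    if (need_kv_rescale) {
        plan.ops.push_back("unscale_out");
    }
    return plan;
}
```

With a callback that rejects `flash_attn_ext`, `plan_attention` returns the three-op fallback plan regardless of `need_kv_rescale`; with a callback that accepts it, the flash op appears in the plan, surrounded by the rescale steps only when they are requested.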

Related Issue / Discussion

N/A

Additional Information

N/A

@leejet leejet merged commit 2574f59 into master Jul 2, 2026
11 checks passed
@leejet leejet deleted the fix/flash-attn-backend-fallback branch July 2, 2026 16:47