[refactor](fe) Refactor SetPreAggStatus traversal to use per-scan context map instead of shared stack#65095
[refactor](fe) Refactor SetPreAggStatus traversal to use per-scan context map instead of shared stack#65095starocean999 wants to merge 1 commit into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
Automated review completed. I did not find a substantiated blocking issue in this PR.
Critical checkpoint conclusions:
- Goal: the PR replaces the shared stack in
SetPreAggStatuswith per-RelationIdcontexts. The changed traversal matches that goal and avoids sharing scan context between sibling branches. - Scope: the change is focused to the pre-aggregation status rewriter plus two regression
explainexpectations. - Optimizer correctness: the snapshot/diff pattern was checked against
DefaultPlanRewriter.visitChildren,LogicalFilter,LogicalJoin,LogicalProject,LogicalAggregate, andLogicalRepeat. The handled nodes preserve context until the nearest aggregate consumes it; generic nodes remove only contexts introduced below that node, preserving pre-existing sibling/ancestor contexts. - Parallel paths: rule placement remains in the existing late rewrite stage, and the second registration path still uses the same custom rule. I did not find a missed sibling join/aggregate path in the changed code.
- Configuration, concurrency, lifecycle, persistence, compatibility, FE/BE protocol, and storage format: not applicable; this PR does not add or alter those surfaces.
- Tests: the regression suite updates the two right-join sibling aggregate cases from
preagg_t3OFF to ON, which targets the reported shared-context behavior. No.outartifact is expected for theseexplain { contains ... }assertions. - Validation: exact PR-range
git diff --checkpassed. I did not run FE build or regression tests because this runner lacks.worktree_initialized,thirdparty/installed, and executablethirdparty/installed/bin/protoc.
User focus: no additional user-provided review focus was present.
Subagent conclusions: optimizer-rewrite reported NO_NEW_VALUABLE_FINDINGS. tests-session-config proposed TSC-1 for trailing whitespace, but I dismissed it because it came from git show --check HEAD in a grafted checkout; the exact base-to-head git diff --check passes and the PR patch changes only two assertion strings. Final convergence round 1 ended with both subagents reporting NO_NEW_VALUABLE_FINDINGS against the same ledger and empty proposed inline comment set.
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 30121 ms |
TPC-DS: Total hot run time: 174162 ms |
ClickBench: Total hot run time: 25.47 s |
FE Regression Coverage ReportIncrement line coverage |
…text map instead of shared stack
|
/review |
|
run buildall |
Related PR: #48502
Problem Summary:
The SetPreAggStatus rule uses a bottom-up traversal to determine whether
pre-aggregation can be enabled for OlapScan nodes. The original implementation
used a
Stack<PreAggInfoContext>as the traversal context, which had afundamental flaw: multiple scan nodes under sibling subtrees (e.g., both sides
of a Join) incorrectly shared the same stack-level context.
This caused two problems:
of the plan tree.
visitLogicalAggregateusedretainAll()to filter outscan nodes that were not actually children of the current aggregate — an
inelegant patch that partially masked the underlying issue.
Root cause: The stack-based context could not guarantee a one-to-one
correspondence between a scan node and its nearest aggregate ancestor.
Fix: Replace the
Stack<PreAggInfoContext>with aMap<RelationId, PreAggInfoContext>, where each scan node gets its own dedicated context entry.Each visitor method uses a snapshot-diff pattern:
At
LogicalAggregate, child scan contexts are consumed (removed from the map)and their results are stored — no more
retainAll()filtering needed.None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)