Single-pass expression analysis groundwork - answer type questions from ExpressionResults #5857

Open
ondrejmirtes wants to merge 222 commits into 2.2.x from resolve-type-rewrite-2

ondrejmirtes commented Jun 12, 2026

Member

Groundwork for the "new world" where an expression is traversed once: after processExpr, its ExpressionResult knows the before/after scopes, the type (typeCallback) and the narrowing (specifyTypesCallback), composed from child results instead of re-walking subtrees. Handlers then stop implementing TypeResolvingExprHandler; the old entry points (MutatingScope::resolveType, the TypeSpecifier dispatcher) are guarded behind NewWorld::disableOldWorld() and get mass-deleted in PHPStan 3.0.

What's on the branch, bottom up:

  • Guards + ExpressionResultFactory: old-world type resolution entry points throw when NewWorld::disableOldWorld() is flipped (the migration meter); all ExpressionResult construction goes through a generated factory.
  • ExpressionResult carries beforeScope, expr, typeCallback, specifyTypesCallback and is stored per node in ExpressionResultStorage (layered O(1) duplicate()), replacing the stored before-Scope.
  • ExprHandler / TypeResolvingExprHandler split: resolveType/specifyTypes move to the sub-interface so handlers can shed them one by one.
  • ExpressionResultStorageStack: old-world consumers (TypeSpecifier dispatcher, extensions, rules below PHP 8.1, unconverted handlers' resolveType) keep working for converted handlers' nodes. Every scope shares the stack created by its internal scope factory; NodeScopeResolver pushes the storage of the analysis in progress through MutatingScope::pushExpressionResultStorage() (always popped in finally, throwing on imbalance), and MutatingScope answers from the stored result - or processes a synthetic node on demand. Scopes never reference a storage directly, so nothing pins the result graph with the cycle collector disabled in bin/phpstan. Also adds MutatingScope::applySpecifiedTypes - filterBySpecifiedTypes without Scope::getType().
  • First two migrations: ScalarHandler and ArrayHandler no longer implement TypeResolvingExprHandler. The array migration is a precision win the old world cannot reach: each item type is captured at its own evaluation point, so [$b = 1, $b + 1, $c = $b, $c + 2, $c++, $c] infers array{1, 2, 1, 3, 1, 2}.
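The evaluation order behind the array example in the last bullet can be checked against plain PHP runtime behaviour. The following is a minimal sketch of the runtime values only, not of PHPStan's inference; the function name `buildItems` is hypothetical and not part of the branch.

```php
<?php

declare(strict_types=1);

/**
 * Array items are evaluated left to right, so each item sees the side
 * effects of the items before it.
 *
 * @return list<int>
 */
function buildItems(): array
{
    return [
        $b = 1,   // the assignment expression evaluates to 1
        $b + 1,   // $b is 1 here, so 2
        $c = $b,  // 1
        $c + 2,   // 3
        $c++,     // post-increment yields the old value 1; $c becomes 2
        $c,       // 2
    ];
}

var_dump(buildItems() === [1, 2, 1, 3, 1, 2]); // bool(true)
```

An analyser that resolves every item on one shared scope sees a single state of `$b` and `$c` for all six items, which is why capturing each item at its own evaluation point matters here.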

Verified: full test suite green, make phpstan clean, and analysis memory back at baseline (no leak from the result graph despite gc_disable()).

Closes phpstan/phpstan#13944
Closes phpstan/phpstan#12207
Closes phpstan/phpstan#7155
Closes phpstan/phpstan#2032
Closes phpstan/phpstan#10786
Closes phpstan/phpstan#13253
Closes phpstan/phpstan#14396

🤖 Generated with Claude Code

return $this->withFlavor(false);
}

private function withFlavor(bool $fiber): self

Contributor

should this read withFiber?

ondrejmirtes and others added 20 commits June 30, 2026 13:23
Going forward, not every ExprHandler will be a TypeResolvingExprHandler.
Instead of resolveType + specifyTypes, handlers will pass callbacks
into ExpressionResult that do a similar job.
…esults

Old-world consumers (TypeSpecifier dispatcher, extensions, rules below
PHP 8.1, unconverted handlers' resolveType) keep working for nodes whose
handler no longer implements TypeResolvingExprHandler: every scope shares
the ExpressionResultStorageStack created by its internal scope factory,
NodeScopeResolver pushes the storage of the analysis in progress through
MutatingScope::pushExpressionResultStorage() (always popped in finally),
and MutatingScope resolves such nodes from the stored result - or by
processing a synthetic node on demand.

Also adds MutatingScope::applySpecifiedTypes (filterBySpecifiedTypes
without Scope::getType()) and the specifyTypesCallback slot on
ExpressionResult consulted by getTruthyScope()/getFalseyScope().

The cycle collector is disabled in bin/phpstan - scopes deliberately
never reference a storage directly, only the stack. Popping severs the
stack->storage edge when an analysis ends, so retained scopes do not pin
the whole result graph.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Each item type is captured at its own evaluation point in the sequence,
so [$b = 1, $b + 1, $c = $b, $c + 2, $c++, $c] infers
array{1, 2, 1, 3, 1, 2} - the old world resolves all items on a single
scope and cannot do this.

Until the item handlers (BinaryOp, inc/dec, Assign) migrate themselves,
the items resolve as their own results' before-scope evaluation instead
of cascading getTypeForScope(). Narrowing stays on the fallback path -
it is identical to the removed specifyDefaultTypes body.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

An unbalanced push/pop is the one way the ambient storage design can
still be misused - fail immediately instead of silently answering later
type questions from the wrong storage.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
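
The push/pop discipline described in this commit (always popped in finally, throwing on imbalance) can be sketched roughly as follows. `StorageStack` and `withStorage` are hypothetical stand-ins written for illustration; they are not the actual ExpressionResultStorageStack or MutatingScope API.

```php
<?php

declare(strict_types=1);

/** Hypothetical stand-in for an ambient storage stack; not PHPStan's class. */
final class StorageStack
{
    /** @var list<object> */
    private array $storages = [];

    public function push(object $storage): void
    {
        $this->storages[] = $storage;
    }

    public function pop(object $expected): void
    {
        $top = array_pop($this->storages);
        if ($top !== $expected) {
            // Fail fast rather than answer later questions from the wrong storage.
            throw new LogicException('Unbalanced storage push/pop');
        }
    }

    public function current(): ?object
    {
        return $this->storages === [] ? null : $this->storages[count($this->storages) - 1];
    }
}

/**
 * Runs $analysis with $storage as the ambient storage.
 *
 * @template T
 * @param callable(): T $analysis
 * @return T
 */
function withStorage(StorageStack $stack, object $storage, callable $analysis): mixed
{
    $stack->push($storage);
    try {
        return $analysis();
    } finally {
        // Always popped, even when the analysis throws.
        $stack->pop($storage);
    }
}

$stack = new StorageStack();
$storage = new stdClass();
$seen = withStorage($stack, $storage, fn () => $stack->current());
var_dump($seen === $storage);          // bool(true)
var_dump($stack->current() === null);  // bool(true)
```

Because consumers only hold the stack, the pop removes the last edge to the storage once the analysis ends, which is the property the description relies on with the cycle collector disabled.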
…sionResults

Both handlers stop implementing TypeResolvingExprHandler and wire
typeCallback + specifyTypesCallback, so their truthy/falsey scopes flow
through MutatingScope::applySpecifiedTypes - the first real exercise of
the new-world narrowing machinery.

The old-world TypeSpecifier dispatcher answers nodes of converted
handlers from the stored ExpressionResult
(MutatingScope::specifyTypesOfNewWorldHandlerNode), processes synthetic
nodes on demand (the 'foo' === $a::class rewrite builds synthetic
Instanceof_ nodes), and falls back to specifyDefaultTypes when the
result carries no callback - which is exactly what such handlers used
to implement.

Exercising the machinery surfaced two gaps in applySpecifiedTypes:

- Expressions not tracked in the scope lost their sureNot narrowing
  ($var->name instanceof Identifier ? ... : ... stopped narrowing the
  else branch). The current type to intersect with or subtract from is
  now priced from the stored ExpressionResult
  (getCurrentTypesOfSpecifiedExpr) instead of Scope::getType().
- Only Yes-certainty holders hold the current type of an expression.
  A Maybe-certainty holder holds the when-defined type (falsy after an
  or-merge in nsrt/bug-pr-339.php), which the certainty-aware
  Scope::getType() never returned - narrowing against it produced
  NeverType and mis-fired conditional expression holders.

AssignHandler's placeholder result for an assignment target now carries
VariableHandler::createTypeCallback - every stored result for a
converted handler's node type must answer type questions itself.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Closes phpstan/phpstan#2032

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

A sure specification (e.g. is_string($a)) can only hold for a defined
variable, so it still makes the variable defined inside the branch -
one error at the test site, no cascade. Removing a type from a
certainly-undefined variable proves nothing about its definedness.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…:create()

How a type constraint on a node translates into narrowing entries is the
producing handler's knowledge, declared on its ExpressionResult - never
re-derived by unwrapping the AST elsewhere.

DefaultNarrowingHelper (recreated from the first rewrite attempt) carries
the default boolean-context narrowing (one sureNot entry, no type ask, no
nullsafe chain-walking) and createSubjectTypes(): ask the subject result's
createTypesCallback, fall back to a single sure/sureNot entry for the
subject node. No purity gates, no structural unwrapping.

AssignHandler fans a type constraint out to the assigned variable and the
assigned expression - nested assignments compose through the assigned
expression's own result, which is what will delete unwrapAssign.
CoalesceHandler delegates to its left side when the type rules the right
side in or out, so ($e ?? null) instanceof Foo narrows $e.

AssignHandler also wires specifyTypesCallback: the assigned variable
narrows by the boolean outcome, plus the $arr[$key] inference after
$key = array_key_first/array_key_last/array_search/array_find_key. The
null-context inferences stay in the old-world specifyTypes() - result-based
asks are always truthy or falsey.

specifyTypesCallbacks no longer touch TypeSpecifier: VariableHandler uses
DefaultNarrowingHelper, InstanceofHandler narrows its subject through
createSubjectTypes(). TypeSpecifierTest's assign-in-instanceof expectations
hold unchanged - the new channel reproduces create()'s emission exactly.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ingExprHandler

The composite handlers wire typeCallback and specifyTypesCallback composed
from their operands' results. This deletes the founding pathology of the
rewrite: BooleanAndHandler::resolveType's re-walk of the left operand on a
throwaway storage, BOOLEAN_EXPRESSION_MAX_PROCESS_DEPTH and both
flattened-chain code paths - deep chains compose through nested results.

Child narrowing flows through DefaultNarrowingHelper::getChildSpecifiedTypes():
the child result's specifyTypesCallback first, bridged through the old-world
dispatcher for unmigrated children (the dispatcher answers converted handlers
from stored results, so the bridge terminates; it dies in 3.0). The ternary
still rewrites itself as (cond && if) || (!cond && else) - the synthetic
takes the on-demand path where its real subnodes answer from stored results.

Lessons the conversion forced out of the engine:

- A handler must never ask the scope about its own node mid-processing -
  no stored result exists yet, so the ask takes the on-demand path and
  recurses infinitely (CoalesceHandler's filterByFalseyValue($expr) for
  the right-side scope hung the suite). The equivalent narrowing is built
  directly from the left result instead.
- Composite typeCallbacks evaluate later operands on their captured
  processing scopes. Re-filtering the asking scope loses the left side's
  side effects (by-ref writes, inline assignments); the child result's own
  point breaks synthetic compositions (min()'s $a < $b ? $a : $b reuses
  stored results of the real arg nodes, predating the synthetic's branch
  narrowing). The captured scope has both; native asks flavor it with
  doNotTreatPhpDocTypesAsCertain().
- ExpressionResult::getType()/getNativeType()/getTypeForScope() consult
  tracked expression holders before the typeCallback, mirroring the early
  return in MutatingScope::resolveType() - that is how the nullsafe
  handlers' ensured non-nullability of ($x ?? null) reaches type asks.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…perties

Closes phpstan/phpstan#10786

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…nullable

Every handler and synthetic ExpressionResult already wires a specifyTypesCallback
(AssignHandler's AssignRef branch now wires the default narrowing explicitly instead of
passing null), so the callback is required, not optional. getSpecifiedTypesForScope
therefore always returns SpecifiedTypes - drop the nullable return and the now-dead
null handling at the call sites, including the filterByTruthyValue/filterByFalseyValue
fallbacks in getTruthyScope/getFalseyScope.
…hyValue, in leaf handlers

The dynamic-name (property/static-property/method/static-call/variable), nullsafe
(property/method) and coalesce/assign-op handlers narrowed a scope by a synthetic
condition (Identical(name, const), NotIdentical(var, null), isset(left)) with
Scope::filterByTruthyValue()/filterByFalseyValue() - the old-world appliers the new
world must not call. Reprocess the synthetic condition with processExprOnDemand() and
apply its getSpecifiedTypesForScope() narrowing instead. VariableHandler::createTypeCallback
now takes the NodeScopeResolver to do so. The recursive BooleanAnd/Or/Match conditional-
holder narrowing is left for a dedicated pass.
…ary/suppress callbacks

The cast, cast-to-string, clone, bitwise-not, unary +/- and error-suppress type
callbacks read their single operand via $exprResult->getTypeForScope($s); since the
callback's scope is now always the result's beforeScope, this equals the operand's own
getType()/getNativeType() (picked by $s->nativeTypesPromoted) - the only remaining scope
use. This drops per-constant-array-item closure reanalysis for these wrappers (the cast
inside array_map is resolved once over the unioned element type), so two array_map
assertions are relaxed accordingly; that reanalysis will be restored separately.
… yield-from, interpolated-string callbacks

Continue removing the scope from the type callbacks: these handlers propagate a single
child result's type ($varResult/$callResult/$exprResult/the interpolation parts) via
getTypeForScope($s); replace with the child's getType()/getNativeType() picked by
$s->nativeTypesPromoted, leaving the native flag as the only scope use.
… callback

The ternary type callback read the condition's boolean via
$ternaryCondResult->getTypeForScope($s); replace with getType()/getNativeType() picked
by $s->nativeTypesPromoted, so the callback uses the scope only for the native flag (and
the captured branch scopes).
… of re-reading the slot

When a write to an array slot updated a byref-intertwined variable, assignVariable
re-read the slot via $scope->getType($assignedExpr) - re-evaluating the stored
ArrayDimFetch node, whose captured array variable predates the array and only resolves
correctly when the node is re-walked on the asking scope. Compute the slot value directly
from the just-assigned root variable's type instead (walking offsets, or applying the new
value to the iteratee for foreach-byref), so no stored node is re-evaluated on a foreign
scope. This lets the ArrayDimFetch type callback read its operands via getType/getNativeType
(dropping its scope use down to the native flag) without breaking reference propagation.

The binary-op type callback read its left/right operands via getTypeForScope($scope);
since the callback's scope is the operands' evaluation point, this equals their own
getType()/getNativeType() picked by $scope->nativeTypesPromoted. The earlier attempt at
this regressed array_map/array_filter precision, but that was an interaction with the
since-fixed ArrayDimFetch reference handling; with that in place the operand reads convert
cleanly. The callback still uses the scope for synthetic on-demand reads and the
richer-scope identical/equal helpers.
…pe callbacks

The !/&&/|| type callbacks read the left (or sole) operand's boolean via
getTypeForScope($s); replace with getType()/getNativeType() picked by
$s->nativeTypesPromoted. The right operand keeps reading its captured left-truthy/
left-falsey scope (the evaluation point), so the scope use drops to the native flag. This
touches only the type reads, not the truthy/falsey narrowing, so the nested-boolean
analysis stays linear (depth-14 unchanged).

The property-fetch type callback read the receiver via getTypeForScope($s) and resolved
the property reflection on $s. The receiver is now read from the operand result via
getType()/getNativeType(), and the property's class/visibility/assign context (which is
lexical) is taken from the captured beforeScope, so the callback no longer needs the
asking scope. The dynamic $obj->$name branch resolves each constant name on beforeScope
instead of narrowing the asking scope per name. This drops per-constant-array-item
reanalysis for property fetches inside array_map (e.g. array_map(fn($c) => $c->value,
Enum::cases())), so three bug-14649 enum assertions are relaxed accordingly.

Mirror the property-fetch change for static property fetches: read the class-expression
type from the operand result via getType()/getNativeType(), resolve a Name class and the
static property reflection (getStaticPropertyReflection) on the captured beforeScope (the
lexical class/visibility/assign context), and resolve the dynamic Foo::${$name} branch on
beforeScope. The type callback no longer needs the asking scope.

Mirror the property-fetch change for method calls: read the receiver type from the operand
result via getType()/getNativeType(), and resolve the method reflection and dynamic return
type extensions (methodCallReturnType) on the captured beforeScope - the lexical context.
The dynamic $obj->$name() branch resolves each constant name on beforeScope rather than
narrowing the asking scope per name. resolveReturnType now takes the reflection scope and a
nativeTypesPromoted flag instead of the asking scope; the throw-point caller passes the
processing scope with the phpdoc flag.

Mirror the method-call change for static calls: read the class-expression type from the
operand result via getType()/getNativeType() (or resolveTypeByName on beforeScope for a
Name class), and resolve the method reflection and dynamic return type extensions on the
captured beforeScope. The dynamic Foo::{$name}() branch resolves each constant name on
beforeScope. resolveReturnType takes the reflection scope and a nativeTypesPromoted flag
instead of the asking scope.

Mirror the method-call change for function calls: resolveReturnType now takes the
reflection scope and a nativeTypesPromoted flag instead of the asking scope. The name/arg
operands are read from their results via getType()/getNativeType(), and the function
reflection lookup, callable-parameter acceptors, ArgumentsNormalizer and the dynamic
function return type extensions all run on the captured beforeScope (the lexical context).
The throw-point caller passes the processing scope with the phpdoc flag. This drops
per-constant-array-item reanalysis for closures whose body is a function call inside
array_map (array_map(fn($v) => strval($v), ...)), so one array-map assertion is relaxed.

The nullsafe ?-> / ?->() type callbacks read the non-null inner result via
getTypeForScope($s) and applied the null-removal narrowing (var !== null) to the asking
scope. Read the inner result via getType()/getNativeType() and apply the null-removal
narrowing to the captured beforeScope (the evaluation point), pricing the synthetic
fetch/call with the native-aware on-demand helper. The callback now uses the scope only
for the native flag.
…etNativeType

Prepare these type callbacks for the scope-free signature: the assignment callbacks read
the assigned value via getType()/getNativeType() (the no-result fallback prices the RHS on
beforeScope), and the class-constant-fetch callback reads its class-expression operand via
getType()/getNativeType(). The scope param is now used only for the native flag.

The instanceof callback reads its expression and class operands via getType()/getNativeType().
The coalesce callback runs its isset resolution and left-is-set narrowing on the captured
beforeScope, and reads the right side from its own result (processed on the left-is-null
scope). Both callbacks now use the scope only for the native flag.

The assign-op callback's operand reader and the ??= coalesce branch (current expression
storage + on-demand pricing) now run on the captured beforeScope, with native-vs-phpdoc
selected by the native flag. The callback uses the scope only for the native flag.

NodeScopeResolver already resolves the foreach iteratee's type (and native type) by
reprocessing the iteratee on the (possibly non-empty-narrowed) originalScope. Pass those
into MutatingScope::enterForeach instead of having it re-read $originalScope->getType()/
getNativeType() on the iteratee - removing two Scope::getType calls from the foreach path.

Mirror the enterForeach change for enterForeachKey: take the already-resolved iteratee
type and native type as arguments instead of re-reading $originalScope->getType()/
getNativeType() on the iteratee, removing two more Scope::getType calls from the foreach
key path.
…t a scope

Every typeCallback closure now takes (bool $nativeTypesPromoted) instead of a
MutatingScope. It reads operand types from its captured child ExpressionResults
(getType()/getNativeType()) and any scope-dependent lookups from the captured
beforeScope, so an expression's type no longer depends on the scope it is asked
on - the single-pass goal that lets getTypeForScope() be memoized later.

ExpressionResult invokes the callback with false for getType(), true for
getNativeType(), and $scope->nativeTypesPromoted for getTypeForScope().
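
The getter-to-flag mapping described above can be sketched in isolation. This is a heavily simplified, hypothetical model: `SketchResult` is not the real ExpressionResult, types are represented as plain strings, and there is no scope object at all.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical simplified result object. It only shows how three getters
 * map onto one callback that takes a native-types flag.
 */
final class SketchResult
{
    /** @param Closure(bool): string $typeCallback */
    public function __construct(private Closure $typeCallback)
    {
    }

    public function getType(): string
    {
        return ($this->typeCallback)(false);
    }

    public function getNativeType(): string
    {
        return ($this->typeCallback)(true);
    }

    public function getTypeForScope(bool $scopeNativeTypesPromoted): string
    {
        return ($this->typeCallback)($scopeNativeTypesPromoted);
    }
}

// A leaf: the PHPDoc type is non-empty-string, the native type is string.
$operand = new SketchResult(
    static fn (bool $native): string => $native ? 'string' : 'non-empty-string',
);

// A wrapper composes from the captured child result, never from an asking scope.
$wrapper = new SketchResult(
    static fn (bool $native): string => 'list<' . ($native ? $operand->getNativeType() : $operand->getType()) . '>',
);

echo $wrapper->getType(), "\n";       // list<non-empty-string>
echo $wrapper->getNativeType(), "\n"; // list<string>
```

Since the callback's only input is the boolean, its answer cannot vary with the scope it is asked on, which is what makes memoization possible later.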

Enabling changes that land with it:
- VariableHandler reads the variable from its captured beforeScope.
- The assign/narrowing machinery (specifyExpressionType, addTypeToExpression,
  removeTypeFromExpression, unsetExpression) reads narrowable subjects from the
  scope's tracked state via a new getScopeStateType helper instead of
  Scope::getType (which would route back through the now scope-independent
  callbacks and read the stale beforeScope value).
- A few remaining engine Scope::getType callers move off it: getKeepVoidType,
  expressionTypeIsUnchangeable, getRealParameterDefaultValues (via
  InitializerExprTypeResolver) and IssetabilityDescriptor (via getVariableType).
…peCallback

A result with neither cannot answer its own type - getType()/getNativeType() would
fall through to beforeScope->getType()/getNativeType() with nothing backing them. No
construction site does this (closures/arrow functions set type+nativeType eagerly,
everything else sets a typeCallback), so guard the invariant in the constructor next
to the existing mutual-exclusion check.
…getType

On the tracked-holder path (typeCallback set but a narrowed holder for the whole
expression wins), getType()/getNativeType()/getTypeForScope()/getNativeTypeForScope()
fell back to MutatingScope::getType()/getNativeType(), re-entering the guard, the
resolvedTypes cache and the new-world dispatch only to land back on the holder. Add
MutatingScope::getTrackedExpressionType() - the same holder read resolveType() does for
its tracked early return - and call it directly. The constructor guard guarantees a
typeCallback is set when type is null, so this branch is only ever the tracked-holder
case and the holder is known to exist.

getTruthyScope()/getFalseyScope() applied the specifyTypesCallback narrowing to the
result's own scope, which for &&/|| is the merge of the two operand scopes. That was
wrong in two ways:

- the merge demotes a by-ref/side-effect definition made in the right operand (e.g. $m
  from $this->match(..., $m) on the right of an &&) to maybe-defined, losing it; and
- applying the whole &&/|| narrowing re-applies the LEFT operand's narrowing on top of a
  scope where the right operand reassigned the narrowed variable - e.g. ctype_digit($foo)
  re-applied to the int $foo after $foo = intval($foo) in
  `!ctype_digit($foo) || ($foo = intval($foo)) < 1`, giving int<48,57>|int<256,max>
  instead of int<1, max> (bug-9400).

&& is truthy (|| is falsey) only when the right operand was evaluated - on the left-truthy
(left-falsey) scope - and is itself truthy (falsey). That is exactly
$rightResult->getTruthyScope() ($rightResult->getFalseyScope()): it already carries the
left operand's narrowing (baked in by processing the right side on the left-narrowed
scope) and the right's by-ref definitions, and does not re-narrow a reassigned variable.
Pass it as truthyScopeOverride/falseyScopeOverride on ExpressionResult; only BooleanAnd and
BooleanOr need it, as the only handlers that merge operand scopes. The narrowing exposed to
parents still comes from specifyTypesCallback, so dropping its scope parameter later is
unaffected.
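
The bug-9400 condition quoted above can be illustrated at runtime. This is a minimal sketch showing the values only; the function name `parsePositive` is hypothetical and does not come from the test suite.

```php
<?php

declare(strict_types=1);

/**
 * On the path where the whole || is false, the right operand has run,
 * so $foo already holds the reassigned int, not the original digit string.
 */
function parsePositive(string $foo): ?int
{
    if (!ctype_digit($foo) || ($foo = intval($foo)) < 1) {
        return null;
    }

    // Falsey branch of ||: $foo was reassigned by the right operand and is >= 1.
    return $foo;
}

var_dump(parsePositive('42'));  // int(42)
var_dump(parsePositive('0'));   // NULL
var_dump(parsePositive('abc')); // NULL
```

Re-applying the ctype_digit() narrowing after the reassignment would describe a variable that no longer holds the value the check was made on, which is the second problem listed above.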
A `$result = preg_match($p, $s, $matches)` assignment records a conditional holder
"$result truthy -> $matches = <matched shape>". Building it intersected the narrowed shape
with $matches's current type, read via readStoredOrPriceOnDemand(). But preg_match writes
$matches by ref: the write lands in the scope's tracked variable type ($matches becomes
array{}|array{matched}), while the stored ExpressionResult from the earlier `$matches = []`
is left untouched. readStoredOrPriceOnDemand() returned that stale array{}, so the holder
recorded array{} & matched = NEVER, and `if ($result)` narrowed $matches to *NEVER*.

Read the holder expression's current type from the scope state for tracked variables -
getVariableType(), which is null-safe for superglobals and undefined variables - where a
by-ref write may have updated the type. Non-variable holder exprs (method calls etc.) have
no by-ref hazard and keep reading their stored result.
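
The by-reference write at the centre of this fix looks like this at runtime. This is a minimal sketch; the helper name `firstNumber` and the pattern are hypothetical and chosen only for illustration.

```php
<?php

declare(strict_types=1);

/**
 * $matches starts as [] and preg_match() overwrites it through the
 * reference, so the earlier assignment no longer describes its value.
 */
function firstNumber(string $subject): ?string
{
    $matches = [];
    $result = preg_match('/(\d+)/', $subject, $matches);

    if ($result) {
        // Truthy $result implies the matched shape: [0 => full match, 1 => group].
        return $matches[1];
    }

    // No match: preg_match() left $matches as an empty array.
    return null;
}

var_dump(firstNumber('order 123')); // string(3) "123"
var_dump(firstNumber('none'));      // NULL
```

A holder built from the stale `array{}` of the initial assignment would intersect to never inside the `if ($result)` branch, even though the branch is clearly reachable.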
…ngExpr

NodeScopeResolver::findEarlyTerminatingExpr() reached Scope::getType() to recognise
never-returning expression statements, violating the single-pass invariant (the engine
must consume ExpressionResults, not Scope::getType()).

Move the early-terminating method/function recognition into the call handlers via a shared
EarlyTerminatingCallHelper: a MethodCall/StaticCall/FuncCall configured as early-terminating
now resolves to an explicit NeverType. The expression statement's exit point then follows
from the result's value type being an explicit never - which already covers exit/die/throw
and signature-never calls - so findEarlyTerminatingExpr and its getType() call are gone.

The earlyTerminatingMethodCalls/earlyTerminatingFunctionCalls lists move off
NodeScopeResolver's constructor onto the helper as DI parameters; the test-case overrides
are replaced by a nodeScopeResolverEarlyTerminating.neon parameter file.

Compose a child expression's narrowing directly from its ExpressionResult.
Where the result is null or nullable (synthetic casts/ternary condition, the
assigned expression, equality operands), obtain it first via the new
MutatingScope::obtainResultForNode() - the stored result, or an on-demand walk
against a duplicate of the current storage - then ask it. This is what
specifyTypesOfNewWorldHandlerNode() already did internally; it now delegates to
the same primitive. Drop the now-unused DefaultNarrowingHelper dependency from
the three handlers that no longer narrow through it.

ondrejmirtes force-pushed the resolve-type-rewrite-2 branch from 125cf22 to 10c7ee9 on June 30, 2026 12:59
2 participants