Retire pruned leaked connections so they can't be reused (fixes #9240)#9508
Open
nickjn92 wants to merge 1 commit into
Open
Retire pruned leaked connections so they can't be reused (fixes #9240)#9508nickjn92 wants to merge 1 commit into
nickjn92 wants to merge 1 commit into
Conversation
When a response body is abandoned (never closed), the GC-based leak detector in RealConnectionPool.pruneAndGetAllocationCount removes the leaked call's reference but, since square#7456, no longer marks the connection noNewExchanges. A cleanup pass prunes every connection yet evicts at most one, so a leaked connection that isn't the evicted one stays idle and eligible for reuse while its socket still holds the abandoned response's unread bytes. The next call to reuse it reads those bytes as its status line, throwing "ProtocolException: Unexpected status line" and stapling one response's body onto another. 4.x marked the connection noNewExchanges here; this restores that line so any pruned leaked connection is retired. Fixes square#9240
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #9240.
If you leak a response body (execute a call and never close the body), a later unrelated request can come back corrupted. It gets the previous response's bytes stapled onto the front of its own, which surfaces as
ProtocolException: Unexpected status line: .... On 4.x this same mistake was a harmless leak, the connection got retired and was never reused. On 5.x it can silently corrupt a different call.What's happening
RealConnectionPool.pruneAndGetAllocationCountnotices a leaked allocation (itsCallReferencewas garbage collected) and drops the reference. On 4.12.0 it also setconnection.noNewExchanges = trueso that connection could never be handed out again. #7456 removed that line from the leak path:references.removeAt(i) - connection.noNewExchanges = true if (references.isEmpty()) {It only bites under concurrency.
closeConnectionsprunes every connection in a single pass but only ever evicts one of them. If there is just one leaked connection it becomes the last allocation, gets marked idle, and the eviction path closes it (and setsnoNewExchanges), which is why the existingleakedAllocationtest still passes. But as soon as you have more than one leaked connection, the ones that did not get evicted are left withnoNewExchanges == false, sitting idle in the pool, still holding the abandoned response's unread bytes on the socket.isEligible()says yes andisHealthy()says yes (leftover readable bytes are not EOF), so the next call grabs that dirty connection and reads the stale bytes as its status line.Here is the lifecycle of the connection that got stapled, from a quick instrumented run:
The fix
Put the line back, so any leaked connection that gets pruned is retired right away:
references.removeAt(i) + connection.noNewExchanges = true if (references.isEmpty()) {That matches 4.12.0 and the other
noNewExchanges = truespots inconnectionBecameIdleandevictAll. I left out theconnectionListener.noNewExchanges(connection)call on purpose.pruneAndGetAllocationCountruns while the connection lock is held (it is called from insidecloseConnections), and everywhere else in this file the listener events are fired outside the lock. If you would rather have the leak path show up on the listener too, I can work the event out under the lock and fire it fromcloseConnectionsafter the prune. Happy to go either way.Test
I added
ConnectionPoolTest.prunedLeakedConnectionIsRetiredEvenWhenNotEvicted. It follows the existingleakedAllocationtest (sameallocateAndLeakAllocationhelper andawaitGarbageCollection()), but leaks two connections so a single cleanup pass leaves a survivor that was not evicted yet still has to be retired.It fails reliably without the fix (ran it four times, failed every time) and passes with it. The rest of
:okhttp:jvmTestis green. The only failures I saw were theFastFallbackTestcases that need IPv6 dual stack, and those fail the same way on a clean checkout.