fix(gotrue): prevent stale token refresh from overwriting concurrent session changes#1351
fix(gotrue): prevent stale token refresh from overwriting concurrent session changes#1351brunovsiqueira wants to merge 4 commits intosupabase:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses race conditions in packages/gotrue token refresh flow where an in-flight refresh could overwrite a newer session established concurrently (e.g., via signIn, signOut, or setSession), and improves refresh de-duplication behavior.
Changes:
- Add a monotonic session version counter to detect and discard stale refresh results.
- Replace the previous global refresh completer with a same-token de-dup + different-token serialization mechanism.
- Add a new test suite covering common token refresh race scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/gotrue/lib/src/gotrue_client.dart | Introduces session version guarding and a serialized refresh queue with same-token de-duplication. |
| packages/gotrue/test/src/token_refresh_race_test.dart | Adds unit tests to validate refresh concurrency behavior and stale-result handling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
I'm starting to think if this is the correct solution to fix this. Couldn't we more adopt how auth-js does this? I haven't looked too much into their solution, but they are locking their session, which could be a more generally solution. |
|
@Vinzent03 JS also have many issues related to token refresh, so JS may not be the right solution, I think both needs to rethink how token refresh is implemented, not saying this is the right approach, I didn't have enough time yet to review it. |
- Replace single _activeRefresh record with Map<String, Completer> to fix A-B-A dedup bug: refresh(A) → refresh(B) → refresh(A) now correctly deduplicates the second A call instead of enqueuing a third network request. - Bump _sessionVersion in updateUser and setInitialSession, which were writing _currentSession directly and bypassing the version guard. - Add _isDisposed flag checked in _executeRefresh to prevent mutating state or emitting events on closed stream controllers after dispose(). - Pass stack traces to completeError for better debugging. - Add A-B-A dedup test case.
|
@brunovsiqueira thanks a lot for your contribution, those are a lot of complex changes, and really hard scenarios to manually test. Did you face all the issues this PR claims to fix when using the library? Also, can you fix CI by formatting the test file, that seems like the only issue. |
|
Hey @grdsdev I added unit tests to reproduce the race conditions, then did the refactor, and the race conditions tests got fixed. Thats how I "tested" it, but I didnt experience them myself. Fixed the test file formatting in ad7c342. |
|
Sorry, it was @grdsdev who suggested applying the same pattern to token refresh 😅 |
Summary
Fixes race conditions in
GoTrueClient._callRefreshToken()where an in-flight token refresh could overwrite a newer session established by a concurrentsignIn,signOut, orsetSessioncall._saveSession/_removeSessionnow bump a monotonic_sessionVersioncounter. After the refresh network round-trip completes,_executeRefreshchecks whether the version changed — if so, the stale result is discarded instead of overwriting the newer session.Completerpattern ignored therefreshTokenparameter —_callRefreshToken(tokenB)would silently return tokenA's in-flight future. Now, same-token calls de-duplicate correctly while different-token calls are serialized via a Future chain (same pattern as the lifecycle reconnection in fix(supabase_flutter): simplify lifecycle reconnection with serial Future chain #1340)._autoRefreshTokenTick: reads_currentSessioninto a local variable once so thatrefreshTokenandexpiresAtcome from the same snapshot._removeSession()+ emitssignedOutif the session was replaced by a concurrent operation while the refresh was in-flight. Allcompleter.complete/completeErrorcalls are guarded with!completer.isCompletedfor dispose safety._isDisposedflag prevents_executeRefreshfrom mutating state or emitting events on closed stream controllers afterdispose().Test results: before vs after
All 8 race condition tests were run against
main(old code) and this branch. 7 of 8 fail onmain, all 8 pass after the fix:Related issues
Test plan
packages/gotrue/test/src/token_refresh_race_test.dartdart analyzeclean (only pre-existing deprecation infos)