Skip to content

fix(gotrue): prevent stale token refresh from overwriting concurrent session changes#1351

Open
brunovsiqueira wants to merge 4 commits intosupabase:mainfrom
brunovsiqueira:fix/gotrue-token-refresh-races
Open

fix(gotrue): prevent stale token refresh from overwriting concurrent session changes#1351
brunovsiqueira wants to merge 4 commits intosupabase:mainfrom
brunovsiqueira:fix/gotrue-token-refresh-races

Conversation

@brunovsiqueira
Copy link
Copy Markdown
Contributor

@brunovsiqueira brunovsiqueira commented Apr 13, 2026

Summary

Fixes race conditions in GoTrueClient._callRefreshToken() where an in-flight token refresh could overwrite a newer session established by a concurrent signIn, signOut, or setSession call.

  • Session version guard: _saveSession / _removeSession now bump a monotonic _sessionVersion counter. After the refresh network round-trip completes, _executeRefresh checks whether the version changed — if so, the stale result is discarded instead of overwriting the newer session.
  • Token-aware dedup: the old Completer pattern ignored the refreshToken parameter — _callRefreshToken(tokenB) would silently return tokenA's in-flight future. Now, same-token calls de-duplicate correctly while different-token calls are serialized via a Future chain (same pattern as the lifecycle reconnection in fix(supabase_flutter): simplify lifecycle reconnection with serial Future chain #1340).
  • TOCTOU fix in _autoRefreshTokenTick: reads _currentSession into a local variable once so that refreshToken and expiresAt come from the same snapshot.
  • Defensive error handling: a failed refresh no longer calls _removeSession() + emits signedOut if the session was replaced by a concurrent operation while the refresh was in-flight. All completer.complete/completeError calls are guarded with !completer.isCompleted for dispose safety.
  • Dispose guard: _isDisposed flag prevents _executeRefresh from mutating state or emitting events on closed stream controllers after dispose().

Test results: before vs after

All 8 race condition tests were run against main (old code) and this branch. 7 of 8 fail on main, all 8 pass after the fix:

Test Before (main) After (this PR) Failure on main
signOut during in-flight refresh does not restore session Session restored after signOut (expected null, got Session)
signIn during in-flight refresh preserves new session Old refresh overwrote new session
concurrent refresh with same token deduplicates (already worked)
concurrent refresh with different tokens serializes Second token silently dropped (1 request instead of 2)
A-B-A pattern deduplicates correctly Second token silently dropped
dispose completes active refresh with error Timed out (30s) — completer never resolved
refresh failure does not sign out if session changed Session nulled despite concurrent signIn
refresh failure signs out when session unchanged signedOut event not emitted

Related issues

Test plan

  • 8 race condition tests in packages/gotrue/test/src/token_refresh_race_test.dart
  • All existing gotrue unit tests pass
  • All supabase_flutter tests pass
  • dart analyze clean (only pre-existing deprecation infos)

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses race conditions in packages/gotrue token refresh flow where an in-flight refresh could overwrite a newer session established concurrently (e.g., via signIn, signOut, or setSession), and improves refresh de-duplication behavior.

Changes:

  • Add a monotonic session version counter to detect and discard stale refresh results.
  • Replace the previous global refresh completer with a same-token de-dup + different-token serialization mechanism.
  • Add a new test suite covering common token refresh race scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
packages/gotrue/lib/src/gotrue_client.dart Introduces session version guarding and a serialized refresh queue with same-token de-duplication.
packages/gotrue/test/src/token_refresh_race_test.dart Adds unit tests to validate refresh concurrency behavior and stale-result handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/gotrue/test/src/token_refresh_race_test.dart
Comment thread packages/gotrue/lib/src/gotrue_client.dart Outdated
Comment thread packages/gotrue/lib/src/gotrue_client.dart
Comment thread packages/gotrue/lib/src/gotrue_client.dart Outdated
Comment thread packages/gotrue/lib/src/gotrue_client.dart
@Vinzent03
Copy link
Copy Markdown
Collaborator

I'm starting to think if this is the correct solution to fix this. Couldn't we more adopt how auth-js does this? I haven't looked too much into their solution, but they are locking their session, which could be a more generally solution.

@grdsdev
Copy link
Copy Markdown
Contributor

grdsdev commented Apr 22, 2026

@Vinzent03 JS also have many issues related to token refresh, so JS may not be the right solution, I think both needs to rethink how token refresh is implemented, not saying this is the right approach, I didn't have enough time yet to review it.

- Replace single _activeRefresh record with Map<String, Completer> to
  fix A-B-A dedup bug: refresh(A) → refresh(B) → refresh(A) now
  correctly deduplicates the second A call instead of enqueuing a third
  network request.

- Bump _sessionVersion in updateUser and setInitialSession, which were
  writing _currentSession directly and bypassing the version guard.

- Add _isDisposed flag checked in _executeRefresh to prevent mutating
  state or emitting events on closed stream controllers after dispose().

- Pass stack traces to completeError for better debugging.

- Add A-B-A dedup test case.
@grdsdev
Copy link
Copy Markdown
Contributor

grdsdev commented Apr 24, 2026

@brunovsiqueira thanks a lot for your contribution, those are a lot of complex changes, and really hard scenarios to manually test.

Did you face all the issues this PR claims to fix when using the library?

Also, can you fix CI by formatting the test file, that seems like the only issue.

@brunovsiqueira
Copy link
Copy Markdown
Contributor Author

brunovsiqueira commented Apr 24, 2026

Hey @grdsdev
I didnt experience the issues myself. It was done as a follow up from #1340 where @icnahom suggested applying the same serial queue pattern to token refresh, and the race conditions matched symptoms reported in #1158.

I added unit tests to reproduce the race conditions, then did the refactor, and the race conditions tests got fixed. Thats how I "tested" it, but I didnt experience them myself.

Fixed the test file formatting in ad7c342.

@icnahom
Copy link
Copy Markdown
Contributor

icnahom commented Apr 24, 2026

Hey @grdsdev I didnt experience the issues myself. It was done as a follow up from #1340 where @icnahom suggested applying the same serial queue pattern to token refresh, and the race conditions matched symptoms reported in #1158.

Not familiar with the token refresh issue, but I can investigate.

@brunovsiqueira
Copy link
Copy Markdown
Contributor Author

Sorry, it was @grdsdev who suggested applying the same pattern to token refresh 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Auto Refresh is not working

5 participants