fix(google): use realtime input for Gemini 3.1 generateReply #1199
rahulmanuwas wants to merge 1 commit into livekit:main
🦋 Changeset detected
Latest commit: 81139ae
The changes in this PR will be included in the next version bump. This PR includes changesets to release 22 packages.
Historical note from April 3, 2026. This comment reflects an earlier draft state before the later fixes in this PR branch and is kept only as debugging context. The current branch-state summary is the newer comment on this PR. At the time of this note:
That earlier failure is why the branch later moved away from the 2.5-style mid-session |
421ac56 to b575539
Historical note from early April 4, 2026. This comment reflects an intermediate branch state before the later fixes in this PR branch and is kept only as debugging context. The current branch-state summary is the newer comment on this PR. At the time of this note, a docs-aligned Gemini 3.1 history-seeding attempt still failed, which narrowed the problem away from just |
Maintainer note: this comment is the current branch-state summary and supersedes the earlier April 3/4 exploratory notes below. Head: Current scope of this PR:
Validation for the code on this branch:
Manual runtime status for
So the branch is now intentionally narrower than the earlier spike comments: it fixes |
030ffbb to 81139ae
After livekit/agents#5413 merged on April 11, 2026, I checked From what I can tell, #1229 is a narrower draft mirror of the Python change. That makes sense as a parity step, but it does not appear to address the #1199 was intended to cover that remaining JS gap: switch Gemini 3.1 continuation to If you'd prefer this split differently for review, I'm happy to do that. I can break #1199 into a narrower parity PR and a follow-up PR for the |
Summary
- Fix `generateReply()` for `gemini-3.1-flash-live-preview` by avoiding mid-session `sendClientContent(...)`
- Use `sendRealtimeInput({ text: instructions ?? "." })` for Gemini 3.1 continuation instead
- Keep `function_call_output` delivery by still sending restricted-model tool results as `tool_response`
- `@livekit/agents-plugin-google`
Problem
`generateReply()` was using a Gemini 2.5-style continuation flow based on mid-session `sendClientContent(...)`. That path works on 2.5, but `gemini-3.1-flash-live-preview` rejects it with `Request contains an invalid argument.`
What This PR Changes
For Gemini 3.1, `generateReply()` now triggers generation with `sendRealtimeInput(...)` instead of synthesizing a fake `clientContent` turn.
`updateChatCtx()` is split intentionally for restricted models:
- `function_call_output` items still go out as `tool_response`
- `content` events are dropped as a safety net
Validation
- `pnpm exec vitest run plugins/google/src/beta/realtime/realtime_api.test.ts`
- `pnpm --filter @livekit/agents-plugin-google... build`
Manual `generateReply()` results:
- `gemini-3.1-flash-live-preview` with instructions: success
- `gemini-3.1-flash-live-preview` without instructions: success
- `gemini-2.5-flash-native-audio-preview-12-2025` control path: success
Explicit Non-Goals
This PR does not claim full server-side chat-history support for Gemini 3.1 on the current JS SDK path. Initial or updated non-tool history still needs a supported provider/SDK mechanism.
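To make the non-goal concrete, here is a rough sketch of how the `updateChatCtx()` split for restricted models might look. This is not the code on the branch. The item shapes, the `ToolResponse` type, and the `splitForRestrictedModel` helper are hypothetical names chosen for illustration; only `function_call_output`, `content`, and `tool_response` come from this PR.

```typescript
// Hypothetical item shapes, assumed for this sketch only.
type ChatItem =
  | { type: "function_call_output"; callId: string; name: string; output: string }
  | { type: "content"; role: "user" | "assistant"; text: string };

// Hypothetical payload shape standing in for a tool_response entry.
interface ToolResponse {
  id: string;
  name: string;
  response: { output: string };
}

// Forward tool results, drop everything else (the "safety net" for restricted models).
function splitForRestrictedModel(items: ChatItem[]): {
  toolResponses: ToolResponse[];
  droppedContent: number;
} {
  const toolResponses: ToolResponse[] = [];
  let droppedContent = 0;
  for (const item of items) {
    if (item.type === "function_call_output") {
      toolResponses.push({
        id: item.callId,
        name: item.name,
        response: { output: item.output },
      });
    } else {
      // Non-tool history is not sent mid-session on the restricted path.
      droppedContent += 1;
    }
  }
  return { toolResponses, droppedContent };
}

const splitResult = splitForRestrictedModel([
  { type: "content", role: "user", text: "What is the weather?" },
  { type: "function_call_output", callId: "call-1", name: "getWeather", output: "sunny" },
  { type: "content", role: "assistant", text: "It is sunny." },
]);
console.log(splitResult.toolResponses.length, splitResult.droppedContent);
```

In this sketch the dropped items are only counted, which mirrors the point above: non-tool history is not delivered on this path and would need a separate supported mechanism.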
Refs #1197.
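For reviewers who want the shape of the `generateReply()` change at a glance, a minimal sketch follows. It is not the plugin's implementation. `LiveSessionLike`, `isRestrictedModel`, and `triggerReply` are hypothetical names, the model check by name prefix is an assumption, and the payload used on the 2.5 path is only a guess at what a synthesized `clientContent` turn looks like. The two method names and the `instructions ?? "."` fallback are taken from the description above.

```typescript
// Hypothetical stand-in for the live session; only the two method names come from this PR.
interface LiveSessionLike {
  sendClientContent(payload: {
    turns: { role: string; parts: { text: string }[] }[];
    turnComplete: boolean;
  }): void;
  sendRealtimeInput(payload: { text: string }): void;
}

// Assumption: restricted models are detected with a simple name prefix check.
function isRestrictedModel(model: string): boolean {
  return model.startsWith("gemini-3.1");
}

function triggerReply(session: LiveSessionLike, model: string, instructions?: string): void {
  if (isRestrictedModel(model)) {
    // Gemini 3.1: no mid-session clientContent; nudge generation with realtime text.
    session.sendRealtimeInput({ text: instructions ?? "." });
    return;
  }
  // Gemini 2.5-style continuation (payload shape assumed for illustration).
  session.sendClientContent({
    turns: [{ role: "user", parts: [{ text: instructions ?? "." }] }],
    turnComplete: true,
  });
}

// Recording fake so the routing can be observed without a network connection.
const calls: string[] = [];
const fakeSession: LiveSessionLike = {
  sendClientContent: () => {
    calls.push("clientContent");
  },
  sendRealtimeInput: (payload) => {
    calls.push(`realtimeInput:${payload.text}`);
  },
};

triggerReply(fakeSession, "gemini-3.1-flash-live-preview");
triggerReply(fakeSession, "gemini-3.1-flash-live-preview", "Greet the user");
triggerReply(fakeSession, "gemini-2.5-flash-native-audio-preview-12-2025", "Greet the user");
console.log(calls);
```

The fake session mirrors the three manual cases listed under Validation: 3.1 without instructions, 3.1 with instructions, and the 2.5 control path.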