[pull] main from danny-avila:main#71
Merged
pull[bot] merged 2 commits intoinnFactory:mainfrom Apr 29, 2026
Merged
Conversation
…nting (#122) * fix: filter deferred/programmatic tools out of toolSchemaTokens accounting calculateInstructionTokens summed every entry in this.tools regardless of whether it would be bound to the model. getToolsForBinding applies filterToolsForBinding, which excludes tools whose toolRegistry entry has defer_loading: true (and isn't yet discovered) or whose allowed_callers exclude 'direct'. The accounting path skipped that filter, so toolSchemaTokens reported the worst-case ceiling instead of the bound subset. At low maxContextTokens this triggered spurious empty_messages preflight rejections (e.g. 47705 reported tokens vs ~13K actually sent), with an error message that misled operators into thinking they had to trim MCP servers when the real prompt fit comfortably. Mirror the binding-path filter on this.tools inside calculateInstructionTokens. graphTools (handoff/subagent) are framework-managed and remain unfiltered to match getToolsForBinding. Fixes #121 * fix: skip instance-tool filter in event-driven mode for token accounting Codex review on #122 flagged that the previous fix filtered `this.tools` unconditionally inside calculateInstructionTokens. That diverges from getEventDrivenToolsForBinding, which appends `this.tools` UNFILTERED when toolDefinitions are present — so instance tools that have a deferred or non-direct registry entry would still be bound to the model but no longer counted, under-estimating instruction overhead. Mirror getToolsForBinding's branching exactly: only run filterToolsForBinding on `this.tools` in the non-event-driven path. In event-driven mode, deferred tools are represented through toolDefinitions and counted via getActiveToolDefinitions — `this.tools` must stay unfiltered to match what is bound. Add a regression test covering the event-driven branch. * fix: align getTokenBudgetBreakdown toolCount with bound tool subset Codex review on #122 flagged that getTokenBudgetBreakdown still computed toolCount from raw `this.tools.length + getActiveToolDefinitions().length`, so the diagnostic reported an inflated tool count whenever instance tools were deferred or non-direct in the registry — even though the preceding fix made toolSchemaTokens accurate. The "Tool definitions consume X tokens across N tools" empty_messages error in Graph.ts:987 is exactly the operator-confusion this PR is meant to eliminate, and leaving N inflated half-defeats the fix. Derive toolCount from `getToolsForBinding()?.length`. That's the same source `calculateInstructionTokens` now mirrors, so toolSchemaTokens and toolCount stay aligned in all paths (non-event-driven, event-driven, with or without graphTools). Add a regression test mirroring the existing toolDefinitions one for the instance-tools path. * refactor: single-source-of-truth for instance tool selection + filter programmatic-only toolDefinitions in accounting Addresses follow-up review findings on #122: 1. Extract `getEffectiveInstanceTools()` as the single source of truth for "which entries of `this.tools` are bound to the model". All three sites — `getToolsForBinding`, `getEventDrivenToolsForBinding`, and `calculateInstructionTokens` — now derive from it. Previously, the event-driven gate and the registry-filter decision were duplicated across `calculateInstructionTokens` and `getToolsForBinding`, with no structural mechanism preventing future drift. The original miscount bug was caused by exactly that drift. 2. `getActiveToolDefinitions` now also filters by `allowed_callers`, matching the gate `getEventDrivenToolsForBinding` applies. Previously, programmatic-only definitions (e.g. `allowed_callers: ['code_execution']`) were never bound to the model but still inflated `toolSchemaTokens` — the same class of accounting/binding mismatch this PR fixes for the `this.tools` path. Adds a regression test mirroring the existing deferred-toolDefinitions one. 3. The JSDoc rewrite for the new helper eliminates the `getEventDriven-/ToolsForBinding` line-wrapping that broke search. `getEventDrivenToolsForBinding` now also reuses `getActiveToolDefinitions()` instead of inlining the same filter. * test: cover toolCount alignment with graphTools Follow-up review on #122 noted that the prior commit's switch from `(this.tools?.length ?? 0) + getActiveToolDefinitions().length` to `getToolsForBinding()?.length` is a behavioral improvement (graphTools are now reflected in toolCount, matching what's already counted in toolSchemaTokens) but had no test coverage. A regression that dropped graphTools from getToolsForBinding's return value would have gone undetected. Adds a regression test asserting toolCount includes graphTools.
…tings (#120) * feat: distinguish inherited inputs from generated outputs in tool listings When the codeapi sandbox echoes back unchanged input passthroughs (skill files, downloaded inputs whose hash matched, inherited .dirkeep markers), it now flags them with `inherited: true` on the FileRef. The post-execution formatter previously rendered every entry under a single "Generated files:" header with the message "File is already downloaded by the user", which led the LLM to: 1. Report skill files as if it had just generated them ("here are the files I produced: ...") even when the run only referenced them. 2. Invent fictional source paths like /mnt/user-data/uploads/pptx/... trying to find the "originals" of files it thought it had created. Fix: split the listing into two sections. Generated files: - /mnt/data/report.pdf | File is already downloaded by the user ... Available files (inputs, not generated by this execution): - /mnt/data/pptx/SKILL.md | Available as an input — already known to the user ... A trailing note clarifies that "Available files" entries are inputs the user (or a skill) already provided and should not be presented as new outputs. Changes: - `src/types/tools.ts`: `FileRef` gains optional `inherited?: true` field, mirroring the codeapi sandbox response shape. - `src/tools/CodeExecutor.ts`: extract a shared `renderFileSection` helper (handles header, image-vs-other message, and joining). Update the formatter to split inherited from generated files. Export the helper for reuse. - `src/tools/BashExecutor.ts`: import `renderFileSection` from CodeExecutor; same split logic. - `src/tools/ProgrammaticToolCalling.ts`: same split logic in `formatCompletedResponse`. - `src/tools/__tests__/ProgrammaticToolCalling.test.ts`: two new cases — split section ordering with mixed files, and inherited-only case (no Generated files header emitted). Existing 69 tests still pass; 71/71 with the new cases. Pre-existing LLM-spec failures on main (anthropic/bedrock/openai SDK type drift) are not affected. * v3.1.74
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )