Skip to content

[pull] main from danny-avila:main#71

Merged
pull[bot] merged 2 commits intoinnFactory:mainfrom
danny-avila:main
Apr 29, 2026
Merged

[pull] main from danny-avila:main#71
pull[bot] merged 2 commits intoinnFactory:mainfrom
danny-avila:main

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented Apr 29, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

…nting (#122)

* fix: filter deferred/programmatic tools out of toolSchemaTokens accounting

calculateInstructionTokens summed every entry in this.tools regardless
of whether it would be bound to the model. getToolsForBinding applies
filterToolsForBinding, which excludes tools whose toolRegistry entry has
defer_loading: true (and isn't yet discovered) or whose allowed_callers
exclude 'direct'. The accounting path skipped that filter, so
toolSchemaTokens reported the worst-case ceiling instead of the bound
subset.

At low maxContextTokens this triggered spurious empty_messages preflight
rejections (e.g. 47705 reported tokens vs ~13K actually sent), with an
error message that misled operators into thinking they had to trim MCP
servers when the real prompt fit comfortably.

Mirror the binding-path filter on this.tools inside
calculateInstructionTokens. graphTools (handoff/subagent) are
framework-managed and remain unfiltered to match getToolsForBinding.

Fixes #121

* fix: skip instance-tool filter in event-driven mode for token accounting

Codex review on #122 flagged that the previous fix filtered `this.tools`
unconditionally inside calculateInstructionTokens. That diverges from
getEventDrivenToolsForBinding, which appends `this.tools` UNFILTERED
when toolDefinitions are present — so instance tools that have a
deferred or non-direct registry entry would still be bound to the model
but no longer counted, under-estimating instruction overhead.

Mirror getToolsForBinding's branching exactly: only run
filterToolsForBinding on `this.tools` in the non-event-driven path.
In event-driven mode, deferred tools are represented through
toolDefinitions and counted via getActiveToolDefinitions — `this.tools`
must stay unfiltered to match what is bound.

Add a regression test covering the event-driven branch.

* fix: align getTokenBudgetBreakdown toolCount with bound tool subset

Codex review on #122 flagged that getTokenBudgetBreakdown still computed
toolCount from raw `this.tools.length + getActiveToolDefinitions().length`,
so the diagnostic reported an inflated tool count whenever instance
tools were deferred or non-direct in the registry — even though the
preceding fix made toolSchemaTokens accurate. The "Tool definitions
consume X tokens across N tools" empty_messages error in Graph.ts:987
is exactly the operator-confusion this PR is meant to eliminate, and
leaving N inflated half-defeats the fix.

Derive toolCount from `getToolsForBinding()?.length`. That's the same
source `calculateInstructionTokens` now mirrors, so toolSchemaTokens
and toolCount stay aligned in all paths (non-event-driven, event-driven,
with or without graphTools). Add a regression test mirroring the
existing toolDefinitions one for the instance-tools path.

* refactor: single-source-of-truth for instance tool selection + filter
programmatic-only toolDefinitions in accounting

Addresses follow-up review findings on #122:

1. Extract `getEffectiveInstanceTools()` as the single source of truth
   for "which entries of `this.tools` are bound to the model". All three
   sites — `getToolsForBinding`, `getEventDrivenToolsForBinding`, and
   `calculateInstructionTokens` — now derive from it. Previously, the
   event-driven gate and the registry-filter decision were duplicated
   across `calculateInstructionTokens` and `getToolsForBinding`, with no
   structural mechanism preventing future drift. The original miscount
   bug was caused by exactly that drift.

2. `getActiveToolDefinitions` now also filters by `allowed_callers`,
   matching the gate `getEventDrivenToolsForBinding` applies. Previously,
   programmatic-only definitions (e.g. `allowed_callers: ['code_execution']`)
   were never bound to the model but still inflated `toolSchemaTokens` —
   the same class of accounting/binding mismatch this PR fixes for the
   `this.tools` path. Adds a regression test mirroring the existing
   deferred-toolDefinitions one.

3. The JSDoc rewrite for the new helper eliminates the
   `getEventDriven-/ToolsForBinding` line-wrapping that broke search.

`getEventDrivenToolsForBinding` now also reuses `getActiveToolDefinitions()`
instead of inlining the same filter.

* test: cover toolCount alignment with graphTools

Follow-up review on #122 noted that the prior commit's switch from
`(this.tools?.length ?? 0) + getActiveToolDefinitions().length` to
`getToolsForBinding()?.length` is a behavioral improvement (graphTools
are now reflected in toolCount, matching what's already counted in
toolSchemaTokens) but had no test coverage. A regression that dropped
graphTools from getToolsForBinding's return value would have gone
undetected.

Adds a regression test asserting toolCount includes graphTools.
…tings (#120)

* feat: distinguish inherited inputs from generated outputs in tool listings

When the codeapi sandbox echoes back unchanged input passthroughs
(skill files, downloaded inputs whose hash matched, inherited
.dirkeep markers), it now flags them with `inherited: true` on the
FileRef. The post-execution formatter previously rendered every entry
under a single "Generated files:" header with the message "File is
already downloaded by the user", which led the LLM to:

1. Report skill files as if it had just generated them ("here are the
   files I produced: ...") even when the run only referenced them.
2. Invent fictional source paths like /mnt/user-data/uploads/pptx/...
   trying to find the "originals" of files it thought it had created.

Fix: split the listing into two sections.

  Generated files:
    - /mnt/data/report.pdf | File is already downloaded by the user
    ...

  Available files (inputs, not generated by this execution):
    - /mnt/data/pptx/SKILL.md | Available as an input — already known to the user
    ...

A trailing note clarifies that "Available files" entries are inputs
the user (or a skill) already provided and should not be presented as
new outputs.

Changes:

- `src/types/tools.ts`: `FileRef` gains optional `inherited?: true`
  field, mirroring the codeapi sandbox response shape.

- `src/tools/CodeExecutor.ts`: extract a shared `renderFileSection`
  helper (handles header, image-vs-other message, and joining). Update
  the formatter to split inherited from generated files. Export the
  helper for reuse.

- `src/tools/BashExecutor.ts`: import `renderFileSection` from
  CodeExecutor; same split logic.

- `src/tools/ProgrammaticToolCalling.ts`: same split logic in
  `formatCompletedResponse`.

- `src/tools/__tests__/ProgrammaticToolCalling.test.ts`: two new
  cases — split section ordering with mixed files, and inherited-only
  case (no Generated files header emitted).

Existing 69 tests still pass; 71/71 with the new cases. Pre-existing
LLM-spec failures on main (anthropic/bedrock/openai SDK type drift)
are not affected.

* v3.1.74
@pull pull Bot locked and limited conversation to collaborators Apr 29, 2026
@pull pull Bot added the ⤵️ pull label Apr 29, 2026
@pull pull Bot merged commit 7993680 into innFactory:main Apr 29, 2026
1 check passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant