Skip to content

Perf: medium benchmark ~52% slower since d96cfed (regression lives mostly in @query-doctor/core 0.7.0→0.7.2) #115

@veksen

Description

@veksen

TL;DR

Between d96cfed (2026-03-30) and 9b7973e (2026-04-17, HEAD), the query
optimization loop got ~52% slower on a synthetic medium workload
(20 tables, 100 queries): 1,152ms → 1,753ms mean.

~85% of the regression lives inside @query-doctor/core, not in this repo.
The analyzer's own code only contributes ~5%.

Benchmark file added in branch veksen/add-benchmarks at
src/remote/optimizer.bench.ts (vitest bench).

Data (medium bench, 5 iterations + 2 warmup)

Commit Date Core Mean (ms) RME vs baseline
d96cfed 2026-03-30 0.6.0 1,152 ±1.85% baseline
78a3d81 2026-03-31 0.7.0 1,184 ±1.46% +3%
00be1d0 2026-04-01 0.7.0 1,159 ±1.71% +1%
22102cf 2026-04-07 0.7.1 FAIL broken
e529e95 2026-04-08 0.7.1 FAIL broken
148b438 2026-04-09 0.7.2 1,565 ±2.06% +36%
f80156a 2026-04-09 0.7.2 1,575 ±1.96% +37%
b912a45 2026-04-09 0.8.0 1,669 ±5.10% +45%
fb456b2 2026-04-09 0.8.0 1,645 ±2.03% +43%
5b05e3e 2026-04-09 0.8.0 1,728 ±3.92% +50%
7b0a1b9 2026-04-17 0.8.0 1,750 ±1.70% +52%
9b7973e 2026-04-17 0.8.1 1,753 ±3.24% +52%

Breakdown — four regression steps

1. core 0.7.0 → 0.7.2 (biggest, +33%)

Commit 148b438 is a pure package.json version bump. +406ms of latency lives
inside @query-doctor/core. We have no 0.7.1 data point because core 0.7.1
crashes with invalid input syntax for type oid: \"\" (see below) — fixed in
0.7.2.

2. core 0.7.2 → 0.8.0 (+6%)

Commit b912a45, also a pure version bump. Another ~95ms inside core.

3. fix: remove relpages reference (+5%) — this repo

Commit 5b05e3e.
Three-line change removing relpages: 1 from
QueryOptimizer.defaultStatistics and relpages: 200_000 from runner.ts.

 private static readonly defaultStatistics: StatisticsMode = {
   kind: \"fromAssumption\",
-  relpages: 1,
   reltuples: 10_000,
 };

The bench passes its own fromStatisticsExport stats (with computed
relpages), so defaultStatistics is not used directly. The slowdown
therefore implies core 0.8.0 is either ignoring caller-provided relpages or
has a different cost estimation path when the default is missing. Worth
verifying.

4. core 0.8.0 patches + 0.8.1 (~flat)

7b0a1b9 and 9b7973e are within measurement noise of each other.

Side issue: core 0.7.1 is broken

On core 0.7.1 (22102cf, e529e95), every optimization fails with:

Error with optimization error: invalid input syntax for type oid: \"\"

Already fixed in 0.7.2, but worth a note: if anyone pinned 0.7.1 in a
downstream project, they have a hard failure, not a perf regression.

Suggested next steps

  1. Diff @query-doctor/core 0.7.0..0.7.2 — this is the biggest chunk
    (+33%) and lives outside this repo. Likely hot paths:
    IndexOptimizer.run(), Statistics.dumpStats(),
    Statistics.getExistingIndexes(). Enabling the core's built-in
    IndexOptimizer({ trace: true }) on the bench at 0.7.0 vs 0.7.2 and
    diffing the trace would pinpoint what changed.
  2. Investigate the relpages interaction in core 0.8.x. The bench
    provides relpages in inline stats but still slows down. Is the provided
    value being ignored, or does the defaults-path do something extra that the
    benchmark inadvertently hit?
  3. Wire the benchmark into CI via CodSpeed. @codspeed/vitest-plugin
    wraps vitest bench and flags regressions on PRs in a hardware-normalized
    environment. Given ~85% of the regression lived in a dependency bump, we
    would have caught this on the PR that did chore: bump core to 0.7.2.

How to reproduce

# benchmark branch
git checkout veksen/add-benchmarks
npm install
npm run bench   # small (3t/5q) / medium (20t/100q) / large (300t/1000q)

For bisecting between commits, the branch also has .context/run_bisect_bench.sh
as a template. It creates a git worktree per commit, runs the medium bench,
and writes results to .context/bench_bisect_results.csv.

Methodology note: a first pass with 3 iterations + 1 warmup produced RME

80% on the medium bench and misleading means (the large bench at HEAD looked
3x slower when it wasn't). Use ≥5 iterations + 2 warmup to get RME in the
±2-5% range.

Environment

  • macOS (Darwin 24.6.0), Apple Silicon
  • postgres:17 via @testcontainers/postgresql
  • vitest 4.0.18 (bench, powered by tinybench)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions