fix: report unique queries (matches site) instead of log-entry count in CI comment#107
fix: report unique queries (matches site) instead of log-entry count in CI comment#107
Conversation
There was a problem hiding this comment.
Query Doctor Analysis
2 queries analyzed
2 pre-existing issues
SELECT "guests"."id", "guests"."session_id", "guests"."username", "guests"."avatar_path", "guests"."color", "guests"."side", "guests"."audio_recording_path", "guests"."audio_recording_public", "gue...
indexassets(event_id, inserted_at desc)
cost 31,003,449 → 1,493 (100% reduction)SELECT * FROM guest_ip_addresses WHERE ip_address = '127.0.0.1';
indexguest_ip_addresses(ip_address)
cost 154,402 → 8 (100% reduction)
Using assumed statistics (10000000 rows/table). For better results, sync production stats.
4825d2a to
2d5aaa4
Compare
3aac4a7 to
7b0a1b9
Compare
The `total` counter was incrementing during log parsing, counting raw entries before deduplication and optimization. This caused the CI comment to show a different query count than the site, which only displays queries with uploadable states (improvements_available, no_improvement_found, error). Compute `total` from the deduplicated, optimized results filtered to uploadable states so the CI comment and site always agree. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract UPLOADABLE_STATES and countUploadableQueries as exported functions so they can be tested. Tests verify: - Only improvements_available, no_improvement_found, error are counted - not_supported, timeout, waiting, optimizing are excluded - countUploadableQueries agrees with buildQueries for every state Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…alyzed Drop UPLOADABLE_STATES / countUploadableQueries — they duplicated the filter in buildQueries. Use buildQueries(allResults, config).length directly so the CI comment count is derived from the same list we upload to the site. Rename queryStats.total to queryStats.analyzed across the type, the success template, and test fixtures. Restore the `Matched N unique queries out of M log entries` debug log using recentQueries.length (no separate counter). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9984e70 to
d007327
Compare
Perhaps the text |
Happy to review the text, created follow-up: #120 |
Summary
The CI comment's "X queries analyzed" disagreed with the query list on the site — most dramatically on the sin-bot PR 12 run, which reported 4236 queries analyzed in the PR comment while the site showed 74.
Root cause: the reported count was computed at ingest time from the number of
plan:log entries that passed filtering, not from the number of unique queries we actually analyze. The same 74 distinct queries ran ~57× each across 457 vitest specs, inflating the headline by ~57×.This change:
queryStats.analyzed(renamed fromqueryStats.total) usingbuildQueries(allResults, config).length— the exact same list used to upload to the site, so the headline always matches what the user sees.totalcounter from the ingest loop.countUploadableQueries/UPLOADABLE_STATEShelpers — they duplicated the filter that already lives inbuildQueries.Matched N unique queries out of M log entriesdebug log by usingrecentQueries.length(no extra counter needed).ReportStatistics.total→ReportStatistics.analyzed(docstring: "Number of unique queries analyzed and uploaded to the site"). Consumers updated in the success template and test fixtures.Why this is the right fix
Before deciding on the dedup approach, we added a diagnostic step to sin-bot's CI workflow that split the log by phase and counted
plan:/pg_catalog/information_schema/@qd_introspectionlines. Results on the sin-bot run:plan:entriesdrizzle-kit migratenpm test(457 specs)Key takeaways that shaped this PR:
information_schematraffic reachesplan:entries in either phase.isSystemQuery/information_schemafilter would not have materially moved the number.Given that, delegating to
buildQueriesis strictly better than any heuristic filter: there is one filter definition (insite-api.ts), and the CI comment is guaranteed to report whatever we actually uploaded.Test plan
npx vitest run src/runner.test.ts src/reporters/github/github.test.tspasses (20/20)total→analyzedrenamerunner.test.tsassertsbuildQueries(allResults).lengthcounts exactly the queries reported to the site on a mixed-state inputFollow-up (not in this PR)
.context/query-count-handoff.mdin this workspace and documents the ruled-out hypotheses so we don't retrace them.🤖 Generated with Claude Code