Skip to content

fix: recalculate isLatest after status changes (#1081)#1201

Merged
rdimitrov merged 3 commits intomainfrom
fix/1081-recalculate-latest-on-status-change
Apr 25, 2026
Merged

fix: recalculate isLatest after status changes (#1081)#1201
rdimitrov merged 3 commits intomainfrom
fix/1081-recalculate-latest-on-status-change

Conversation

@rdimitrov
Copy link
Copy Markdown
Member

@rdimitrov rdimitrov commented Apr 24, 2026

Summary

Fixes #1081. When the version flagged is_latest was soft-deleted, no other version was promoted, and because GetCurrentLatestVersion didn't filter deleted rows, subsequent publishes with a lower semver never became latest — leaving /v0/servers/{name}/versions/latest permanently 404.

This PR has two parts:

1. Code fix — recalculate is_latest on every status change

  • Recalculate is_latest after every status mutation (UpdateServerStatus, UpdateAllVersionsStatus, and the inline status change in UpdateServer), in the same transaction that already holds the per-server advisory lock.
  • Winner selection reuses the existing CompareVersions logic: highest non-deleted version gets the flag; if every version is deleted, the highest deleted version keeps it so admin lookups via includeDeleted=true still resolve the server (important for restore flows).
  • New SetLatestVersion DB primitive issues two statements (clear old, set new) — idx_unique_latest_per_server is a non-deferrable unique partial index, so a single UPDATE that flips two rows trips the constraint.

2. One-shot migration — heal servers already affected (no manual SQL needed)

014_heal_is_latest.sql promotes the highest non-deleted version on every server whose is_latest flag is on a deleted row (or absent entirely). Picker parses major.minor.patch numerically and falls back to published_at for non-semver or as a tiebreaker. Heals io.github.containers/kubernetes-mcp-server to 0.0.61 automatically and catches any other servers silently affected by the original bug, without hardcoded version numbers.

Picker behavior vs. CompareVersions:

  • Standard M.m.p semver: matches exactly, including the backport-after-delete case (e.g. published 1.0.1 then 1.0.0 hotfix after deleting 2.0.0 → picks 1.0.1, not 1.0.0).
  • Non-semver: falls through to published_at DESC. Reasonable.
  • Prereleases (1.0.0-alpha vs 1.0.0): both parse to (1,0,0), tiebreak by published_at. Diverges from CompareVersions (which always says 1.0.0 > 1.0.0-alpha). Edge case requires (a) server hit by this bug AND (b) prereleases at the top of the version list. Acceptable; the publisher can flip it with one mcp-publisher status call (which now triggers proper recalc thanks to part 1).

Verification

End-to-end curl repro of the exact scenario from the issue, run against both main and this branch:

Before (main):

publish 1.0.0 → 200
publish 0.0.59 → 200
PATCH .../1.0.0/status {deleted} → 200
GET /versions/latest → 404            ← bug
publish 0.0.60 → 200
GET /versions/latest → 404            ← bug persists, 0.0.60 has isLatest=False

After (this branch):

publish 1.0.0 → 200
publish 0.0.59 → 200
PATCH .../1.0.0/status {deleted} → 200
GET /versions/latest → 200 (0.0.59, isLatest=true)
publish 0.0.60 → 200
GET /versions/latest → 200 (0.0.60, isLatest=true)

Test plan

  • Service tests covering: soft-delete latest → next highest active promoted; the exact isLatest not recalculated when latest version is deleted; new versions below deleted latest never promoted #1081 repro end-to-end through the service; all-versions-deleted still keeps highest addressable for admin; restore-to-higher-version reclaims latest.
  • Migration tests covering: kubernetes-mcp-server scenario (heals to highest semver); backport scenario (semver beats most-recent-published); no-is_latest-row defensive case; all-deleted server left untouched; healthy server left untouched; non-semver versions fall back to published_at.
  • Full unit test suite (./internal/... ./cmd/...) passes against a real Postgres.
  • make lint (0 issues) and make validate pass.
  • Manual curl repro against a running registry confirms the fix resolves the failure mode and main still reproduces it.

🤖 Generated with Claude Code

When the version marked is_latest was soft-deleted, no other version was
promoted, and because GetCurrentLatestVersion did not filter deleted rows,
subsequent publishes with lower semver never became latest either —
leaving /versions/latest permanently 404.

Recalculate is_latest after every status mutation in the same transaction
(already under the per-server advisory lock). Pick the highest non-deleted
version; fall back to highest deleted when every version is gone so the
server remains addressable via includeDeleted=true lookups used by admin
and restore flows.

The new SetLatestVersion primitive issues two statements (clear then set)
because idx_unique_latest_per_server is a non-deferrable unique partial
index — Postgres checks it row-by-row within a single UPDATE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@manusa
Copy link
Copy Markdown

manusa commented Apr 25, 2026

Follow-up (not in this PR)

One-off SQL is still needed after deploy to heal io.github.containers/kubernetes-mcp-server — clear is_latest on the deleted 1.0.0 and set it on 0.0.59 (as requested in the issue). Requires a maintainer with prod DB access.

Please note that the latest kubernetes-mcp-server is now 0.0.61. The SQL query should go to the latest published release instead (At this time the one I mentioned. Thx!

Copy link
Copy Markdown
Member

@tadasant tadasant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One-off SQL is still needed after deploy to heal io.github.containers/kubernetes-mcp-server — clear is_latest on the deleted 1.0.0 and set it on 0.0.59 (as requested in the issue). Requires a maintainer with prod DB access.

Can we do this in a db migration file so it gets automatically applied on deploy? Written in a way that isn't hardcoded to the version numbers.

rdimitrov and others added 2 commits April 25, 2026 11:23
Migration 014 promotes the highest non-deleted version on every server whose
is_latest flag is on a deleted row (or absent entirely). The picker parses
major.minor.patch numerically and falls back to published_at for non-semver
or as a tiebreaker, mirroring CompareVersions for standard semver and giving
a sensible answer otherwise.

This auto-heals io.github.containers/kubernetes-mcp-server (latest 0.0.61)
and any other servers silently affected by the original bug, without
hardcoded version numbers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rdimitrov
Copy link
Copy Markdown
Member Author

@tadasant good call — added migration 014_heal_is_latest.sql that auto-heals affected servers on deploy, no hardcoded versions. Identifies servers where the only is_latest=true row is deleted (and the defensive case where no row is flagged at all), parses major.minor.patch numerically, falls back to published_at for non-semver. Tests cover six scenarios including the kubernetes-mcp-server case and a backport-after-delete edge case where the simpler "most recently published" picker would have picked wrong. Updated PR description with the details.

@manusa thanks for the heads-up on 0.0.61 being current. The migration heals to the highest non-deleted semver dynamically, so it'll pick whatever is highest at deploy time — no version number hardcoded.

@rdimitrov
Copy link
Copy Markdown
Member Author

End-to-end validation against a local registry

To be confident this won't put prod in a bad state, ran the migration against synthetic data that mirrors the affected scenarios — using the unpatched main build to seed the broken state (so we know the bug is reproduced authentically), then booting the patched build (which runs migration 014 on startup, exactly as it will in prod).

Scenarios seeded via the public API on unpatched main

# Server Setup Expected post-migration
A k8s-style publish 1.0.0, 0.0.50, 0.0.51, 0.0.59, 0.0.60, 0.0.61; delete 1.0.0 0.0.61 promoted
B backport-after-delete publish 2.0.0; delete 2.0.0; publish 1.0.1, 1.0.0 1.0.1 promoted (semver beats publish-time)
C simple delete-latest publish 1.0.0, 2.0.0; delete 2.0.0 1.0.0 promoted
D healthy publish 1.0.0, 2.0.0 unchanged
E all-deleted publish 1.0.0, 2.0.0; delete both unchanged (admin lookups still resolve)
F deprecated-latest publish 1.0.0, 2.0.0; deprecate 2.0.0 unchanged (deprecated counts as non-deleted)
G non-semver publish build-a, build-b, build-c; delete build-c build-b promoted (publish-time fallback)

Pre-migration snapshot (after seeding on unpatched build)

all-deleted          1.0.0   deleted   f
all-deleted          2.0.0   deleted   t
backport             1.0.0   active    f
backport             1.0.1   active    f
backport             2.0.0   deleted   t   ← bug: is_latest stranded on deleted row
deprecated-latest    1.0.0   active    f
deprecated-latest    2.0.0   deprecated t
healthy              1.0.0   active    f
healthy              2.0.0   active    t
k8s-style            0.0.50  active    f
k8s-style            0.0.51  active    f
k8s-style            0.0.59  active    f
k8s-style            0.0.60  active    f
k8s-style            0.0.61  active    f
k8s-style            1.0.0   deleted   t   ← bug
nonsemver            build-a active    f
nonsemver            build-b active    f
nonsemver            build-c deleted   t   ← bug
simple               1.0.0   active    f
simple               2.0.0   deleted   t   ← bug

Switched to fix branch and rebooted — migration 014 runs on startup

2026/04/25 18:48:45 Applied migration 14: 014_heal_is_latest
2026/04/25 18:48:45 All migrations applied successfully

Diff (post − pre)

 backport             1.0.0   active    f
-backport             1.0.1   active    f
-backport             2.0.0   deleted   t
+backport             1.0.1   active    t
+backport             2.0.0   deleted   f
 ...
-k8s-style            0.0.61  active    f
-k8s-style            1.0.0   deleted   t
+k8s-style            0.0.61  active    t
+k8s-style            1.0.0   deleted   f
 nonsemver            build-a active    f
-nonsemver            build-b active    f
-nonsemver            build-c deleted   t
-simple               1.0.0   active    f
-simple               2.0.0   deleted   t
+nonsemver            build-b active    t
+nonsemver            build-c deleted   f
+simple               1.0.0   active    t
+simple               2.0.0   deleted   f

healthy, deprecated-latest, and all-deleted are byte-identical pre/post — only the broken servers were touched.

Public API checks post-migration

k8s-style              200 0.0.61    isLatest=True
backport               200 1.0.1     isLatest=True
simple                 200 1.0.0     isLatest=True
healthy                200 2.0.0     isLatest=True
deprecated-latest      200 2.0.0     isLatest=True
nonsemver              200 build-b   isLatest=True
all-deleted            404                              ← public correctly 404s
all-deleted (?include_deleted=true)  200 2.0.0  isLatest=True   ← admin still resolves

Why this is safe for prod

  • The HAVING clause in the migration matches only servers where the is_latest=true row is deleted (or no row is flagged) — healthy servers are byte-identical pre/post. Verified.
  • Idempotent: re-running on healed data finds nothing to match → no-op.
  • Worst-case mispick (semver-tied prereleases — none observed in our scenarios): a single mcp-publisher status call now triggers proper recalc via the code fix.

@rdimitrov rdimitrov merged commit 0cb1f6b into main Apr 25, 2026
6 checks passed
@rdimitrov rdimitrov deleted the fix/1081-recalculate-latest-on-status-change branch April 25, 2026 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

isLatest not recalculated when latest version is deleted; new versions below deleted latest never promoted

3 participants