docs(design): refresh admin dashboard status — P2 + P4 shipped #688

Merged
bootjp merged 2 commits into main from docs/admin-design-status-update
Apr 27, 2026
@bootjp bootjp commented Apr 27, 2026

Summary

State-refresh of docs/design/2026_04_24_partial_admin_dashboard.md now that all four phases of the original P1–P4 plan have shipped.

| Phase | Was | Now |
|-------|-----|-----|
| P1 | shipped | shipped (no change) |
| P2 | partial (read-only #658 only) | shipped (+ #669 writes + #673 forwarding) |
| P3 | shipped | shipped (no change) |
| P4 | mostly | shipped (+ operator doc #674, deployment runbook #669, script wiring #669+#678) |

Also added the AdminDeleteBucket TOCTOU to Outstanding open items — coderabbitai flagged it during PR #669 review (pre-existing race, s3.go:deleteBucket inherits the same shape; recorded as a code comment on AdminDeleteBucket for the future fix).

Doc stays at _partial_ rather than promoting to _implemented_ because two outstanding items remain: AdminForward acceptance criterion 5 (rolling-upgrade flag, explicitly deferred at design time) and the new TOCTOU. Updated the closing rename trigger to cover both.

Test plan

  • No design changes — this is a state refresh
  • Status table entries cross-checked against the merge commits on main
  • AdminDeleteBucket TOCTOU description matches the code comment in adapter/s3_admin.go and the operator-side workaround in docs/admin_deployment.md

The status table was last updated when the rename to `_partial_`
landed (PR #675). Since then:

- P2 write paths shipped — #669 (slice 2a, S3 bucket admin
  endpoints) and #673 (slice 2b, AdminForward integration). The
  read-only slice was already in via #658. Status flips from
  🟡 partial → ✅ shipped.
- P4 finished — operator doc in #674, deployment runbook in #669,
  rolling-update.sh admin wiring in #669 + #678. Status flips
  from 🟡 mostly → ✅ shipped.

Also added the AdminDeleteBucket TOCTOU to Outstanding open items.
coderabbitai flagged it during PR #669 review (🔴 / 🟠) — the
empty-bucket probe scans `ObjectManifestPrefixForBucket` at readTS
but the BucketMetaKey delete carries only that single point key in
`ReadKeys`, so a concurrent PutObject inserting a manifest key
between readTS and commit will not conflict and the object becomes
orphaned. Pre-existing race that `adapter/s3.go:deleteBucket`
inherits as well. Tracked here for the future fix (either bump
BucketGenerationKey on every PutObject, or add ReadRanges to
OperationGroup); the operator-side workaround documented in
docs/admin_deployment.md is to pause writes before admin delete.
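
To make the race concrete, here is a minimal, self-contained simulation of it. It is a sketch under stated assumptions, not elastickv code: the `store` and `txn` types, the `meta/` and `manifest/` key prefixes, and the `readRanges` field are all hypothetical stand-ins for the real OperationGroup, BucketMetaKey, and ObjectManifestPrefixForBucket. It shows why validating only the point key lets the delete commit, and how a range check over the manifest prefix would reject it.

```go
package main

import (
	"fmt"
	"strings"
)

// store is a toy versioned KV store (hypothetical, not the elastickv store).
type store struct {
	ts        uint64            // last commit timestamp
	live      map[string]bool   // keys that currently exist
	lastWrite map[string]uint64 // commit timestamp of the last write per key
}

func newStore() *store {
	return &store{live: map[string]bool{}, lastWrite: map[string]uint64{}}
}

// put commits a blind write of key.
func (s *store) put(key string) {
	s.ts++
	s.live[key] = true
	s.lastWrite[key] = s.ts
}

// txn is a hypothetical stand-in for an OperationGroup that deletes one key.
type txn struct {
	readTS     uint64
	readKeys   []string // point keys validated at commit
	readRanges []string // key prefixes validated at commit (the proposed addition)
	deleteKey  string
}

// commit applies the delete unless a validated key or prefix was written after readTS.
func (s *store) commit(t txn) bool {
	for _, k := range t.readKeys {
		if s.lastWrite[k] > t.readTS {
			return false
		}
	}
	for _, p := range t.readRanges {
		for k, w := range s.lastWrite {
			if strings.HasPrefix(k, p) && w > t.readTS {
				return false
			}
		}
	}
	s.ts++
	delete(s.live, t.deleteKey)
	s.lastWrite[t.deleteKey] = s.ts
	return true
}

// deleteBucket probes for manifests at readTS, lets a concurrent writer run
// (interleave), then tries to commit the bucket-meta delete.
func deleteBucket(s *store, bucket string, useRanges bool, interleave func()) bool {
	readTS := s.ts
	prefix := "manifest/" + bucket + "/"
	for k := range s.live {
		if strings.HasPrefix(k, prefix) {
			return false // bucket is not empty
		}
	}
	interleave()
	t := txn{readTS: readTS, readKeys: []string{"meta/" + bucket}, deleteKey: "meta/" + bucket}
	if useRanges {
		t.readRanges = []string{prefix}
	}
	return s.commit(t)
}

// orphans counts manifest keys whose bucket meta key no longer exists.
func orphans(s *store, bucket string) int {
	if s.live["meta/"+bucket] {
		return 0
	}
	n := 0
	for k := range s.live {
		if strings.HasPrefix(k, "manifest/"+bucket+"/") {
			n++
		}
	}
	return n
}

// run creates bucket "b", then races a PutObject against the delete.
func run(useRanges bool) (committed bool, orphanCount int) {
	s := newStore()
	s.put("meta/b")
	committed = deleteBucket(s, "b", useRanges, func() { s.put("manifest/b/obj1") })
	return committed, orphans(s, "b")
}

func main() {
	c, o := run(false)
	fmt.Printf("point key only: committed=%v orphans=%d\n", c, o)
	c, o = run(true)
	fmt.Printf("with range check: committed=%v orphans=%d\n", c, o)
}
```

With only the point key validated, the delete commits and one manifest is orphaned; with the prefix validated, the delete is rejected and the caller can retry the empty-bucket probe.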

Doc stays at `_partial_` rather than `_implemented_` because
two outstanding items remain: AdminForward criterion 5 (rolling-
upgrade compatibility flag, explicitly deferred) and the
AdminDeleteBucket TOCTOU. Updated the closing sentence
accordingly so the rename trigger covers both.

No design changes — this is a state refresh.
coderabbitai Bot commented Apr 27, 2026

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 48 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dd4c099d-de3c-4cc9-aeda-913cce1a5d65

📥 Commits

Reviewing files that changed from the base of the PR and between a3d312f and fb39e3c.

📒 Files selected for processing (1)
  • docs/design/2026_04_24_partial_admin_dashboard.md
bootjp commented Apr 27, 2026

@claude review

claude Bot commented Apr 27, 2026

Claude finished @bootjp's task in 1m 29s.


Review of docs/design/2026_04_24_partial_admin_dashboard.md

  • Read the changed file
  • Cross-check status claims against PR description
  • Review TOCTOU entry accuracy
  • Check lifecycle marker and closing conditions
  • Post review findings

Overall: The status refresh is accurate and well-structured. One inaccuracy in the TOCTOU description to fix; everything else looks good.


Finding: stale cross-reference in TOCTOU entry

Line 20 of the updated doc says:

the operator-side workaround documented in docs/admin_deployment.md is to pause writes against the target bucket before issuing the admin delete.

After reading all 374 lines of docs/admin_deployment.md, there is no such section. The pause-writes workaround is currently only in the code comment at adapter/s3_admin.go:383:

// Operators concerned about the orphan window today should pause writes against
// the target bucket before issuing the admin delete.

The design doc is pointing to a location that doesn't contain what it claims. Either:

  • (a) Add a "Known limitations — AdminDeleteBucket orphan race" subsection to docs/admin_deployment.md (probably §5.6), or
  • (b) Fix the reference in the design doc to point at the code comment rather than the deployment doc.


What looks good


Minor nit (take or leave): the P4 row's "Landed via" now spans two PRs for some items (#674 / #669 / #678) but the separator is / while P2 uses +. Using consistent separators throughout the table (+ everywhere, as in P1 and P2) would be cleaner.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the elastickv Admin Dashboard design document to reflect that all implementation phases (P1–P4) have now shipped, while maintaining its 'partial' status due to deferred items. Specifically, it adds documentation for a TOCTOU race condition in AdminDeleteBucket that can lead to orphaned objects. Review feedback suggested refining the description of this issue to be more objective, warned of performance bottlenecks in one of the proposed fixes, and requested additional mitigation strategies for the operational risks associated with the current workaround.
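
The summary does not say which proposed fix the performance warning refers to. One plausible reading, assumed here, is the generation bump: if every PutObject rewrites a per-bucket generation key, the delete can detect concurrent writers, but writers to the same bucket also start conflicting with each other. The sketch below illustrates that trade-off with a toy optimistic store. The `store` and `txn` types and the `gen/`, `meta/`, and `manifest/` key names are hypothetical, and it assumes the bump is a validated read-modify-write rather than a blind write.

```go
package main

import "fmt"

// store is a toy optimistic KV store (hypothetical, not the elastickv store).
type store struct {
	ts        uint64
	lastWrite map[string]uint64 // commit timestamp of the last write per key
}

// txn lists the keys a transaction validates and the keys it writes.
type txn struct {
	readTS   uint64
	readKeys []string
	writes   []string
}

// commit rejects the transaction if any validated key changed after readTS.
func (s *store) commit(t txn) bool {
	for _, k := range t.readKeys {
		if s.lastWrite[k] > t.readTS {
			return false
		}
	}
	s.ts++
	for _, k := range t.writes {
		s.lastWrite[k] = s.ts
	}
	return true
}

// putObject writes a manifest and bumps the bucket generation key.
// Assumption: the bump is a validated read-modify-write.
func putObject(readTS uint64, bucket, object string) txn {
	gen := "gen/" + bucket
	return txn{
		readTS:   readTS,
		readKeys: []string{gen},
		writes:   []string{gen, "manifest/" + bucket + "/" + object},
	}
}

// deleteBucket validates both the meta key and the generation key.
func deleteBucket(readTS uint64, bucket string) txn {
	meta := "meta/" + bucket
	return txn{
		readTS:   readTS,
		readKeys: []string{meta, "gen/" + bucket},
		writes:   []string{meta},
	}
}

// scenario starts three transactions from the same snapshot and commits
// them in order: a put, the bucket delete, and a second put.
func scenario() (putOK, deleteOK, secondPutOK bool) {
	s := &store{lastWrite: map[string]uint64{}}
	s.commit(txn{writes: []string{"meta/b", "gen/b"}}) // create bucket "b"
	readTS := s.ts
	putOK = s.commit(putObject(readTS, "b", "obj1"))
	deleteOK = s.commit(deleteBucket(readTS, "b"))
	secondPutOK = s.commit(putObject(readTS, "b", "obj2"))
	return putOK, deleteOK, secondPutOK
}

func main() {
	p, d, p2 := scenario()
	fmt.Printf("put=%v delete=%v secondPut=%v\n", p, d, p2)
}
```

The first put commits, the delete is correctly rejected, and the second put is rejected as well. That last rejection is the contention cost: under these assumptions, concurrent uploads to one bucket serialize on a single hot key and must retry.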

Comment thread docs/design/2026_04_24_partial_admin_dashboard.md Outdated
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@bootjp bootjp merged commit 12d636c into main Apr 27, 2026
9 checks passed
@bootjp bootjp deleted the docs/admin-design-status-update branch April 27, 2026 12:57
bootjp commented Apr 27, 2026

@claude review

claude Bot commented Apr 27, 2026

Claude finished @bootjp's task in 2m 48s.


Review of PR #688 — docs(design): refresh admin dashboard status

  • Read changed file (docs/design/2026_04_24_partial_admin_dashboard.md)
  • Check cross-references and status claims
  • Review TOCTOU entry accuracy
  • Review lifecycle marker and closing conditions
  • Post final review

Overall: The status refresh is accurate and the doc structure is sound. Three issues worth recording for the next update.


Finding 1: Test plan references a non-existent section in docs/admin_deployment.md

The PR test plan claims:

AdminDeleteBucket TOCTOU description matches the code comment in adapter/s3_admin.go and the operator-side workaround in docs/admin_deployment.md

docs/admin_deployment.md (374 lines) has no mention of TOCTOU, orphan race, AdminDeleteBucket, or the pause-writes workaround. The workaround is only documented in the code comment at adapter/s3_admin.go:383.

The design doc text at line 20 doesn't directly claim admin_deployment.md contains it — that's fine. But the test plan check was wrong at merge time. For completeness, adding a "Known Limitations" subsection to docs/admin_deployment.md would make the cross-reference valid and give operators a runtime-visible place to find the workaround.
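
A cross-reference check like this one can be automated cheaply. The following is a minimal sketch, not part of the repository: `missingTerms` is a hypothetical helper, the search terms are illustrative choices, and the sample document string is an invented stand-in rather than the real contents of docs/admin_deployment.md.

```go
package main

import (
	"fmt"
	"strings"
)

// missingTerms returns the terms that do not occur in doc, ignoring case.
func missingTerms(doc string, terms []string) []string {
	lower := strings.ToLower(doc)
	var missing []string
	for _, t := range terms {
		if !strings.Contains(lower, strings.ToLower(t)) {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	// Invented sample text; a real check would load the file with os.ReadFile.
	doc := "Rolling updates\nRun rolling-update.sh with the admin flags set.\n"
	terms := []string{"TOCTOU", "AdminDeleteBucket", "pause writes"}
	for _, t := range missingTerms(doc, terms) {
		fmt.Printf("not mentioned: %s\n", t)
	}
}
```

Running such a check before ticking a test-plan item would flag a claimed cross-reference whose target never mentions the topic.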


Finding 2: Code comment at adapter/s3_admin.go:362 violates CLAUDE.md conventions

The comment reads:

// Known orphan-race limitation (coderabbitai 🔴 / 🟠 on PR #669):

CLAUDE.md: "Don't reference the current task, fix, or callers… since those belong in the PR description and rot as the codebase evolves."

The coderabbitai attribution and PR #669 reference will become meaningless noise. The comment body already describes the race precisely — the parenthetical attribution adds nothing technically.

Suggested fix:

// Known orphan-race limitation (pre-existing, shared with s3.go:deleteBucket):


Finding 3: "bridge mode" suggestion is imprecise for a bucket-scoped operation

The TOCTOU outstanding item ends with:

the design should investigate mitigation strategies like a temporary proxy or bridge mode to avoid service interruption during this state.

"Service interruption" and "bridge mode" imply cluster-wide impact. The actual scope of the workaround is narrower: pause writes to a single target bucket, not the whole service. A proxy/bridge mode would be overkill and potentially misleading to operators reading this as a future design direction.

Consider: "Future work should consider either (a) or (b) above to remove the need for a write-pause; the workaround today is bucket-scoped and does not require any cluster-level intervention."


What looks good


Minor nit: AdminForward outstanding item uses **bold** but the TOCTOU item doesn't — inconsistent formatting in the same list. Also the P4 row uses / as separator while P2 uses +.
