feat(admin): SPA degraded banner + conflict hatching for fan-out (Phase 2-C PR-2)#687
feat(admin): SPA degraded banner + conflict hatching for fan-out (Phase 2-C PR-2)#687
Conversation
…se 2-C PR-2) Wires the Phase 2-C fan-out wire format (PR #686) into the admin SPA. The server's KeyVizMatrix.fanout block was previously ignored by the SPA; an operator with --keyvizFanoutNodes configured got the aggregated heatmap but no signal that one peer had failed or that a row's totals were soft due to a leadership flip. API client (web/admin/src/api/client.ts): - KeyVizFanoutNode and KeyVizFanoutResult mirror the Go shapes from internal/admin/keyviz_fanout.go. - KeyVizRow gains the optional `conflict` flag; KeyVizMatrix gains the optional `fanout` block. Page (web/admin/src/pages/KeyViz.tsx): - Header counter: "cluster view (N of M nodes)" appears in the per-matrix metadata strip when fanout is present. A healthy cluster reads "3 of 3 nodes" inline; nothing else changes. - FanoutBanner: a degraded-mode card above the heatmap when responded < expected, lists every failed node with its error string and explains what is and is not in the heatmap. Hidden when the cluster is healthy so the page stays clean. - ConflictOverlay: an SVG layer inside the scroll container with a 4px diagonal hatch over rows whose conflict flag is true. SVG rather than canvas because the hatch needs to scroll with the heatmap (same idiom TimeAxis already uses) and SVG's pattern primitive is the lowest-overhead way to express it. - RowDetail: shows a "conflict" pill with a hover-tooltip explaining the leadership-flip semantics so an operator who notices the hatch can find out why without leaving the page. The design doc 6 (PR #685) is the spec for these UI bits; nothing in this PR changes the wire format. Type check (tsc -b --noEmit) and Vite build clean.
📝 WalkthroughWalkthroughThis PR extends the KeyViz monitoring interface to track cluster degradation and row-level conflicts. New TypeScript interfaces carry fanout metadata (responding and expected nodes) from the server. The UI displays this metadata via a new banner, summary updates, visual overlay, and status indicators. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
web/admin/src/pages/KeyViz.tsx (2)
240-271: FanoutBanner: solid. One small thought on the<= expectedguard.Returning
nullwhenresponded >= fanout.expectedcorrectly hides the banner for healthy clusters and is also defensive against a malformed response whereresponded > expected. The failed-node enumeration vianodes.filter((n) => !n.ok)lines up withKeyVizFanoutNode.oksemantics.Two optional polish items for a follow-up (not blocking):
- The "node{fanout.responded === 1 ? "" : "s"}" pluralization assumes English; if the SPA gains i18n later this will need to move to an ICU-style plural rule. Fine for now.
- Consider giving the banner
role="status"(orrole="alert"if you want to interrupt screen readers when a previously-healthy cluster degrades) so assistive tech announces the degradation when the response transitions from healthy → degraded mid-session.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/admin/src/pages/KeyViz.tsx` around lines 240 - 271, Add an ARIA role to the FanoutBanner container so assistive tech announces cluster degradation when the UI transitions; update the <div className="card border-danger/40"> in the FanoutBanner component to include an appropriate role attribute (e.g., role="status" for passive announcement or role="alert" to interrupt) and ensure the role is applied to the outermost element rendered by the FanoutBanner function.
313-326: Hardcoded SVG pattern id — fine today, butuseId()would future-proof it.
id="keyviz-conflict-hatch"is global to the document. There's only oneHeatmaprendered in this app today, so this works. However, if a future change renders two heatmaps simultaneously (e.g., a comparison view, or even a brief overlap during route transitions with React 19's concurrent rendering), both<defs>will register the same id andfill="url(#keyviz-conflict-hatch)"will resolve to whichever happens to be in the DOM tree first.Since you're already on React 19,
useId()is the idiomatic fix:♻️ Suggested refactor
-import { useEffect, useMemo, useRef, useState } from "react"; +import { useEffect, useId, useMemo, useRef, useState } from "react";function ConflictOverlay({ rows, cellH, width }: ConflictOverlayProps) { + const patternId = useId(); const conflictRows = useMemo(() => { ... }, [rows]); if (conflictRows.length === 0) return null; const totalH = rows.length * cellH; return ( <svg ... > <defs> <pattern - id="keyviz-conflict-hatch" + id={patternId} ... > <line ... /> </pattern> </defs> {conflictRows.map((i) => ( <rect ... - fill="url(`#keyviz-conflict-hatch`)" + fill={`url(#${patternId})`} /> ))} </svg> ); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/admin/src/pages/KeyViz.tsx` around lines 313 - 326, Replace the hardcoded SVG pattern id "keyviz-conflict-hatch" with a React-generated unique id via useId() in the KeyViz component (or the component that renders this <defs>), assign that id to the <pattern id={uniqueId}> and update all corresponding fill attributes that reference the pattern to use fill={`url(#${uniqueId})`} (or the equivalent string). This ensures the pattern id is unique across concurrent renders or multiple Heatmap/KeyViz instances; import and call React's useId, store the returned id in a variable (e.g., hatchId), use hatchId for the pattern id and for any fill="url(...)" references.web/admin/src/api/client.ts (1)
234-240: Type/comment mismatch onconflict.The comment states the field is "always present on the wire, defaults to false", but the TS type is
conflict?: boolean(optional). If the server contract genuinely guarantees the field is always emitted, the type can be tightened toconflict: booleanso consumers don't needrow.conflict === truedefensive checks (the existing call sites use truthy checks likeif (rows[i].conflict)and{row.conflict && ...}which would still work either way).Conversely, if you want to keep the optional type for forward/backward compatibility with older servers (e.g., a client running against a pre-PR-1 backend that hasn't been updated), then the comment is the part to soften — something like "present on Phase 2-C+ servers; absent on older builds, treated as false".
Pick one — but the current state has the docstring claim a stronger guarantee than the type expresses.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/admin/src/api/client.ts` around lines 234 - 240, The comment and the TypeScript type for the field conflict are inconsistent: the comment says "always present on the wire, defaults to false" but the code declares conflict?: boolean. Pick one approach and make them consistent: either tighten the type by changing conflict?: boolean to conflict: boolean (and keep the comment) if the server truly always emits the flag, or relax the comment to state "present on Phase 2-C+ servers; absent on older builds, treated as false" if you intend to keep conflict optional for backward compatibility — update the declaration or the docstring accordingly (reference the conflict field and its current declaration conflict?: boolean).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/admin/src/pages/KeyViz.tsx`:
- Around line 403-410: The conflict pill uses a native title tooltip which is
not keyboard/touch/screen-reader accessible; update the span rendered when
row.conflict to be keyboard-focusable and expose the explanatory text to AT by
adding tabIndex={0} and an accessible name/description (e.g., an aria-label or
aria-describedby that points to a visually hidden element containing the full
explanation), while keeping the visible "conflict" label and existing className
("pill-muted text-xs") so touch/keyboard users and screen readers receive the
"totals may understate..." caveat; change only the span in KeyViz.tsx (the
row.conflict element) to add tabIndex and aria attributes.
---
Nitpick comments:
In `@web/admin/src/api/client.ts`:
- Around line 234-240: The comment and the TypeScript type for the field
conflict are inconsistent: the comment says "always present on the wire,
defaults to false" but the code declares conflict?: boolean. Pick one approach
and make them consistent: either tighten the type by changing conflict?: boolean
to conflict: boolean (and keep the comment) if the server truly always emits the
flag, or relax the comment to state "present on Phase 2-C+ servers; absent on
older builds, treated as false" if you intend to keep conflict optional for
backward compatibility — update the declaration or the docstring accordingly
(reference the conflict field and its current declaration conflict?: boolean).
In `@web/admin/src/pages/KeyViz.tsx`:
- Around line 240-271: Add an ARIA role to the FanoutBanner container so
assistive tech announces cluster degradation when the UI transitions; update the
<div className="card border-danger/40"> in the FanoutBanner component to include
an appropriate role attribute (e.g., role="status" for passive announcement or
role="alert" to interrupt) and ensure the role is applied to the outermost
element rendered by the FanoutBanner function.
- Around line 313-326: Replace the hardcoded SVG pattern id
"keyviz-conflict-hatch" with a React-generated unique id via useId() in the
KeyViz component (or the component that renders this <defs>), assign that id to
the <pattern id={uniqueId}> and update all corresponding fill attributes that
reference the pattern to use fill={`url(#${uniqueId})`} (or the equivalent
string). This ensures the pattern id is unique across concurrent renders or
multiple Heatmap/KeyViz instances; import and call React's useId, store the
returned id in a variable (e.g., hatchId), use hatchId for the pattern id and
for any fill="url(...)" references.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ddd27651-c567-4d73-8ce7-99861ae2c184
📒 Files selected for processing (2)
web/admin/src/api/client.tsweb/admin/src/pages/KeyViz.tsx
| {row.conflict && ( | ||
| <span | ||
| className="pill-muted text-xs" | ||
| title="Two or more nodes reported a non-zero value for the same cell — typical during a leadership flip mid-window. Displayed totals may understate the true count by up to one leader's pre-transfer slice." | ||
| > | ||
| conflict | ||
| </span> | ||
| )} |
There was a problem hiding this comment.
Conflict pill: tooltip is desktop-hover-only; consider keyboard/touch reachability.
The title attribute renders a native tooltip that:
- requires mouse hover to display (no equivalent on touch devices),
- is not reachable via keyboard focus,
- ignores the explanatory tooltip text on screen readers in many browsers.
Given this pill carries semantically important context (the "totals may understate the true count" caveat), at least making it focusable + exposing the explanation as an accessible name is worth it. Minimal change:
♿ Proposed minimal fix
{row.conflict && (
<span
className="pill-muted text-xs"
+ tabIndex={0}
+ role="note"
+ aria-label="Conflict: two or more nodes reported a non-zero value for the same cell — typical during a leadership flip mid-window. Displayed totals may understate the true count by up to one leader's pre-transfer slice."
title="Two or more nodes reported a non-zero value for the same cell — typical during a leadership flip mid-window. Displayed totals may understate the true count by up to one leader's pre-transfer slice."
>
conflict
</span>
)}Also worth noting: the existing "aggregate" pill on the line above has the same shortcoming, so this could be folded into a small shared Pill component with an optional explanation prop in a follow-up — out of scope for this PR.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| {row.conflict && ( | |
| <span | |
| className="pill-muted text-xs" | |
| title="Two or more nodes reported a non-zero value for the same cell — typical during a leadership flip mid-window. Displayed totals may understate the true count by up to one leader's pre-transfer slice." | |
| > | |
| conflict | |
| </span> | |
| )} | |
| {row.conflict && ( | |
| <span | |
| className="pill-muted text-xs" | |
| tabIndex={0} | |
| role="note" | |
| aria-label="Conflict: two or more nodes reported a non-zero value for the same cell — typical during a leadership flip mid-window. Displayed totals may understate the true count by up to one leader's pre-transfer slice." | |
| title="Two or more nodes reported a non-zero value for the same cell — typical during a leadership flip mid-window. Displayed totals may understate the true count by up to one leader's pre-transfer slice." | |
| > | |
| conflict | |
| </span> | |
| )} |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/admin/src/pages/KeyViz.tsx` around lines 403 - 410, The conflict pill
uses a native title tooltip which is not keyboard/touch/screen-reader
accessible; update the span rendered when row.conflict to be keyboard-focusable
and expose the explanatory text to AT by adding tabIndex={0} and an accessible
name/description (e.g., an aria-label or aria-describedby that points to a
visually hidden element containing the full explanation), while keeping the
visible "conflict" label and existing className ("pill-muted text-xs") so
touch/keyboard users and screen readers receive the "totals may understate..."
caveat; change only the span in KeyViz.tsx (the row.conflict element) to add
tabIndex and aria attributes.
|
@claude review |
There was a problem hiding this comment.
Code Review
This pull request enhances the KeyViz heatmap by adding support for visualizing fan-out results and row-level conflicts. It introduces new API interfaces for cluster fan-out status and implements UI components including a FanoutBanner for degraded states, a ConflictOverlay using SVG patterns to highlight data conflicts, and updated row details. Review feedback suggests correcting the documentation for the conflict flag's wire format and addressing potential ID collisions in the SVG pattern definition.
| // Phase 2-C row-level conflict flag (always present on the wire, | ||
| // defaults to false). True when ≥2 nodes reported a non-zero | ||
| // value for the same cell — typically a leadership flip mid- | ||
| // window. Per design 4.2 / PR-1, this is a row-level signal; it | ||
| // moves to per-cell when the proto extension lands in 2-C+. | ||
| conflict?: boolean; |
There was a problem hiding this comment.
The comment states that the conflict flag is "always present on the wire," but the corresponding Go struct in internal/admin/keyviz_handler.go uses json:"conflict,omitempty". In Go, omitempty on a boolean field means it will be omitted from the JSON output if its value is false. Therefore, the field will only be present on the wire when it is true. The TypeScript interface correctly uses an optional property (conflict?: boolean), but the comment should be updated to reflect the actual wire behavior.
| // Phase 2-C row-level conflict flag (always present on the wire, | |
| // defaults to false). True when ≥2 nodes reported a non-zero | |
| // value for the same cell — typically a leadership flip mid- | |
| // window. Per design 4.2 / PR-1, this is a row-level signal; it | |
| // moves to per-cell when the proto extension lands in 2-C+. | |
| conflict?: boolean; | |
| // Phase 2-C row-level conflict flag (omitted on the wire when | |
| // false). True when ≥2 nodes reported a non-zero | |
| // value for the same cell — typically a leadership flip mid- | |
| // window. Per design 4.2 / PR-1, this is a row-level signal; it | |
| // moves to per-cell when the proto extension lands in 2-C+. | |
| conflict?: boolean; |
| <pattern | ||
| id="keyviz-conflict-hatch" | ||
| width={4} | ||
| height={4} | ||
| patternUnits="userSpaceOnUse" | ||
| patternTransform="rotate(45)" | ||
| > | ||
| <line x1={0} y1={0} x2={0} y2={4} stroke="currentColor" strokeWidth={1} opacity={0.45} /> | ||
| </pattern> | ||
| </defs> |
There was a problem hiding this comment.
The SVG pattern uses a hardcoded ID keyviz-conflict-hatch. While this is currently safe as there is only one heatmap on the page, using static IDs in reusable components can lead to collisions if multiple instances are rendered (e.g., during navigation transitions or if the UI is expanded to show multiple matrices). Consider generating a unique ID per component instance or defining the pattern once at a higher level in the DOM.
|
Claude finished @bootjp's task in 2m 0s —— View job Code Review — Phase 2-C PR-2: SPA fan-out banner + conflict hatching
Overall the PR is well-structured and the implementation is clean. The 1.
|
## Summary
Mirror the existing `ADMIN_*` env-driven plumbing for the KeyViz heatmap
sampler so operators can flip `--keyvizEnabled` and the
`--keyvizFanoutNodes` peer list from `deploy.env` without editing the
script per-deploy.
## What's added
- `KEYVIZ_ENABLED` master switch (validated as `true|false` at the top
of the local script alongside `ADMIN_ENABLED`).
- `KEYVIZ_FANOUT_NODES` comma-separated host:port list. Pre-quoted via
`printf %q` for the same reason `ADMIN_FULL_ACCESS_KEYS` is: commas
survive an unquoted env pass but pre-quoting keeps the pattern uniform.
- `build_keyviz_flags` helper (nameref-output, mirrors
`build_admin_flags`). Empty array when `KEYVIZ_ENABLED != "true"`, so
existing deploys see no behaviour change.
- `KEYVIZ_*` env vars threaded through the SSH env passthrough so the
heredoc-resident `build_keyviz_flags` reads them on the remote.
- `env.example` block documenting the variables and pointing at the
Phase 2-C fan-out auth follow-up
(`docs/design/2026_04_27_proposed_keyviz_cluster_fanout.md`).
## What's not added
- `KEYVIZ_STEP`, `KEYVIZ_MAX_TRACKED_ROUTES`, `KEYVIZ_HISTORY_COLUMNS`,
`KEYVIZ_FANOUT_TIMEOUT` — these have sensible binary-side defaults and
operators have not asked for env-driven overrides yet. Adding them is a
one-line `_flags+=` edit per knob if needed.
- Inter-node fan-out auth — Phase 2-C does not yet ship a pre-shared
bearer token, so a multi-node `KEYVIZ_FANOUT_NODES` setup surfaces 5/N
peer 401s in the SPA degraded banner. The env.example comment documents
this.
## Verification
Deployed to a live 5-node cluster (192.168.0.210-214) with
`KEYVIZ_ENABLED=true` and `KEYVIZ_FANOUT_NODES` set across all five
admin listeners; `--keyviz*` flags appeared on every container's `docker
inspect`, and the admin keyviz handler returned a matrix with the
`fanout` block populated.
```sh
$ docker inspect elastickv --format '{{json .Args}}' | python3 -c "import sys,json;a=json.load(sys.stdin);print([v for v in a if v.startswith('--keyviz')])"
['--keyvizEnabled', '--keyvizFanoutNodes']
$ curl -s -b cookies.txt https://elastickv.bootjp.dev/admin/api/v1/keyviz/matrix \
| jq '.fanout | {responded, expected}'
{
"responded": 1,
"expected": 6
}
```
(The 5/6 401s are the expected fan-out-auth gap, surfaced by the PR #687
`FanoutBanner` in the SPA. Tracked separately.)
## Self-review (CLAUDE.md 5 lenses)
1. **Data loss** — None. Build/runbook plumbing only.
2. **Concurrency** — None.
3. **Performance** — None on the deploy path. Per-node sampler cost is
covered by the sampler's own design doc.
4. **Data consistency** — None.
5. **Test coverage** — `bash -n` syntax-check passes. Existing
`rolling-update.sh` has no automated tests in this repo; the
verification path is "deploy to a real cluster and inspect docker run
args."
## Test plan
- [x] `bash -n scripts/rolling-update.sh` — syntax OK
- [x] Live 5-node deploy with KEYVIZ_ENABLED=true — verified
- [ ] CI: shellcheck (if/when added)
@claude review
Summary
Phase 2-C PR-2, per the design doc landed in #685 §6: wire the fan-out
Fanout/conflictshapes shipped by PR #686 into the admin SPA so operators can see when a peer failed and which rows are soft due to a leadership flip.Backend is unchanged. PR #686's
KeyVizMatrix.Fanoutblock is currently emitted on every fan-out request but the SPA was discarding it.What this PR adds
cluster view (N of M nodes)in the per-matrix metadata strip. Hidden whenfanoutis absent (single-node mode, no behavior change).responded < expected. Lists every failed node + error string. Hidden when the cluster is healthy.conflict === true. Layered inside the scroll container so the hatch tracks horizontal scroll (same idiom asTimeAxis). 4 px diagonal stripes, alpha 0.45, inheritingcurrentColorso it works against both light and dark themes.conflictpill with a hover-tooltip explaining the leadership-flip semantics.Five-lens self-review
nullreturned). Hatch: SVG with at most 1024<rect>elements (matchesKeyVizRowbudget); empirically negligible. Overlay only renders when at least one row hasconflict === true.tsc -b --noEmitandvite buildclean. UI behavior (banner visibility, hatch presence, header text) is hard to assert intscalone; manual verification documented below.Test plan
npm run lint(tsc -b --noEmit) — cleannpm run build(Vite) — clean, output writes tointernal/admin/distmake runwith--keyvizEnabled --keyvizFanoutNodes=127.0.0.1:8080. Open/keyviz; header readscluster view (1 of 1 nodes), no banner, no hatch.--keyvizFanoutNodes=127.0.0.1:8080,127.0.0.1:9999(port 9999 unused). Header reads1 of 2 nodes, the failed-node banner appears withconnection refusedfor127.0.0.1:9999.conflict=trueserver-side via/admin/api/v1/keyviz/matrix?...); the affected row renders with diagonal hatch overlay; hovering reveals theconflictpill.Out of scope
conflictbecomes per-cell instead of per-row.generated_atskew indicator — design §9 Q3, deferred.Closes the SPA half of Phase 2-C; the proto extension follows in PR-3.
Summary by CodeRabbit