Skip to content

feat: declare seconds unit on duration histograms#177

Merged
mpetrun5 merged 2 commits intomainfrom
feat/histogram-second-unit
Apr 28, 2026
Merged

feat: declare seconds unit on duration histograms#177
mpetrun5 merged 2 commits intomainfrom
feat/histogram-second-unit

Conversation

@GregTheGreek
Copy link
Copy Markdown

Summary

Adds metric.WithUnit("s") to the four Float64Histograms in MpcMetrics:

Why

sygma-core registers a sub-second bucket view in observability.InitMetricProvider:

// observability/metrics.go (initSecondView)
sdkmetric.NewView(
    sdkmetric.Instrument{Unit: "s"},
    sdkmetric.Stream{Aggregation: aggregation.ExplicitBucketHistogram{
        Boundaries: []float64{1e-6, 1e-5, 1e-4, 1e-3, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 100, 1000, 10000},
    }},
)

The view is keyed by Instrument{Unit: "s"}. The histograms in metrics/mpc.go declared a description but no unit, so the view never matched and the SDK fell back to OTel's default histogram boundaries [5, 10, 25, 50, ..., 10000] — which are tuned for milliseconds. Signing-phase durations (sub-second to a few seconds) collapse into the le=5 bucket, making histogram_quantile return values pinned to bucket boundaries rather than real percentiles.

Values are already recorded in seconds via d.Seconds(), so no math changes — only bucketing.

Grafana query change

The OTLP→Prometheus exporter appends a _seconds suffix when the instrument carries Unit: "s". After this PR, dashboard queries change:

Before After
relayer_SessionTime_bucket relayer_SessionTime_seconds_bucket
relayer_InitiateTime_bucket relayer_InitiateTime_seconds_bucket
relayer_CommSendTime_bucket relayer_CommSendTime_seconds_bucket
relayer_CommDnsResolveTime_bucket relayer_CommDnsResolveTime_seconds_bucket

In practice no dashboards depend on the old names yet — the OTel collector URL was only wired into staging in #174, so historical data is empty.

Test plan

  • go build ./... clean
  • go test ./metrics/... ./tss/... ./comm/p2p/... pass
  • After deploy, confirm relayer_SessionTime_seconds_count and the three new _seconds_count series are non-empty in Grafana
  • Confirm histogram_quantile(0.95, ...) returns values that vary with workload rather than snapping to bucket boundaries

Adds metric.WithUnit("s") to the four Float64Histograms in MpcMetrics
(SessionTime, InitiateTime, CommSendTime, CommDnsResolveTime). Without
the unit, the sub-second bucket view registered by sygma-core's
observability.InitMetricProvider does not match these instruments and
the SDK falls back to default boundaries tuned for milliseconds, so
sub-second values collapse into the le=5 bucket and quantile queries
return useless results.

Co-Authored-By: Claude
@github-actions
Copy link
Copy Markdown

Go Test coverage is 37.0 %\ ✨ ✨ ✨

@github-actions
Copy link
Copy Markdown

Go Test coverage is 37.0 %\ ✨ ✨ ✨

@mpetrun5 mpetrun5 merged commit 30c2a72 into main Apr 28, 2026
7 checks passed
@mpetrun5 mpetrun5 deleted the feat/histogram-second-unit branch April 28, 2026 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants