Fix high cardinality metrics by removing unbounded labels #2967
base: blobreactor-prometheus2
Codecov Report ❌ Patch coverage is
Additional details and impacted files
@@                     Coverage Diff                     @@
##           blobreactor-prometheus2   #2967      +/-   ##
===========================================================
- Coverage                  64.15%   64.11%   -0.05%
===========================================================
  Files                        371      371
  Lines                      18597    18591       -6
===========================================================
- Hits                       11931    11919      -12
- Misses                      5759     5764       +5
- Partials                     907      908       +1
  if err != nil {
      s.logger.Error("Failed to read deposits", "error", err)
-     s.metrics.FailedToGetBlockLogs.With("block_num", blockNumStr).Add(1)
+     s.metrics.FailedToGetBlockLogs.Add(1)
q: With this change, we should still be able to get the time when a failure happens and retrieve the corresponding logs, correct?
If so, I am all for this change.
  s.latestFcuReq.Store(&buildData.FCState)

- s.metrics.markRebuildPayloadForRejectedBlockSuccess(nextBlkSlot)
+ s.metrics.markRebuildPayloadForRejectedBlockSuccess()
Just make sure we duly log the dropped details wherever we are making the metrics less expressive.
abi87 left a comment
This PR removes unbounded labels from Prometheus metrics. Such labels can cause a cardinality explosion and degrade the performance of our monitoring systems.
I believe it should be fine to remove these labels, as the same information should be available in our logs/Loki.
Note: if any Grafana dashboards relied on these labels, they may need to be updated to reflect the new metric structure.