feat: add observability tooling for Flashbox images #93
MoeMahhouk wants to merge 12 commits into main
8f75a8f to b0557e3
| @@ -0,0 +1,5 @@ | |||
| process_names: | |||
| # Monitor the searcher container (conmon + all children via --children flag) | |||
| - name: "searcher-container" | |||
should we also monitor lighthouse in bob-l1?
If lighthouse needs dedicated monitoring in the bob-l1 image, that config should live in the bob-l1 directory setup and extend this configuration if possible, wdyt?
| #!/bin/sh | ||
| set -eu | ||
|
|
||
| # Project-specific dynamic configuration for bob-l1 | ||
| # Called by fetch-config.sh with mode (qemu/vault) and config path | ||
|
|
||
| MODE="$1" | ||
| CONFIG_PATH="$2" | ||
|
|
||
| if [ "$MODE" = "qemu" ]; then | ||
| # Local QEMU development configuration | ||
| # GATEWAY is exported by the common fetch-config.sh | ||
| cat <<EOF >> "$CONFIG_PATH" | ||
| CONFIG_NETWORK_ID='1' | ||
| CONFIG_NETWORK_NAME='mainnet' | ||
| CONFIG_JWT_SECRET='1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef' | ||
| CONFIG_CL_STATIC_PEERS='' | ||
| CONFIG_EL_STATIC_PEERS='enode://abc123@${GATEWAY}:30303' | ||
| CONFIG_TITAN_IP='52.207.17.217' | ||
| CONFIG_FLASHBOTS_BUNDLE_1='18.221.59.61' | ||
| CONFIG_FLASHBOTS_BUNDLE_2='3.15.88.156' | ||
| CONFIG_FLASHBOTS_TX_STREAM_1='3.136.107.142' | ||
| CONFIG_FLASHBOTS_TX_STREAM_2='3.149.14.12' | ||
| EOF | ||
|
|
||
| elif [ "$MODE" = "vault" ]; then | ||
| # Production configuration from Vault | ||
| # get_data_value and get_ips_from_uris are exported by fetch-config.sh | ||
|
|
||
| # For bob-l1, we might not have Vault set up yet | ||
| # This is a placeholder for when Vault integration is added | ||
| echo "Warning: Vault configuration not yet implemented for bob-l1" | ||
| echo "Using environment variables or defaults..." | ||
|
|
||
| # You can add Vault-based configuration here when ready | ||
| # For now, we can use environment variables as fallback | ||
| cat <<EOF >> "$CONFIG_PATH" | ||
| CONFIG_NETWORK_ID='${CONFIG_NETWORK_ID:-1}' | ||
| CONFIG_NETWORK_NAME='${CONFIG_NETWORK_NAME:-mainnet}' | ||
| CONFIG_JWT_SECRET='${CONFIG_JWT_SECRET:-}' | ||
| CONFIG_CL_STATIC_PEERS='${CONFIG_CL_STATIC_PEERS:-}' | ||
| CONFIG_EL_STATIC_PEERS='${CONFIG_EL_STATIC_PEERS:-}' | ||
| CONFIG_TITAN_IP='${CONFIG_TITAN_IP:-52.207.17.217}' | ||
| CONFIG_FLASHBOTS_BUNDLE_1='${CONFIG_FLASHBOTS_BUNDLE_1:-18.221.59.61}' | ||
| CONFIG_FLASHBOTS_BUNDLE_2='${CONFIG_FLASHBOTS_BUNDLE_2:-3.15.88.156}' | ||
| CONFIG_FLASHBOTS_TX_STREAM_1='${CONFIG_FLASHBOTS_TX_STREAM_1:-3.136.107.142}' | ||
| CONFIG_FLASHBOTS_TX_STREAM_2='${CONFIG_FLASHBOTS_TX_STREAM_2:-3.149.14.12}' | ||
| EOF | ||
| fi |
Do we need this file for L1?
Currently not, because everything is hardcoded, but if we want to unify the approaches for l1 and l2, this should be included. Also, the remote write URL is currently fetched dynamically from Vault for both l1 and l2.
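For readers following the Vault branch of the script above: the `${VAR:-default}` lines work because the heredoc delimiter is unquoted, so the shell expands each variable (or falls back to the default) while writing, and the surrounding single quotes end up literally in the config file. A minimal sketch of that mechanism, assuming a throwaway temp file; the function name `write_config_demo` is hypothetical and only two of the keys are shown:

```sh
#!/bin/sh
set -eu

# Hypothetical helper: append KEY='value' lines to a config file, preferring
# values already present in the environment over the hardcoded defaults.
write_config_demo() {
    config_path="$1"
    # Unquoted EOF: ${...} is expanded when the heredoc is read, and the
    # single quotes are written to the file as plain characters.
    cat <<EOF >> "$config_path"
CONFIG_NETWORK_ID='${CONFIG_NETWORK_ID:-1}'
CONFIG_NETWORK_NAME='${CONFIG_NETWORK_NAME:-mainnet}'
EOF
}

demo_file="$(mktemp)"
unset CONFIG_NETWORK_ID               # not set: the default 1 is used
CONFIG_NETWORK_NAME='local-testnet'   # set: overrides the default
write_config_demo "$demo_file"
cat "$demo_file"
```

One caveat with this pattern: a value that itself contains a single quote would produce a line that no longer parses when the file is sourced, so it is only safe for values known not to contain quotes.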
``` /usr/bin/fetch-config.sh: 136: export: Illegal option -f ```
Fix error in fetch-config.sh
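For context, `export -f` is a bash extension, and dash (a common `/bin/sh`) rejects it with this message. A minimal sketch of one portable alternative, assuming the shared helpers can live in a file that the project script sources. The file names, the `HELPERS` variable, and the body of `get_data_value` are stand-ins for illustration, not the code from this PR:

```sh
#!/bin/sh
set -eu

# Functions cannot be exported in POSIX sh, so the shared helpers are kept
# in a file and sourced by whichever script needs them.
workdir="$(mktemp -d)"

# Hypothetical helper library (file name and function body are illustrative).
cat > "$workdir/config-helpers.sh" <<'EOF'
get_data_value() {
    # Stand-in body: print the value of key $2 from the KEY=value file $1.
    sed -n "s/^$2=//p" "$1"
}
EOF

# Hypothetical project script that needs the helper.
cat > "$workdir/project-config.sh" <<'EOF'
#!/bin/sh
set -eu
. "$HELPERS"    # source the helpers instead of relying on exported functions
get_data_value "$1" "$2"
EOF

printf 'network_name=mainnet\n' > "$workdir/data.env"

# Plain variables can still be exported portably.
HELPERS="$workdir/config-helpers.sh"
export HELPERS
result="$(sh "$workdir/project-config.sh" "$workdir/data.env" network_name)"
echo "$result"
```

The other option is to run the scripts under bash explicitly, at the cost of requiring bash in the image.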
| Description=Searcher Network and Firewall Rules | ||
| After=network.target network-setup.service | ||
| Requires=network-setup.service | ||
| After=network.target network-setup.service fetch-config.service |
Yes, because we need to fetch the config that contains the Prometheus remote write URL, which is later included in the firewall configs.
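To illustrate why the ordering matters, here is a rough sketch of a firewall step that can only run once the config has been fetched. It assumes `METRICS_ENDPOINT` holds the remote write URL in the same `KEY='value'` format as the generated config. The `url_host` helper, the example URL, and the printed rule are hypothetical, and the PR's own `get_ips_from_uris` helper is not reproduced here:

```sh
#!/bin/sh
set -eu

# Extract the host part of a URL such as https://host:port/path.
# (IPv6 literals are not handled in this sketch.)
url_host() {
    rest="${1#*://}"              # drop the scheme, if present
    rest="${rest%%/*}"            # drop the path
    rest="${rest##*@}"            # drop userinfo, if present
    printf '%s\n' "${rest%%:*}"   # drop the port
}

# Stand-in for the file written by fetch-config (path and value are made up).
config_file="$(mktemp)"
cat > "$config_file" <<'EOF'
METRICS_ENDPOINT='https://metrics.example.com:9090/api/v1/write'
EOF

# Loading the config is the step that requires fetch-config to have run first.
. "$config_file"

metrics_host="$(url_host "$METRICS_ENDPOINT")"
# Print the rule instead of applying it, so the sketch has no side effects.
echo "iptables -A OUTPUT -p tcp -d $metrics_host -j ACCEPT"
```

If the firewall unit started before the config existed, the `.` line would fail or `METRICS_ENDPOINT` would be unset, which is what the added `After=` dependency avoids.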
| if [ "$MODE" = "qemu" ]; then | ||
| # Local QEMU development configuration | ||
| # GATEWAY is exported by the common fetch-config.sh | ||
| cat <<EOF >> "$CONFIG_PATH" | ||
| CONFIG_NETWORK_ID='12345' | ||
| CONFIG_NETWORK_NAME='local-testnet' | ||
| CONFIG_JWT_SECRET='1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef' | ||
| CONFIG_EL_STATIC_PEERS='enode://abc123@${GATEWAY}:30303' | ||
| CONFIG_EL_PEERS_IPS='${GATEWAY}' | ||
| CONFIG_SIMULATOR_RPC_URL='http://${GATEWAY}:8545' | ||
| CONFIG_SIMULATOR_WS_URL='ws://${GATEWAY}:8546' | ||
| CONFIG_SIMULATOR_IP='${GATEWAY}' | ||
| EOF | ||
|
|
||
| elif [ "$MODE" = "vault" ]; then |
I think we should find a better approach than this if/elif approach
I agree, that depends on the merge efforts.
Note: this logic isn't introduced in this PR; it was moved from the bob-l2 image to the base bob-common so that both l1 and l2 can use it.
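One possible direction for replacing the if/elif chain, offered only as a sketch: one function per mode plus a `case` dispatch that rejects unknown modes, which the current chain silently ignores. The function names (`configure_qemu`, `configure_vault`, `apply_mode`) are hypothetical and only one config key is written:

```sh
#!/bin/sh
set -eu

# One function per mode; each appends its settings to the config file.
configure_qemu() {
    printf "CONFIG_NETWORK_NAME='%s'\n" 'local-testnet' >> "$1"
}

configure_vault() {
    printf "CONFIG_NETWORK_NAME='%s'\n" "${CONFIG_NETWORK_NAME:-mainnet}" >> "$1"
}

# Dispatch on the mode; unknown modes fail loudly instead of being skipped.
apply_mode() {
    mode="$1"
    config_path="$2"
    case "$mode" in
        qemu|vault) "configure_$mode" "$config_path" ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
}

demo_config="$(mktemp)"
apply_mode qemu "$demo_config"
cat "$demo_config"
```

Adding a mode then means adding one function and one word in the `case` pattern, and each image could supply its own `configure_*` functions.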
This pull request introduces a comprehensive observability and monitoring stack to the project, centered around Prometheus and its exporters. It adds Prometheus, node-exporter, and process-exporter as services, configures them for system and container-level metrics collection, and sets up recording rules for aggregated metrics. The changes also include dynamic configuration improvements, firewall adjustments for metrics endpoints, and new helper scripts for environment-specific configuration.
Observability & Monitoring Integration
- Prometheus, node-exporter, and process-exporter added as services (`prometheus.service`, `node-exporter.service`, `process-exporter.service`).
- Configuration for metrics collection and recording rules (`prometheus.yml.tmpl`, `recording_rules.yml`, `process-exporter.yml`).
- `gomplate` as a build dependency to render dynamic Prometheus configuration from templates.

Firewall & Networking
- Firewall adjustments for the metrics endpoint, using the `METRICS_ENDPOINT` variable loaded from configuration.
- `searcher-firewall.service` dependencies updated to ensure correct ordering with configuration fetching.

Dynamic Configuration
- Helper scripts for environment-specific configuration in `bob-l1` and `bob-l2`, supporting both QEMU development and Vault-based production environments. These scripts generate environment-specific config files based on mode and available secrets.

These changes collectively enable robust, flexible, and secure monitoring of both the host system and key containers, and prepare the environment for future observability enhancements.