A benchmarking tool for Valkey, an in-memory data store. Measures performance across different commits and configurations, including TLS and cluster modes.
- Benchmarks Valkey server with various commands (SET, GET, RPUSH, etc.)
- Tests with different data sizes and pipeline configurations
- Supports TLS and cluster mode testing
- Handles automatic server setup and teardown
- Collects detailed performance metrics and generates reports
- Compares performance between different Valkey versions/commits
- Provides CPU pinning via `taskset` using configuration file settings
- Runs continuous benchmarking via GitHub Actions workflows
- Tracks commits and manages progress automatically
- Includes Grafana dashboards for visualizing performance metrics
- Git
- Python 3.6+
- Linux environment (for taskset CPU pinning)
- Build tools required by Valkey (gcc, make, etc.)
- Install the Python modules required for this project: `pip install -r requirements.txt`
valkey-perf-benchmark/
├── .github/workflows/ # GitHub Actions workflows
│ ├── valkey_benchmark.yml # Continuous benchmarking workflow
│ ├── basic.yml # Basic validation tests
│ ├── check_format.yml # Code formatting checks
│ └── cluster_tls.yml # Cluster and TLS specific tests
├── configs/ # Benchmark configuration files
│ ├── benchmark-configs.json
│ └── benchmark-configs-cluster-tls.json
├── dashboards/ # Grafana dashboards and AWS infrastructure
│ ├── grafana/ # Grafana dashboard definitions and Helm config
│ ├── kubernetes/ # Kubernetes manifests (ALB Ingress)
│ ├── infrastructure/ # CloudFormation templates
│ ├── scripts/ # Phase-based deployment scripts (00-06)
│ ├── schema.sql # PostgreSQL database schema
│ └── README.md # Deployment documentation
├── utils/ # Utility scripts
│ ├── postgres_track_commits.py # Commit tracking and management
│ └── compare_benchmark_results.py # Result comparison utilities
├── benchmark.py # Main entry point
├── valkey_build.py # Handles building Valkey from source
├── valkey_server.py # Manages Valkey server instances
├── valkey_benchmark.py # Runs benchmark tests
├── process_metrics.py # Processes and formats benchmark results
└── requirements.txt # Python dependencies
Each benchmark run clones a fresh copy of the Valkey repository for the target commit. When --valkey-path is omitted, the repository is cloned into valkey_<commit> and removed after the run to maintain build isolation and repeatability.
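A minimal sketch of this clone-and-clean-up flow, assuming a git clone of the upstream Valkey repository and a plain make build (the directory naming follows the description above; this is illustrative, not the tool's actual code):

```python
import shutil
import subprocess

def benchmark_commit(commit, repo_url="https://github.com/valkey-io/valkey.git"):
    """Clone the target commit into an isolated per-commit directory, build, then clean up."""
    workdir = f"valkey_{commit}"  # per-commit directory, as described above
    subprocess.run(["git", "clone", repo_url, workdir], check=True)
    subprocess.run(["git", "checkout", commit], cwd=workdir, check=True)
    try:
        subprocess.run(["make"], cwd=workdir, check=True)  # build Valkey from source
        # ... start the server and run the benchmark here ...
    finally:
        shutil.rmtree(workdir)  # removed after the run for isolation and repeatability
```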
# Run both server and client benchmarks with default configuration
python benchmark.py
# Run only the server component
python benchmark.py --mode server
# Run only the client component against a specific server
python benchmark.py --mode client --target-ip 192.168.1.100
# Use a specific configuration file
python benchmark.py --config ./configs/my-custom-config.json
# Benchmark a specific commit
python benchmark.py --commits 1a2b3c4d
# Use a pre-existing Valkey dir
python benchmark.py --valkey-path /path/to/valkey
# Without --valkey-path a directory named valkey_<commit> is cloned and later removed
# Use a custom valkey-benchmark executable
python benchmark.py --valkey-benchmark-path /path/to/custom/valkey-benchmark
# Use an already running Valkey server
python benchmark.py --valkey-path /path/to/valkey --use-running-server
### Comparison Mode
# Compare with baseline
python benchmark.py --commits HEAD --baseline unstable
# Run multiple benchmark runs for statistical reliability
python benchmark.py --commits HEAD --runs 5

The project includes a comparison tool for analyzing benchmark results with statistical analysis and graph generation.
# Basic comparison between two result files
python utils/compare_benchmark_results.py --baseline results/commit1/metrics.json --new results/commit2/metrics.json --output comparison.md
# Generate graphs along with comparison
python utils/compare_benchmark_results.py --baseline results/commit1/metrics.json --new results/commit2/metrics.json --output comparison.md --graphs --graph-dir graphs/
# Filter to show only RPS metrics
python utils/compare_benchmark_results.py --baseline results/commit1/metrics.json --new results/commit2/metrics.json --output comparison.md --metrics rps --graphs
# Filter to show only latency metrics
python utils/compare_benchmark_results.py --baseline results/commit1/metrics.json --new results/commit2/metrics.json --output comparison.md --metrics latency --graphs

- Automatic Run Averaging: Groups and averages multiple benchmark runs with identical configurations
- Statistical Analysis: Calculates means, standard deviations, and Coefficient of Variation (CV) with sample standard deviation (n-1)
- Coefficient of Variation: Provides normalized variability metrics (CV = σ/μ × 100%) for scale-independent comparison across performance metrics
- Graph Generation: matplotlib-based visualization including:
- Consolidated comparison graphs for all metrics
- Variance line graphs showing individual run values with error bars
- RPS-focused filtering for integration purposes
- Metrics Filtering: Supports filtering by metric type (all, rps, latency)
- Standardized Output: Generates markdown reports with statistical information including CV
When multiple runs are available, the comparison tool displays comprehensive statistical information:
Metric Value (n=X, σ=Y, CV=Z%)
Where:
- n: Number of runs
- σ: Standard deviation
- CV: Coefficient of Variation as a percentage
The Coefficient of Variation (CV) is useful for:
- Scale-independent comparison: Compares variability across metrics with different units (e.g., RPS vs latency)
- Performance consistency assessment: Lower CV indicates more consistent performance
- Benchmark reliability evaluation: High CV indicates unstable test conditions
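For example, the sample standard deviation (n-1) and CV for the RPS values of repeated runs can be computed as follows (the run values are made up for illustration):

```python
import statistics

# RPS values from five hypothetical runs of the same configuration
rps_runs = [556204.44, 548300.12, 561820.90, 552110.35, 559400.78]

mean = statistics.mean(rps_runs)
sigma = statistics.stdev(rps_runs)  # sample standard deviation (n-1)
cv = sigma / mean * 100             # Coefficient of Variation as a percentage

# Matches the report format: Metric Value (n=X, σ=Y, CV=Z%)
print(f"rps {mean:.2f} (n={len(rps_runs)}, σ={sigma:.2f}, CV={cv:.2f}%)")
```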
- Consolidated Comparison Graphs: Single graphs showing all metrics with legend format `{commit}-P{pipeline}/IO{io_threads}`
- Variance Line Graphs: Individual run values with standard deviation visualization and error bars
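As a rough sketch of the kind of matplotlib output described above (the data, labels, and output file name are illustrative; the actual graphs are produced by compare_benchmark_results.py):

```python
import statistics
import matplotlib.pyplot as plt

# Illustrative per-run RPS values for two configurations, keyed by the
# {commit}-P{pipeline}/IO{io_threads} legend format described above
runs = {
    "1a2b3c4-P1/IO1": [541000, 545500, 539800],
    "1a2b3c4-P10/IO4": [556204, 548300, 561820],
}

fig, ax = plt.subplots()
for label, values in runs.items():
    xs = range(1, len(values) + 1)
    ax.errorbar(xs, values, yerr=statistics.stdev(values),
                marker="o", capsize=4, label=label)

ax.set_xlabel("Run")
ax.set_ylabel("Requests per second")
ax.set_title("Per-run RPS with standard-deviation error bars")
ax.legend()
fig.savefig("variance_example.png")
```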
# Use an already running Valkey server (client mode only with `--valkey-path`)
python benchmark.py --mode client --valkey-path /path/to/valkey --use-running-server
# Specify custom results directory
python benchmark.py --results-dir ./my-results
# Set logging level
python benchmark.py --log-level DEBUG
# Use a custom valkey-benchmark executable (useful for testing modified versions)
python benchmark.py --valkey-benchmark-path /path/to/custom/valkey-benchmark

When using --use-running-server or benchmarking remote servers, restarting the server between benchmark runs is the user's responsibility. Failure to restart between runs affects test results.
The --valkey-benchmark-path option specifies a custom path to the valkey-benchmark executable. This is useful when:
- Testing a modified version of `valkey-benchmark`
- Using a pre-built binary from a different location
- Benchmarking with a specific version of the tool
When not specified, the tool uses the default src/valkey-benchmark relative to the Valkey source directory.
# Example: Use a custom benchmark tool
python benchmark.py --valkey-benchmark-path /usr/local/bin/valkey-benchmark
# Example: Use with custom Valkey path
python benchmark.py --valkey-path /custom/valkey --valkey-benchmark-path /custom/valkey/src/valkey-benchmark
## Configuration
Create benchmark configurations in JSON format. Each object represents a single set of options; configurations are **not** automatically cross-multiplied. Example:
```json
[
{
"requests": [10000000],
"keyspacelen": [10000000],
"data_sizes": [16, 64, 256],
"pipelines": [1, 10, 100],
"commands": ["SET", "GET"],
"cluster_mode": "yes",
"tls_mode": "yes",
"warmup": 10,
"io-threads": [1, 4, 8],
"server_cpu_range": "0-1",
"client_cpu_range": "2-3"
}
]
```

| Parameter | Description | Data Type | Multiple Values |
|---|---|---|---|
| `requests` | Number of requests to perform | Integer | Yes |
| `keyspacelen` | Key space size (number of distinct keys) | Integer | Yes |
| `data_sizes` | Size of data in bytes | Integer | Yes |
| `pipelines` | Number of commands to pipeline | Integer | Yes |
| `clients` | Number of concurrent client connections | Integer | Yes |
| `commands` | Valkey commands to benchmark | String | Yes |
| `cluster_mode` | Whether to enable cluster mode | String ("yes"/"no") | No |
| `tls_mode` | Whether to enable TLS | String ("yes"/"no") | No |
| `warmup` | Warmup time in seconds before benchmarking | Integer | No |
| `io-threads` | Number of I/O threads for the server | Integer | Yes |
| `server_cpu_range` | CPU cores for the server (e.g. "0-3", "0,2,4", or "144-191,48-95") | String | No |
| `client_cpu_range` | CPU cores for the client (e.g. "4-7", "1,3,5", or "0-3,8-11") | String | No |
When warmup is provided for read commands, the benchmark performs three stages:
- A data injection pass using the corresponding write command with `--sequential` to seed the keyspace.
- A warmup run of the read command (without `--sequential`) for the specified duration.
- The main benchmark run of the read command.
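For a read command such as GET, those three stages could translate into valkey-benchmark invocations roughly like the following (the flags mirror standard valkey-benchmark options and the duration handling is a simplification; the exact commands the tool builds may differ):

```python
import subprocess

BENCH = "./src/valkey-benchmark"
REQUESTS, KEYSPACE, WARMUP_SECONDS = "10000000", "10000000", 10

# Stage 1: data injection with the corresponding write command, sequential keys
subprocess.run([BENCH, "-t", "SET", "-n", REQUESTS, "-r", KEYSPACE, "--sequential"], check=True)

# Stage 2: warmup run of the read command (no --sequential) for the configured duration
warmup = subprocess.Popen([BENCH, "-t", "GET", "-n", REQUESTS, "-r", KEYSPACE])
try:
    warmup.wait(timeout=WARMUP_SECONDS)
except subprocess.TimeoutExpired:
    warmup.terminate()  # stop the warmup once the warmup window has elapsed

# Stage 3: the main benchmark run of the read command
subprocess.run([BENCH, "-t", "GET", "-n", REQUESTS, "-r", KEYSPACE], check=True)
```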
Supported commands:
"SET", "GET", "RPUSH", "LPUSH", "LPOP", "SADD", "SPOP", "HSET", "GET", "MGET", "LRANGE", "SPOP", "ZPOPMIN"
Benchmark results are stored in the results/ directory, organized by commit ID:
results/
└── <commit-id>/
├── logs.txt # Benchmark logs
├── metrics.json # Performance metrics in JSON format
└── valkey_log_cluster_disabled.log # Valkey server logs
Sample `metrics.json`:

```json
[
{
"timestamp": "2025-05-28T01:29:42+02:00",
"commit": "ff7135836b5d9ccaa19d5dbaf2a0b0325755c8b4",
"command": "SET",
"data_size": 16,
"pipeline": 10,
"clients": 10000000,
"requests": 10000000,
"rps": 556204.44,
"avg_latency_ms": 0.813,
"min_latency_ms": 0.28,
"p50_latency_ms": 0.775,
"p95_latency_ms": 1.159,
"p99_latency_ms": 1.407,
"max_latency_ms": 2.463,
"cluster_mode": false,
"tls": false
}
]
```

The project includes several GitHub Actions workflows for automated testing and deployment:
- `valkey_benchmark.yml`: Continuous benchmarking workflow that runs on self-hosted EC2 runners
  - Benchmarks new commits from the Valkey unstable branch
  - Manages commit tracking via PostgreSQL
  - Uploads results to S3 and pushes metrics to PostgreSQL
  - Supports manual triggering with configurable commit limits
- `basic.yml`: Basic validation and testing
- `check_format.yml`: Code formatting validation
- `cluster_tls.yml`: Tests for cluster and TLS configurations
The system uses PostgreSQL to track benchmarking progress. Commits are stored in the benchmark_commits table with their status, configuration, and architecture.
- `in_progress`: Workflow has selected the commit and is running the benchmark
- `complete`: Full workflow completed; metrics are in PostgreSQL and available in dashboards
The system cleans up in_progress entries on the next workflow run. This ensures:
- Failed benchmark runs are retried
- Commits stuck in progress do not block future runs
- The tracking reflects completed work accurately
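A hedged sketch of what that cleanup could look like against the benchmark_commits table, using psycopg2 (the DSN environment variable and column names are assumptions; see utils/postgres_track_commits.py for the real logic):

```python
import os
import psycopg2

# Connection string is assumed to come from the environment, e.g. a GitHub Actions secret
conn = psycopg2.connect(os.environ["POSTGRES_DSN"])

with conn, conn.cursor() as cur:
    # Remove entries left in_progress by a previous (possibly failed) run so the
    # corresponding commits are picked up and benchmarked again on this run.
    # (The "status" column name is an assumption for this sketch.)
    cur.execute("DELETE FROM benchmark_commits WHERE status = %s", ("in_progress",))
    print(f"Cleared {cur.rowcount} stale in_progress entries")

conn.close()
```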
Each commit is tracked with the actual configuration content used for benchmarking. This provides:
- Comparison of results across different configurations by actual content
- Tracking of exact config parameters used for each benchmark
- Detection when config content changes even if file name stays the same
- Benchmarking the same commit with different configs
- Reproducibility by storing complete config data
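One illustrative way to identify a configuration by content rather than by file name is to hash a normalized form of the JSON; this is a sketch of the general idea, not necessarily how the tracking scripts implement it:

```python
import hashlib
import json

def config_fingerprint(path):
    """Return a stable hash of a config file's content, independent of formatting."""
    with open(path) as f:
        config = json.load(f)
    # Re-serialize with sorted keys so whitespace and key order do not change the hash
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# The same settings produce the same fingerprint even if the file is renamed;
# any parameter change produces a new fingerprint.
print(config_fingerprint("configs/benchmark-configs.json"))
```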
The project includes a complete AWS infrastructure for visualizing benchmark results using Grafana:
- AWS EKS Fargate - Serverless Kubernetes (no EC2 nodes)
- Amazon RDS PostgreSQL - Stores benchmark metrics and Grafana configuration
- CloudFront CDN - Global content delivery with HTTPS
- Application Load Balancer - Secured for CloudFront-only access
See dashboards/README.md for complete deployment guide and architecture details.
- `utils/postgres_track_commits.py`: Manages commit tracking, status updates, and cleanup operations using PostgreSQL
- `utils/push_to_postgres.py`: Pushes benchmark metrics to PostgreSQL with dynamic schema support
- `utils/compare_benchmark_results.py`: Compares benchmark results across commits
- `configs/benchmark-configs.json`: Standard benchmark configurations
- `configs/benchmark-configs-cluster-tls.json`: Specialized configurations for cluster and TLS testing
For local development, simply run:
python benchmark.py

Create new JSON configuration files in the configs/ directory following the existing format. Each configuration object represents a benchmark scenario.
Please see LICENSE.md for license details.