KaiEvolve

An open-source evolutionary coding agent that uses LLMs to discover and optimize algorithms through iterative evolution, for scientific and algorithmic discovery.

Overview

KaiEvolve is an evolutionary coding agent that uses Large Language Models to automatically optimize and discover algorithms through iterative improvement. Starting from the AlphaEvolve research, it incorporates advanced features for reproducibility, multi-language support, sophisticated evaluation pipelines, and integration with cutting-edge LLM optimization techniques. It serves as both a research platform for evolutionary AI and a practical tool for automated code optimization.

What makes KaiEvolve different

A faithful AlphaEvolve loop gets you mutation, evaluation, and a quality-diversity archive. KaiEvolve adds the machinery that makes long runs actually compound — the search remembers what it learned and steers itself instead of re-deriving the problem every iteration:

Compounding memory (HMRD + running literature review) — every successful program carries a 4-section Hypothesis / Method / Result / Discussion summary. At each checkpoint, KaiEvolve distills only the new programs into a bounded "lab notebook" that is injected into every prompt and persists to disk, so the LLM builds on accumulated knowledge rather than a rotating window of neighbors — and a fresh run can pick up where the last one left off. See Compounding memory.
Adaptive model selection (Thompson sampling) — hand the ensemble several models with default weights and a Beta–Bernoulli bandit learns which one to spend on from the rewards each earns, instead of a fixed split. See Adaptive model selection.
Multi-file evolution — coordinate EVOLVE-BLOCKs across a whole directory of files by shared block ID, not just a single program. See Multi-file evolution.
Self-seeding runs — kai run --init-from <run_dir> starts a new run from a previous run's best program. See Seeding a run.
Human + automated steering — redirect a live run with a markdown steering brief, or let an opt-in research director meta-agent write strategic directives. See Watching and steering.
Noise-aware fitness — for stochastic evaluators, re-evaluate and average so selection isn't fooled by a single lucky draw (see examples/noisy_optimization/).
Two-phase warmup — optional prompt optimization (pairwise feedback descent) and hyperparameter tuning before the main run (see configs/twophase_warmup.yaml).

Core capabilities

Underneath those, KaiEvolve is a complete evolutionary coding system:

Evolutionary coding agent: LLM-guided evolution of whole code files, not just functions
MAP-Elites + island-based evolution: a quality-diversity archive across multiple populations with periodic migration, balancing exploration and exploitation
Inspiration vs. performance prompting: top performers and diverse inspirations are sampled separately, with multi-strategy (elite / diverse / exploratory) selection
Adaptive feature dimensions: default complexity & diversity axes, extensible to any metric your evaluator returns
Cascade evaluation + artifacts side-channel: multi-stage testing plus a feedback channel that hands build errors, tracebacks, and profiling data back to the LLM
LLM ensemble over any OpenAI-compatible API: OpenAI, Anthropic, Google, or local models, with optillm for test-time compute (MoA, reflection) and plugins
Multi-objective optimization with a combined_score convention for fitness
Checkpoint / resume of full evolution state, with robust loading
Scientific reproducibility: deterministic, per-component seeding (seed=42 by default)
Language agnostic: Python, Rust, R, Metal shaders, and more
Process-based parallelism: true parallel evaluation past Python's GIL, with memory/timeout limits
kai CLI + zero-build web viewer: live monitoring, side-by-side comparison, CSV/JSON export, and interactive solution visualizations (see Exploring runs)

How It Works

Each iteration runs the same loop, in a fresh process working from a database snapshot:

Prompt sampler builds a context-rich prompt: top-performing programs (for optimization guidance), diverse inspirations (for creative exploration), execution artifacts and error feedback, the running literature review, any active steering brief, and — via optillm plugins — dynamically fetched documentation.
LLM ensemble generates the mutation. With default model weights, a Thompson sampler picks which model to call from the rewards each has earned; with explicit weights it falls back to a fixed split. optillm adds test-time compute (MoA, reflection); model selection is deterministic under a fixed seed.
Evaluator pool runs multi-stage cascade evaluation in parallel under memory/timeout limits, collecting artifacts and optional LLM-based quality feedback. Stochastic evaluators can re-evaluate and average for noise-aware fitness.
Program database maps the result into the MAP-Elites archive across islands, tracks lineage and metadata, and — at checkpoints — distills new programs' HMRD summaries into the literature review for the next iterations.

Getting Started

Installation

To install natively, use:

git clone https://github.com/firstbatchxyz/kai-evolve.git
cd kai-evolve
pip install -e .

Optional extras:

pip install -e ".[viewer]"      # the `kai viewer` web UI (FastAPI + Jinja)
pip install -e ".[embeddings]"  # local embedding models for novelty detection /
                                # strategy clustering (pulls in PyTorch; the
                                # default "api" embedding backend needs no extra)

Quick Start

Setting up LLM Access

KaiEvolve uses the OpenAI SDK, which means it works with any LLM provider that supports an OpenAI-compatible API:

Set the API Key: Export the OPENAI_API_KEY environment variable:
```
export OPENAI_API_KEY=your-api-key-here
```
Using Alternative LLM Providers:
- For providers other than OpenAI (e.g., Anthropic, Cohere, local models), update the api_base in your config.yaml:
```
llm:
  api_base: "https://your-provider-endpoint.com/v1"
```
Maximum Flexibility with optillm:
- For advanced routing, rate limiting, or using multiple providers, we recommend optillm
- optillm acts as a proxy that can route requests to different LLMs based on your rules
- Simply point api_base to your optillm instance:
```
llm:
  api_base: "http://localhost:8000/v1"
```

This setup ensures KaiEvolve can work with any LLM provider - OpenAI, Anthropic, Google, Cohere, local models via Ollama/vLLM, or any OpenAI-compatible endpoint.

With access configured, a run can be started from Python:

import asyncio
import os

from kaievolve import KaiEvolve

# Ensure API key is set
if not os.environ.get("OPENAI_API_KEY"):
    raise ValueError("Please set OPENAI_API_KEY environment variable")

# Initialize the system
evolve = KaiEvolve(
    initial_program_path="path/to/initial_program.py",
    evaluation_file="path/to/evaluator.py",
    config_path="path/to/config.yaml"
)

# Run the evolution
best_program = asyncio.run(evolve.run(iterations=1000))
print("Best program metrics:")
for name, value in best_program.metrics.items():
    print(f"  {name}: {value:.4f}")

Command-Line Usage

KaiEvolve can also be run from the command line:

python kaievolve-run.py path/to/initial_program.py path/to/evaluator.py --config path/to/config.yaml --iterations 1000

For a concrete first run, try the bundled circle packing example:

pip install -r examples/circle_packing/requirements.txt  # scipy + matplotlib
python kaievolve-run.py examples/circle_packing/initial_program.py \
  examples/circle_packing/evaluator.py \
  --config examples/circle_packing/config_phase_1.yaml \
  --iterations 50

Note: if you omit --config, KaiEvolve warns and falls back to built-in defaults: models gpt-4o-mini/gpt-4o, with the API base taken from the OPENAI_API_BASE environment variable (default: https://openrouter.ai/api/v1). Those model names only resolve if your endpoint actually serves them, so passing an explicit config is recommended.

Resuming from Checkpoints

KaiEvolve automatically saves checkpoints at intervals specified by the checkpoint_interval config parameter (default is 10 iterations). You can resume an evolution run from a saved checkpoint:

python kaievolve-run.py path/to/initial_program.py path/to/evaluator.py \
  --config path/to/config.yaml \
  --checkpoint path/to/checkpoint_directory \
  --iterations 50

When resuming from a checkpoint:

The system loads all previously evolved programs and their metrics
Checkpoint numbering continues from where it left off (e.g., if loaded from checkpoint_50, the next checkpoint will be checkpoint_60)
All evolution state is preserved (best programs, feature maps, archives, etc.)
Each checkpoint directory contains a copy of the best program at that point in time

Example workflow with checkpoints:

# Run for 50 iterations (creates checkpoints at iterations 10, 20, 30, 40, 50)
python kaievolve-run.py examples/circle_packing/initial_program.py \
  examples/circle_packing/evaluator.py \
  --iterations 50

# Resume from checkpoint 50 for another 50 iterations (creates checkpoints at 60, 70, 80, 90, 100)
python kaievolve-run.py examples/circle_packing/initial_program.py \
  examples/circle_packing/evaluator.py \
  --checkpoint examples/circle_packing/kaievolve_output/checkpoints/checkpoint_50 \
  --iterations 50

Seeding a run from a previous best

Resuming continues an existing run's full state. To instead start a fresh run from where another one finished — a new config, a different model roster, or a clean archive seeded by the best solution so far — use --init-from with a run directory:

kai run --init-from bench_results/<run>/run_0 \
  examples/circle_packing/evaluator.py \
  --config examples/circle_packing/config_phase_2.yaml \
  --iterations 100

KaiEvolve locates that run's best program (<run_dir>/best/best_program.py, or the latest checkpoints/checkpoint_*/best_program.py) and uses it as the initial program — so you only need to pass the evaluator. This pairs naturally with the persistent literature review: point both runs at the same review file and knowledge compounds across them.
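
The lookup order described above can be sketched in a few lines. This is an illustration of the rule as stated, not KaiEvolve's own code: the function name `find_best_program` is hypothetical, and the assumption that "latest" means the highest checkpoint number is ours.

```python
import re
from pathlib import Path
from typing import Optional


def find_best_program(run_dir: str) -> Optional[Path]:
    """Return the best program for a run directory, or None if none exists."""
    root = Path(run_dir)

    # First choice: the run's final best program.
    final_best = root / "best" / "best_program.py"
    if final_best.is_file():
        return final_best

    # Fallback: the best program in the highest-numbered checkpoint.
    # Sorting numerically matters: "checkpoint_100" sorts before
    # "checkpoint_50" as a string.
    candidates = []
    for path in root.glob("checkpoints/checkpoint_*/best_program.py"):
        match = re.fullmatch(r"checkpoint_(\d+)", path.parent.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

Calling `find_best_program("bench_results/<run>/run_0")` would return the path that `--init-from` is described as using, under the assumptions above.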

Comparing Results Across Checkpoints

Each checkpoint directory contains the best program found up to that point, making it easy to compare solutions over time:

checkpoints/
  checkpoint_10/
    best_program.py         # Best program at iteration 10
    best_program_info.json  # Metrics and details
    programs/               # All programs evaluated so far
    metadata.json           # Database state
  checkpoint_20/
    best_program.py         # Best program at iteration 20
    ...

You can compare the evolution of solutions by examining the best programs at different checkpoints:

# Compare best programs at different checkpoints
diff -u checkpoints/checkpoint_10/best_program.py checkpoints/checkpoint_20/best_program.py

# Compare metrics
cat checkpoints/checkpoint_*/best_program_info.json | grep -A 10 metrics
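
For more than a quick look, the same comparison can be scripted. The sketch below reads every `best_program_info.json` in numeric checkpoint order. It assumes the file holds a top-level `"metrics"` object, which the `grep` above suggests but this README does not spell out; `checkpoint_metrics` is a hypothetical helper name.

```python
import json
import re
from pathlib import Path


def checkpoint_metrics(checkpoints_dir: str) -> list:
    """Collect (iteration, metrics) pairs from every checkpoint, in numeric order."""
    rows = []
    for info_path in Path(checkpoints_dir).glob("checkpoint_*/best_program_info.json"):
        match = re.fullmatch(r"checkpoint_(\d+)", info_path.parent.name)
        if not match:
            continue
        info = json.loads(info_path.read_text())
        # Assumption: metrics live under a top-level "metrics" key.
        rows.append((int(match.group(1)), info.get("metrics", {})))
    return sorted(rows, key=lambda row: row[0])
```

Looping over `checkpoint_metrics("checkpoints")` and printing each pair gives a per-checkpoint trajectory of the best program's metrics.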

Docker

You can also install and execute via Docker:

docker build -t kaievolve .
docker run --rm -v "$(pwd)":/app --network="host" kaievolve examples/circle_packing/initial_program.py examples/circle_packing/evaluator.py --config examples/circle_packing/config_phase_1.yaml --iterations 1000

Exploring runs: the `kai` CLI and web viewer

Evolution produces a lot of programs; the kai command makes them legible. Run kai with no arguments for a guided menu, or pick a subcommand:

| Command | What it does |
| --- | --- |
| `kai monitor` | Live dashboard, watch runs progress in real time |
| `kai runs` | One-shot table of every run and its score |
| `kai show` | Open one run and step through what the AI tried |
| `kai best` | Print (or save) the best solution a run found |
| `kai compare` | Compare setups side by side on a shared scale |
| `kai export` | Dump results to CSV/JSON for your own analysis |
| `kai steer` | Set a live steering brief a running job reads (human-in-the-loop) |
| `kai viewer` | Open the richer web view in your browser |
| `kai glossary` | Plain-language explanation of the terms |
| `kai run` | Run KaiEvolve on your own code (the optimizer itself) |

Point kai at your results with --root (or set KAI_ROOT); --configs tells it where each setup's config lives.

The web viewer

kai viewer is a local, dependency-light web view of a results tree, with no database and no build step. It gives an at-a-glance overview, a per-run dashboard, and a click-to-inspect drawer that couples each step's reasoning, code diff, and the solution it produced.

Overview. Every setup per task, ranked by score, with cost and a progress sparkline.

Run and step drawer. A run is a score trajectory beside a summary panel, then a full-width table of every step the AI tried. Click any row and a drawer opens with that program's notes, its diff against its parent, and an interactive visualization of the actual solution (for circle packing, the packing it produced; hover to inspect, scroll to zoom).

Solution visualizations (`visualize.py`)

The interactive solution view is driven by an optional per-task visualize.py that sits next to the task's evaluator.py and exposes a single function:

def render(program_path: str) -> str:
    """Return a self-contained HTML fragment visualizing the program's output."""

The viewer runs it lazily in a sandboxed subprocess and caches the result, so it works retroactively on existing runs. Tasks without a visualize.py simply omit the solution view. See skills/visualization/SKILL.md and the two reference visualizers under examples/alphaevolve/ (circle packing and the autocorrelation step function).

Watching and steering a live run

Active runs append a per-iteration progress.jsonl feed to their output directory; kai monitor and the web viewer read it, so both update in real time while a run is going (and keep working on finished runs).
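
If you want to consume the feed yourself, a JSON Lines file can be read one record per line. The sketch below is a generic reader, not the code `kai monitor` uses. It makes no assumption about which fields each record carries, and `read_progress` is a hypothetical name.

```python
import json
from pathlib import Path
from typing import Iterator


def read_progress(feed_path: str) -> Iterator[dict]:
    """Yield one record per complete JSON line, skipping blanks and partial writes."""
    path = Path(feed_path)
    if not path.exists():
        return
    with path.open() as feed:
        for line in feed:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # A live run may be mid-write on the last line; skip it for now.
                continue
```

Skipping undecodable lines is a deliberate choice here: on a live run the final line may be half-written, and the next read will pick it up once it is complete.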

Two steering channels can redirect a run while it is running:

Human steering brief — point prompt.steering_brief_path at a markdown file and edit it mid-run: its contents are re-read every iteration and injected into generation prompts, so directives like "focus on the inner loop" or "avoid approach X" take effect without restarting or touching config. kai steer sets, appends to, or shows that file from the terminal.
Research director (opt-in) — an automated meta-agent that periodically reads the population and writes a strategic directive into the same steering channel. Enable it with prompt.research_director_enabled: true; optionally set research_director_interval (defaults to the migration interval) and research_director_model for a dedicated reasoning model (defaults to the run's model roster). See skills/research-director/SKILL.md.

Both channels compose: the human brief and the director's directive are injected together.
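
One way to picture the composition: the brief file is read fresh on every call, so mid-run edits show up in the next prompt, and the director's directive is appended when present. This is a sketch only. The function name `load_steering` and the two section headings are hypothetical, not KaiEvolve's actual prompt layout.

```python
from pathlib import Path


def load_steering(brief_path, director_directive=""):
    """Combine the human brief (re-read from disk) with an optional director directive."""
    parts = []
    # Re-reading the file on each call is what lets mid-run edits take effect.
    if brief_path and Path(brief_path).is_file():
        brief = Path(brief_path).read_text().strip()
        if brief:
            parts.append("## Steering brief\n" + brief)
    if director_directive.strip():
        parts.append("## Research director\n" + director_directive.strip())
    return "\n\n".join(parts)
```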

Configuration

KaiEvolve is configured through a YAML file. The example below touches the main sections (llm, database, evaluator, and prompt):

# Example configuration showcasing advanced features
max_iterations: 1000
random_seed: 42  # Full reproducibility by default

llm:
  # Advanced ensemble configuration
  models:
    - name: "gemini-2.0-flash-lite"
      weight: 0.7
    - name: "moa&readurls-gemini-2.0-flash"  # optillm test-time compute
      weight: 0.3
  temperature: 0.7
  
database:
  # MAP-Elites configuration
  population_size: 500
  num_islands: 5  # Island-based evolution
  migration_interval: 20
  feature_dimensions: ["complexity", "diversity"]  # Default quality-diversity features
  
evaluator:
  # Advanced evaluation features
  enable_artifacts: true  # Capture execution feedback
  cascade_evaluation: true  # Multi-stage testing
  use_llm_feedback: true  # AI-based code quality assessment
  
prompt:
  # Sophisticated prompt engineering
  num_top_programs: 3      # Performance examples
  num_diverse_programs: 2  # Creative inspiration
  include_artifacts: true  # Execution feedback
  
  # Template customization
  template_dir: null               # Directory for custom prompt templates
  use_template_stochasticity: true # Enable random variations in prompts
  template_variations: {}          # Define variation placeholders

Sample configuration files are available in the configs/ directory:

default_config.yaml: Comprehensive configuration with all available options
island_config_example.yaml: Advanced island-based evolution setup

Template Customization

KaiEvolve supports advanced prompt template customization to increase diversity in code evolution:

Custom Templates with `template_dir`

You can override the default prompt templates by providing custom ones:

prompt:
  template_dir: "path/to/your/templates"

Create .txt files in your template directory with these names:

diff_user.txt - Template for diff-based evolution
full_rewrite_user.txt - Template for full code rewrites
evolution_history.txt - Format for presenting evolution history
top_program.txt - Format for top-performing programs
previous_attempt.txt - Format for previous attempts

Template Variations with Stochasticity

To add randomness to your prompts and prevent getting stuck in local optima:

Enable stochasticity in your config:

prompt:
  use_template_stochasticity: true
  template_variations:
    greeting:
      - "Let's improve this code."
      - "Time to enhance this program."
      - "Here's how we can optimize:"
    analysis_intro:
      - "Current metrics show"
      - "Performance analysis indicates"
      - "The evaluation reveals"

Use variation placeholders in your custom templates:

# custom_template.txt
{greeting}
{analysis_intro} the following results:
{metrics}

The system will randomly select one variation for each placeholder during prompt generation, creating diverse prompts that can lead to more creative code evolutions.

Note: The default templates don't include variation placeholders, so you'll need to create custom templates to use this feature effectively.
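
The selection step can be sketched with the standard library alone: pick one option per placeholder, then fill the template with `str.format`. This illustrates the behaviour described above rather than KaiEvolve's implementation. `render_template` and its parameters are hypothetical names.

```python
import random


def render_template(template: str, variations: dict, values: dict, rng: random.Random) -> str:
    """Pick one variation per placeholder, then fill the template."""
    chosen = {name: rng.choice(options) for name, options in variations.items() if options}
    # Explicit values (such as metrics) win over variation picks on a name clash.
    return template.format(**{**chosen, **values})


template = "{greeting}\n{analysis_intro} the following results:\n{metrics}"
variations = {
    "greeting": ["Let's improve this code.", "Time to enhance this program."],
    "analysis_intro": ["Current metrics show", "The evaluation reveals"],
}
prompt = render_template(template, variations, {"metrics": "score: 0.5"}, random.Random(42))
```

Passing a seeded `random.Random` keeps the chosen variations repeatable from run to run, in line with the seeding described elsewhere in this README.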

Feature Dimensions in MAP-Elites

Feature dimensions control how programs are organized in the MAP-Elites quality-diversity grid:

Default Features: If feature_dimensions is NOT specified in your config, KaiEvolve uses ["complexity", "diversity"] as defaults.

Built-in Features (always computed internally by KaiEvolve):

complexity: Code length (recommended default)
diversity: Code structure diversity compared to other programs (recommended default)

Only complexity and diversity are used as defaults because KaiEvolve computes them internally for any program, independent of what the evaluator returns.

Custom Features: You can mix built-in features with metrics from your evaluator:

database:
  feature_dimensions: ["complexity", "performance", "correctness"]  # Mix of built-in and custom
  # Per-dimension bin configuration (optional)
  feature_bins: 
    complexity: 10        # 10 bins for complexity
    performance: 20       # 20 bins for performance (from YOUR evaluator)
    correctness: 15       # 15 bins for correctness (from YOUR evaluator)

Important: KaiEvolve will raise an error if a specified feature is not found in the evaluator's metrics. This ensures your configuration is correct. The error message will show available metrics to help you fix the configuration.
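
To make the grid concrete, here is a minimal sketch of how a program's feature values could map to a cell and compete for it. It assumes fixed `(low, high)` ranges per dimension, which this README does not specify, and KaiEvolve may scale features differently. `bin_index` and `insert` are hypothetical names.

```python
def bin_index(value: float, lo: float, hi: float, bins: int) -> int:
    """Map a feature value in [lo, hi] to a bin index in [0, bins - 1]."""
    if hi <= lo:
        return 0
    fraction = (value - lo) / (hi - lo)
    return min(bins - 1, max(0, int(fraction * bins)))


def insert(archive: dict, program: dict, dims: list, ranges: dict, bins: dict) -> bool:
    """Place a program in its cell; keep it only if the cell is empty or it scores higher."""
    missing = [d for d in dims if d not in program["features"]]
    if missing:
        # Mirrors the documented behaviour: fail loudly and list what is available.
        raise KeyError(f"feature(s) {missing} not found; available: {sorted(program['features'])}")
    cell = tuple(bin_index(program["features"][d], *ranges[d], bins[d]) for d in dims)
    incumbent = archive.get(cell)
    if incumbent is None or program["score"] > incumbent["score"]:
        archive[cell] = program
        return True
    return False
```

Each cell holds at most one program, so a newcomer replaces the incumbent only when it scores higher. That is what keeps the archive diverse across cells while still improving quality within each one.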

See the Configuration Guide for a full list of options.

Default Metric for Program Selection

When comparing and selecting programs, KaiEvolve uses the following priority:

combined_score: If your evaluator returns a combined_score metric, it will be used as the primary fitness measure
Average of all metrics: If no combined_score is provided, KaiEvolve calculates the average of all numeric metrics returned by your evaluator

This ensures programs can always be compared even without explicit fitness definitions. For best results, consider having your evaluator return a combined_score that represents overall program fitness.
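
The priority rule can be written out as a small function. This is a sketch of the rule as stated, with two assumptions of ours: booleans are not counted as numeric metrics, and an evaluator that returns no numeric metrics scores 0.0. The name `fitness` is hypothetical.

```python
from numbers import Real


def fitness(metrics: dict) -> float:
    """Prefer combined_score; otherwise average the numeric metrics."""
    if "combined_score" in metrics:
        return float(metrics["combined_score"])
    numeric = [
        float(value)
        for value in metrics.values()
        if isinstance(value, Real) and not isinstance(value, bool)
    ]
    # Assumption: with nothing numeric to average, fall back to 0.0.
    return sum(numeric) / len(numeric) if numeric else 0.0
```

Note what the fallback implies: metrics on different scales (for example a runtime in seconds next to an accuracy in [0, 1]) get averaged as-is, which is one reason to return an explicit `combined_score`.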

Artifacts Channel

KaiEvolve includes an artifacts side-channel that lets evaluators capture build errors, profiling results, and similar output and pass it back to the LLM in subsequent generations. This gives the LLM context about what went wrong and how to fix it.

The artifacts channel operates alongside the traditional fitness metrics.

Example: Compilation Failure Feedback

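from kaievolve.evaluation_result import EvaluationResult


def evaluate(program_path):
    # Assumed entry point: the evaluator receives the path of the candidate program.
    # Report the failure as metrics plus artifacts the LLM can read next round.
    return EvaluationResult(
        metrics={"compile_ok": 0.0, "score": 0.0},
        artifacts={
            "stderr": "SyntaxError: invalid syntax (line 15)",
            "traceback": "...",
            "failure_stage": "compilation",
        },
    )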

The next generation prompt will include:

## Last Execution Output
### Stderr
SyntaxError: invalid syntax (line 15)

### Traceback
...

Example: LLM Feedback

The default evaluation template contains an example of an LLM artifact side channel. The template ends with:

Return your evaluation as a JSON object with the following format:
{{
    "readability": [score],
    "maintainability": [score],
    "efficiency": [score],
    "reasoning": "[brief explanation of scores]"
}}

The non-float values, in this case the "reasoning" key of the JSON response that the evaluator LLM generates, will be available within the next generation prompt.
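
The split between scores and side-channel text might look like the sketch below. It assumes the evaluator LLM's reply is a plain JSON object. `split_feedback` is a hypothetical name, and KaiEvolve's own parsing may differ, for example in how it handles malformed replies.

```python
import json


def split_feedback(response_text: str):
    """Split an evaluator LLM's JSON reply into numeric metrics and text artifacts."""
    payload = json.loads(response_text)
    metrics, artifacts = {}, {}
    for key, value in payload.items():
        # Numbers become metrics; everything else rides the artifacts channel.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            metrics[key] = float(value)
        else:
            artifacts[key] = value
    return metrics, artifacts


reply = '{"readability": 0.8, "maintainability": 0.7, "efficiency": 0.9, "reasoning": "Clear loops."}'
metrics, artifacts = split_feedback(reply)
```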

Configuration

Artifacts can be controlled via configuration and environment variables:

# config.yaml
evaluator:
  enable_artifacts: true

prompt:
  include_artifacts: true
  max_artifact_bytes: 4096  # 4KB limit in prompts
  artifact_security_filter: true

# Environment variable to disable artifacts
export ENABLE_ARTIFACTS=false

Benefits

Faster convergence - LLMs can see what went wrong and fix it directly
Better error handling - Compilation and runtime failures become learning opportunities
Rich debugging context - Full stack traces and error messages guide improvements
Zero overhead - When disabled, no performance impact on evaluation

Compounding memory: HMRD + literature review

A plain evolutionary loop is amnesiac: each prompt sees a handful of neighboring programs and re-derives the problem from scratch. KaiEvolve gives the search a memory in two layers.

HMRD summaries. Every successful program is asked to emit a tagged Hypothesis / Method / Result / Discussion prefix, which workers parse and attach to the program. This turns the population from a pile of code into a record of what was tried and why it did or didn't work. HMRD is on by default (prompt.require_hmrd) and also feeds the optional strategy clustering (prompt.strategy_clustering_enabled) and the approaches synthesis in the web viewer.

Running literature review. Opt in with prompt.literature_review_enabled and KaiEvolve maintains a bounded "lab notebook" distilled from those HMRD summaries. At each checkpoint it distills only the programs added since the last checkpoint and merges them into the review, then injects the review into every generation prompt. The cost stays flat regardless of population size, and because the review persists to a file it compounds across runs — a new run loads the accumulated notebook and builds on it.

prompt:
  require_hmrd: true                 # 4-section summaries (default on)
  literature_review_enabled: true    # maintain + inject the running review
  literature_review_path: null       # null => <output_dir>/literature_review.md;
                                      # set a stable path to accumulate across runs
  literature_review_max_chars: 4000  # bounded (~1000 tokens)
  literature_review_interval: null   # distill every N iters; null => checkpoint_interval

Set literature_review_path to a shared file and chain runs with --init-from to carry knowledge forward.

Adaptive model selection (Thompson sampling)

Picking a single model per run is a bet; static ensemble weights are a fixed bet. KaiEvolve can instead learn the split. When you list more than one model and leave the weights at their default (1.0), the ensemble switches on a Beta–Bernoulli Thompson sampler: it round-robins until each model has a few samples, then routes each call to the model whose recent reward distribution looks most promising — automatically shifting spend toward whatever is actually working on this task. Set explicit weights to opt out and use a fixed weighted split.

llm:
  models:
    - name: "google/gemini-2.5-flash-lite"   # no `weight:` => Thompson sampling
    - name: "google/gemini-2.5-flash"
  thompson_window_size: 50          # how many recent samples inform each model's estimate
  thompson_min_samples: 3           # round-robin until each model has this many samples
  thompson_exploration_bonus: 0.0   # >0 nudges toward less-used models

The reward a model earns can be the child's absolute score or its improvement over the parent (llm.reward_mode: "absolute" | "improvement", optionally cost-aware); selection is deterministic under a fixed seed.
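
For readers new to the technique, here is a compact Beta-Bernoulli Thompson sampler with a round-robin warm-up and a sliding reward window. It is a sketch, not KaiEvolve's implementation. The class and method names are hypothetical, the uniform Beta(1, 1) prior and the clipping of rewards to [0, 1] are our assumptions, and the exploration bonus and cost-aware rewards are left out.

```python
import random
from collections import deque


class ThompsonSelector:
    """Beta-Bernoulli Thompson sampling over a sliding window of recent rewards."""

    def __init__(self, models, window_size=50, min_samples=3, seed=42):
        self.models = list(models)
        self.min_samples = min_samples
        self.rng = random.Random(seed)  # seeded, so selection is repeatable
        self.rewards = {m: deque(maxlen=window_size) for m in self.models}
        self.calls = {m: 0 for m in self.models}

    def select(self):
        # Warm-up: round-robin by always serving the least-called model.
        least = min(self.models, key=lambda m: self.calls[m])
        if self.calls[least] < self.min_samples:
            return least

        # Draw one sample per model from Beta(1 + successes, 1 + failures)
        # and route the call to the model with the highest draw.
        def draw(model):
            wins = sum(self.rewards[model])
            losses = len(self.rewards[model]) - wins
            return self.rng.betavariate(1 + wins, 1 + losses)

        return max(self.models, key=draw)

    def update(self, model, reward):
        # Rewards are clipped to [0, 1] so they act as fractional successes.
        self.calls[model] += 1
        self.rewards[model].append(min(1.0, max(0.0, reward)))
```

In use, each iteration calls `select()`, runs the chosen model, and feeds the resulting reward to `update()`. Because old rewards fall out of the window, a model that stops paying off loses its share of calls over time.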

Multi-file evolution

Some problems don't live in one file — a kernel plus its launcher, a model plus its training loop. KaiEvolve can evolve EVOLVE-BLOCKs that span a directory, coordinated by a shared block ID. Mark the corresponding blocks in each file with the same ID, point kai run at the directory, and name the block to evolve:

# file_a.py
# EVOLVE-BLOCK-START id="kernel"
...
# EVOLVE-BLOCK-END

# file_b.py
# EVOLVE-BLOCK-START id="kernel"
...
# EVOLVE-BLOCK-END

kai run --directory path/to/project --block-id kernel \
  path/to/evaluator.py --config path/to/config.yaml --iterations 200

KaiEvolve presents the matching blocks to the LLM together, applies the evolved result back across the files, and evaluates the project as a whole. The best solution is written back to the original files at the end.
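
Gathering the matching blocks is mostly a parsing job. The sketch below collects every block with a given ID from the Python files under a directory, using the marker syntax shown above. `collect_blocks` is a hypothetical helper, restricting the scan to `*.py` is our simplification, and KaiEvolve's own parser may accept other comment styles.

```python
import re
from pathlib import Path

# Matches a START marker with an id, the block body, and the END marker.
BLOCK = re.compile(
    r'^[ \t]*# EVOLVE-BLOCK-START id="(?P<id>[^"]+)"[ \t]*\n'
    r"(?P<body>.*?)"
    r"^[ \t]*# EVOLVE-BLOCK-END[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def collect_blocks(directory: str, block_id: str) -> dict:
    """Map each file path to the bodies of its blocks carrying block_id."""
    found = {}
    for path in sorted(Path(directory).rglob("*.py")):
        bodies = [
            match.group("body")
            for match in BLOCK.finditer(path.read_text())
            if match.group("id") == block_id
        ]
        if bodies:
            found[str(path)] = bodies
    return found
```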

Examples

See the examples/ directory for complete examples of using KaiEvolve on various problems:

Mathematical Optimization

Circle Packing

Our implementation of the circle packing problem. For the n=26 case, we achieve state-of-the-art results matching published benchmarks.

[Figure: the best packing found by KaiEvolve after 800 iterations.]

AlphaEvolve Benchmark Tasks

examples/alphaevolve/

Five open mathematical problems from the AlphaEvolve paper (autocorrelation_C1, packing_circles_max_sum_of_radii, no_isosceles_triangles, happy_ending, unit_distances), each packaged as an initial program plus evaluator. Run them with the suite config tuned for these tasks, configs/bench_alphaevolve.yaml.

GPU Kernels

examples/kernelbench/

Evolving CUDA/GPU kernels for PyTorch operators (e.g. 23_softmax) against a reference implementation — the platform-optimization case, where fitness is raw throughput on real hardware.

Stochastic Evaluators

examples/noisy_optimization/

A minimal benchmark for noise-aware fitness: the evaluator reports a noisy estimate of each candidate's quality, so a single draw can mislead selection. Raising evaluator.re_evaluations above 1 averages repeats to shrink the variance — a worked example of evolving under a stochastic objective.
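
The effect of averaging is easy to reproduce in isolation. The sketch below repeats a noisy evaluation and reports the mean together with its standard error, which shrinks roughly as one over the square root of the number of repeats. The function names and the Gaussian toy evaluator are illustrative assumptions, not the example's actual code.

```python
import random
import statistics


def averaged_score(evaluate_once, candidate, re_evaluations=5):
    """Average repeated noisy evaluations; also return the standard error of the mean."""
    scores = [evaluate_once(candidate) for _ in range(re_evaluations)]
    mean = statistics.fmean(scores)
    if len(scores) > 1:
        stderr = statistics.stdev(scores) / len(scores) ** 0.5
    else:
        stderr = 0.0  # a single draw carries no variance estimate
    return mean, stderr


rng = random.Random(0)


def noisy_quality(candidate):
    # Toy stochastic evaluator: true quality plus unit-variance Gaussian noise.
    return candidate + rng.gauss(0.0, 1.0)


mean, stderr = averaged_score(noisy_quality, 2.0, re_evaluations=400)
```

Repeats cost evaluator time in direct proportion, so a small number is usually the practical choice; the 400 used here is only to make the shrinkage obvious.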

Preparing Your Own Problems

To use KaiEvolve for your own problems:

Mark code sections to evolve with # EVOLVE-BLOCK-START and # EVOLVE-BLOCK-END comments
Create an evaluation function that returns a dictionary of metrics
Configure KaiEvolve with appropriate parameters
Run the evolution process
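
As a minimal illustration of the first two steps, the sketch below shows a marked program and an evaluator side by side. It assumes the evaluator exposes a function named `evaluate` that takes the candidate's file path, as in upstream OpenEvolve; check the evaluator contract in your version. The `solve` task is a made-up example.

```python
# initial_program.py
# EVOLVE-BLOCK-START
def solve(values):
    """Return the values in ascending order (the part the LLM may rewrite)."""
    return sorted(values)
# EVOLVE-BLOCK-END


# evaluator.py (shown in the same block for brevity)
import importlib.util


def evaluate(program_path):
    """Load the candidate, run it on a fixed case, and return a metrics dict."""
    spec = importlib.util.spec_from_file_location("candidate", program_path)
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
        correct = module.solve([3, 1, 2]) == [1, 2, 3]
    except Exception:
        # Any crash in the candidate counts as a failed evaluation.
        correct = False
    score = 1.0 if correct else 0.0
    return {"correctness": score, "combined_score": score}
```

Returning `combined_score` alongside the individual metrics follows the selection convention described under Default Metric for Program Selection.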

Driving KaiEvolve from an LLM agent? skills/using-kaievolve/SKILL.md is an agent-facing reference for the full workflow (task setup, the evaluator contract, running, and reading results).

Acknowledgements

KaiEvolve is a fork of OpenEvolve by Asankhaya Sharma (@codelion), used under the Apache-2.0 license. OpenEvolve is itself an open-source implementation of Google DeepMind's AlphaEvolve. KaiEvolve builds on that foundation; the original copyright and license are preserved (see LICENSE and NOTICE).

Citation

If you use KaiEvolve in your research, please cite both KaiEvolve and the upstream OpenEvolve work it is built on:

@software{kaievolve,
  title = {KaiEvolve: an evolutionary coding agent for scientific and algorithmic discovery},
  author = {Dria},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/firstbatchxyz/kai-evolve}
}

@software{openevolve,
  title = {OpenEvolve: an open-source evolutionary coding agent},
  author = {Asankhaya Sharma},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/codelion/openevolve}
}
