GASBench

A benchmark evaluation package for discriminative models on Bittensor Subnet 34 (GAS - Generative Adversarial Subnet).

GASBench evaluates AI-generated content detection models across image, video, and audio modalities.

Overview

This package provides a self-contained benchmark evaluation system for testing models on diverse datasets:

  • Image, Video & Audio Benchmarks
    Test discriminative models on curated datasets for AI-generated content detection

  • Data Processing
    Dataset download, caching, preprocessing, and augmentation, with aspect-ratio preservation for image/video and standardized resampling/windowing for audio

  • Comprehensive Metrics
    Accuracy, MCC, cross-entropy, inference times, and per-dataset breakdowns

For model submission requirements, see the
👉 Safetensors Model Specification (required for competition)

Note: ONNX format is no longer accepted for competition. See ONNX.md for legacy reference.


Installation

GPU support is installed by default:

cd gasbench
pip install -e .

For CPU-only (no CUDA):

pip install -e .[cpu]
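
To verify which runtime you ended up with, a quick sanity check (this assumes PyTorch is among GASBench's dependencies, which the model requirements below imply):

import torch

print(f"PyTorch {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")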

Usage

Command Line Interface

# Run image benchmark with safetensors model
gasbench run --image-model ./my_image_model/ --debug

# Run video benchmark
gasbench run --video-model ./my_video_model/ --debug

# Run audio benchmark
gasbench run --audio-model ./my_audio_model/ --debug

# Full benchmark (all datasets)
gasbench run --image-model ./my_model/ --full

# Custom cache directory
gasbench run --video-model ./my_model/ --cache-dir /tmp/my_cache

# Save results to a specific directory
gasbench run --image-model ./my_model/ --results-dir ./results

# For SN34 miners: Use only gasstation datasets
gasbench run --image-model ./my_model/ --gasstation-only

The model directory must contain model_config.yaml, model.py, and a *.safetensors weights file.
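
For example, a minimal layout (the directory and weights file names are illustrative; only the three files above are required):

my_model/
├── model_config.yaml
├── model.py
└── model.safetensors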

Results are automatically saved to a timestamped JSON file.


Python API

import asyncio
from gasbench import run_benchmark, print_benchmark_summary, save_results_to_json

async def evaluate_model():
    results = await run_benchmark(
        model_path="path/to/my_model/",  # directory with model_config.yaml, model.py, *.safetensors
        modality="image",  # "image" | "video" | "audio"
        debug_mode=False,
        gasstation_only=False,
    )

    print_benchmark_summary(results)

    output_path = save_results_to_json(results, output_dir="./results")
    print(f"Results written to {output_path}")
    return results

results = asyncio.run(evaluate_model())

Model Requirements (High-Level)

  • Models must be submitted in safetensors format as a directory containing model_config.yaml, model.py, and *.safetensors weights.
  • Your model.py must define a load_model(weights_path, num_classes) function that returns a PyTorch nn.Module (a minimal sketch follows this list).
  • GASBench expects batched inputs (raw 0-255 pixel values for image/video, waveform tensors for audio) and logits outputs.
  • Image/video preprocessing (resize/crop/augment) and audio preprocessing (mono/resample/crop) are handled by GASBench. Input normalization should be done inside your model's forward() method.
  • Binary (real vs synthetic) classification with num_classes: 2 is the standard format.
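
The sketch below shows a model.py satisfying these requirements for the image modality. The tiny convolutional architecture is a placeholder for illustration, not part of the specification; only the load_model entry point, the nn.Module return type, in-forward normalization, and logits output are required.

import torch
import torch.nn as nn
from safetensors.torch import load_file

class Detector(nn.Module):
    """Illustrative binary detector; any nn.Module architecture works."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # GASBench passes raw 0-255 pixel batches; normalize here, inside forward().
        x = x.float() / 255.0
        # Return raw logits of shape (batch, num_classes); no softmax.
        return self.head(self.backbone(x))

def load_model(weights_path: str, num_classes: int) -> nn.Module:
    """Required entry point: build the model and load safetensors weights."""
    model = Detector(num_classes=num_classes)
    model.load_state_dict(load_file(weights_path))
    return model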

For full submission requirements, see:
👉 Safetensors Model Specification (required for competition)

Note: ONNX format is no longer accepted. See ONNX.md for legacy reference.


Metrics

  • sn34_score -- Primary competition metric. Geometric mean of normalized MCC and Brier score: $\sqrt{MCC_{norm}^{1.2} \cdot Brier_{norm}^{1.8}}$. Rewards both discrimination accuracy and probability calibration. A computation sketch follows this list.
  • binary_mcc -- Matthews Correlation Coefficient for binary real/synthetic classification (-1 to +1)
  • binary_brier -- Brier score measuring calibration quality (0 = perfect, 0.25 = random)
  • binary_cross_entropy -- Log-loss for predicted probabilities
  • benchmark_score -- Overall benchmark score
  • avg_inference_time_ms -- Mean inference time per sample
  • p95_inference_time_ms -- 95th percentile inference time
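
The binary metrics correspond to standard scikit-learn functions, shown below for a toy set of labels and predicted probabilities. The normalizations feeding sn34_score (MCC rescaled from [-1, 1] to [0, 1]; Brier inverted against the 0.25 random baseline) are an assumption for illustration, not necessarily GASBench's exact rescaling.

import numpy as np
from sklearn.metrics import matthews_corrcoef, brier_score_loss, log_loss

y_true = np.array([0, 1, 1, 0, 1])             # 0 = real, 1 = synthetic
y_prob = np.array([0.1, 0.8, 0.7, 0.3, 0.9])   # predicted P(synthetic)
y_pred = (y_prob >= 0.5).astype(int)

mcc = matthews_corrcoef(y_true, y_pred)    # binary_mcc, in [-1, +1]
brier = brier_score_loss(y_true, y_prob)   # binary_brier, 0 = perfect
ce = log_loss(y_true, y_prob)              # binary_cross_entropy

# Assumed normalizations (illustrative only):
mcc_norm = (mcc + 1) / 2                   # [-1, +1] -> [0, 1]
brier_norm = max(0.0, 1 - brier / 0.25)    # 0.25 (random) -> 0, 0 (perfect) -> 1
sn34_score = (mcc_norm**1.2 * brier_norm**1.8) ** 0.5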

Cache Directory

By default, datasets are cached at /tmp/benchmark_data/. Specify a custom location with the cache_dir parameter or the --cache-dir flag.


JSON Output Structure

A structured JSON summary is automatically generated after each run:

{
  "metadata": {
    "run_id": "64a8c5eb-560a-4822-9ae5-b51d27737831",
    "timestamp": 1765074390.323,
    "datetime": "2025-12-07T02:26:30.323068",
    "model_path": "./my_audio_model/",
    "modality": "audio",
    "mode": "full",
    "gasstation_only": false,
    "benchmark_completed": true,
    "duration_seconds": 1054.31
  },
  "overall_score": 0.8523,
  "validation": {
    "model_path": "./my_audio_model/",
    "num_classes": 2,
    "weights_file": "model.safetensors"
  },
  "errors": [],
  "results": {
    "total_samples": 5000,
    "correct_predictions": 4261,
    "accuracy": 0.8522,
    "avg_inference_time_ms": 12.3,
    "p95_inference_time_ms": 45.6,
    "binary_mcc": 0.7045,
    "binary_brier": 0.1523,
    "binary_cross_entropy": 0.2345,
    "sn34_score": 0.7891
  },
  "accuracy_by_media_type": {
    "real": {
      "samples": 2000,
      "correct": 1780,
      "accuracy": 0.89
    },
    "synthetic": {
      "samples": 2500,
      "correct": 2200,
      "accuracy": 0.88
    }
  },
  "dataset_info": {
    "datasets_used": ["dataset-a", "dataset-b", "..."],
    "evaluation_type": "synthetic_detection",
    "dataset_media_types": {
      "dataset-a": "real",
      "dataset-b": "synthetic"
    },
    "samples_per_dataset": {
      "dataset-a": 100,
      "dataset-b": 100
    }
  }
}
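
Because the output is plain JSON, results are easy to inspect programmatically. A minimal sketch (the file name is a placeholder for the timestamped file GASBench actually writes):

import json

with open("results/benchmark_results.json") as f:  # placeholder file name
    report = json.load(f)

print(f"Overall score: {report['overall_score']:.4f}")
for metric in ("binary_mcc", "binary_brier", "sn34_score"):
    print(f"{metric}: {report['results'][metric]:.4f}")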
