Implementation Checklist

Use this checklist when changing protocol, data loading, credit assignment, evaluation, or paper-facing scripts.

Task and dataset contracts

  • load_task() exposes train_datasets and eval_suites in a form directly consumable by load_task_datasets().
  • Local dataset paths resolve correctly even when the current working directory is not the repo root.
  • Eval suite names propagate unchanged into dataset datasource names.
  • Fixture tasks under tests/fixtures/tasks/ still load successfully.
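One way to keep local dataset paths independent of the current working directory is to resolve them against an explicit repo root. The sketch below is illustrative only: `resolve_dataset_path` and its `repo_root` parameter are hypothetical names, not the repo's actual helpers.

```python
from pathlib import Path


def resolve_dataset_path(raw_path: str, repo_root: Path) -> Path:
    """Resolve a dataset path against repo_root instead of the process cwd.

    Hypothetical helper: absolute paths are returned as given, relative
    paths are anchored at repo_root.
    """
    path = Path(raw_path).expanduser()
    if path.is_absolute():
        return path
    # Anchoring at repo_root means os.getcwd() never influences the result.
    return (repo_root / path).resolve()
```

A caller would typically derive `repo_root` from a module's `__file__` rather than from the working directory, so the same task config loads from any launch location.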

MAS protocol contracts

  • RoleGraph still rejects duplicate names, missing dependencies, and cycles.
  • Prompt rendering still supports {question}, {context}, and prior role outputs without raising on missing keys.
  • The paper-default Reasoner -> Actor path remains intact in configs/tasks/*.yaml and configs/roles/**/*.json.
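The first two checks above can be illustrated with a minimal sketch. It is not the repo's `RoleGraph` implementation: `validate_role_graph` and `render_prompt` are hypothetical names, the `(name, dependencies)` input shape is an assumption, and leaving unknown placeholders untouched is only one possible way of not raising on missing keys.

```python
from collections import deque


def validate_role_graph(roles):
    """Validate a list of (name, [dependency names]) pairs.

    Returns a topological order, or raises ValueError on duplicate names,
    missing dependencies, or cycles. Hypothetical stand-in for RoleGraph.
    """
    names = []
    seen = set()
    for name, _ in roles:
        if name in seen:
            raise ValueError(f"duplicate role name: {name!r}")
        seen.add(name)
        names.append(name)

    for name, deps in roles:
        for dep in deps:
            if dep not in seen:
                raise ValueError(f"role {name!r} depends on unknown role {dep!r}")

    # Kahn's algorithm: repeatedly emit roles whose dependencies are all emitted.
    indegree = {name: len(set(deps)) for name, deps in roles}
    dependents = {name: [] for name in names}
    for name, deps in roles:
        for dep in set(deps):
            dependents[dep].append(name)

    queue = deque(name for name in names if indegree[name] == 0)
    order = []
    while queue:
        current = queue.popleft()
        order.append(current)
        for child in dependents[current]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if len(order) != len(names):
        raise ValueError("cycle detected in role graph")
    return order


class _KeepMissing(dict):
    """Mapping that echoes unknown placeholders back instead of raising KeyError."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_prompt(template, **values):
    """Fill known placeholders and leave unknown ones untouched (assumed policy)."""
    return template.format_map(_KeepMissing(values))
```

With these definitions, `validate_role_graph([("reasoner", []), ("actor", ["reasoner"])])` yields the Reasoner-then-Actor order, and `render_prompt("{question} {context}", question="2+2?")` leaves `{context}` in place rather than raising.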

Algorithm and credit contracts

  • The paper-facing C3 path remains centered on openrlhf/trainer/ppo_utils/experience_maker.py plus c3/credit/c3/*.
  • c3/algorithms/c3.py remains documented as a fallback path, not the primary implementation.
  • Changes to marl_algorithm=auto behavior are intentional and documented.

Evaluation and aggregation contracts

  • MathEnv and CodeEnv reward entrypoints remain compatible with the rollout metadata contract.
  • main_results.py still aggregates by the expected benchmark names:
    • math: MATH500, CMATH-test, GSM8K-test
    • code: MBPP-test, MBPP+
  • Analysis bucket metadata remains compatible with c3/analysis/metrics.py and c3/tools/analysis_results.py.
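The benchmark-name contract can be checked with a small guard like the one below. This is a sketch under assumptions: `EXPECTED_BENCHMARKS` copies the names listed above, while `aggregate_scores` and the `(benchmark_name, score)` record shape are hypothetical and do not reflect how `main_results.py` is actually written.

```python
from collections import defaultdict
from statistics import mean

# Benchmark names taken from the checklist above, grouped by domain.
EXPECTED_BENCHMARKS = {
    "math": ["MATH500", "CMATH-test", "GSM8K-test"],
    "code": ["MBPP-test", "MBPP+"],
}


def aggregate_scores(records):
    """Average scores per benchmark, grouped by domain.

    records: iterable of (benchmark_name, score) pairs (assumed shape).
    Raises KeyError on any name outside EXPECTED_BENCHMARKS, so a renamed
    eval suite fails loudly instead of silently dropping out of the table.
    """
    known = {
        name for names in EXPECTED_BENCHMARKS.values() for name in names
    }
    scores = defaultdict(list)
    for name, score in records:
        if name not in known:
            raise KeyError(f"unexpected benchmark name: {name}")
        scores[name].append(score)

    return {
        domain: {name: mean(scores[name]) for name in names if name in scores}
        for domain, names in EXPECTED_BENCHMARKS.items()
    }
```

Failing on unknown names is a deliberate choice in this sketch: because eval suite names propagate unchanged into datasource names, a typo upstream would otherwise produce an empty column rather than an error.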

Tests and release gate

  • pytest -q tests passes.
  • Fixture-based smoke tests pass for both math and code tasks.
  • bash scripts/audit/pre_release.sh passes.
  • bash scripts/audit/release_gate.sh passes.
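The commands named above can be chained in a single fail-fast runner. The sketch below is hypothetical (`run_gate` is not part of the repo) and assumes it is launched from the repo root; the fixture-based smoke step is omitted because the checklist does not give its command.

```python
import subprocess

# Commands copied from the checklist above; assumed to run from the repo root.
GATE_COMMANDS = [
    ["pytest", "-q", "tests"],
    ["bash", "scripts/audit/pre_release.sh"],
    ["bash", "scripts/audit/release_gate.sh"],
]


def run_gate(commands=GATE_COMMANDS, run=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Returns a list of (command string, return code) for the commands that ran.
    The `run` parameter is injectable so the gate logic can be tested without
    executing anything.
    """
    results = []
    for command in commands:
        completed = run(command)
        results.append((" ".join(command), completed.returncode))
        if completed.returncode != 0:
            break
    return results
```

Calling `run_gate()` with the defaults executes the real commands; the gate passed only if every entry in the returned list has return code 0 and all three commands appear.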

Documentation sync

  • If the implementation path changed, update docs/CODE_MAP.md.
  • If the paper-facing mapping changed, update docs/IMPLEMENTATION_AUDIT.md.
  • If release behavior changed, update README.md, docs/GETTING_STARTED.md, and docs/RELEASE_POLICY.md.