Refactor humaneval_infilling.py to load multiple subsets from the dataset and remove TODO comment by Ki-Seki · Pull Request #105 · SculptAI/GIMBench

Ki-Seki · 2026-04-13T14:14:15Z

No description provided.


Copilot

Pull request overview

Refactors the HumanEval infilling benchmark entrypoint to load and evaluate multiple dataset subsets instead of a single split, and removes an outdated TODO about the dataset repo.

Changes:

Load four dataset subsets (MultiLine, RandomSpan, RandomSpanLight, SingleLine) and concatenate them into a single dataset.
Shuffle the concatenated dataset with the configured seed before evaluation.
Remove the TODO comment about needing dataset repo repairs.

Comments suppressed due to low confidence (1)

src/gimbench/code/humaneval_infilling.py:28

This script hard-codes split="test" while other dataset entrypoints store the split in args.dataset["split"] and pass it through to load_dataset. For consistency and configurability (and to avoid hidden behavioral changes if the desired split differs), add a split key to args.dataset and use it when loading each subset.

    args.dataset = {
        "path": "Sculpt-AI/humaneval_infilling",
        "subsets": ["MultiLine", "RandomSpan", "RandomSpanLight", "SingleLine"],
    }

    ds = concatenate_datasets(
        [load_dataset(args.dataset["path"], split="test", name=subset) for subset in args.dataset["subsets"]]
    ).shuffle(seed=args.seed)
    logger.info(f"Loaded {len(ds)} samples from dataset {args.dataset}")
    logger.info(f"Columns: {ds.column_names}")
    logger.info(f"First sample: {ds[0]}")

    conduct_eval(args, ds)
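A minimal sketch of what the suggestion above could look like, assuming a `split` key is added to `args.dataset`. The helper names `subset_load_kwargs` and `load_all_subsets` are hypothetical and not part of GIMBench; the `"test"` fallback only mirrors the value currently hard-coded in the script.

```python
def subset_load_kwargs(dataset_cfg):
    """Build one set of load_dataset keyword arguments per subset.

    Hypothetical helper: reads the split from the config instead of
    hard-coding it, falling back to "test" (the current behavior).
    """
    split = dataset_cfg.get("split", "test")
    return [
        {"path": dataset_cfg["path"], "name": subset, "split": split}
        for subset in dataset_cfg["subsets"]
    ]


def load_all_subsets(dataset_cfg, seed):
    """Load every configured subset, concatenate them, and shuffle with the seed."""
    # Imported lazily so the kwargs helper can be used without `datasets` installed.
    from datasets import concatenate_datasets, load_dataset

    parts = [load_dataset(**kwargs) for kwargs in subset_load_kwargs(dataset_cfg)]
    return concatenate_datasets(parts).shuffle(seed=seed)
```

With this shape, the entrypoint would set `args.dataset["split"] = "test"` next to `path` and `subsets`, and call `load_all_subsets(args.dataset, args.seed)` in place of the inline list comprehension.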



Refactor humaneval_infilling.py to load multiple subsets from the dat…

1e2fe82

…aset and remove TODO comment

Copilot AI review requested due to automatic review settings April 13, 2026 14:14

[pre-commit.ci] auto fixes from pre-commit.com hooks

8b3dd46

for more information, see https://pre-commit.ci

Copilot started reviewing on behalf of Ki-Seki April 13, 2026 14:14

Copilot AI reviewed Apr 13, 2026


Ki-Seki merged commit b2fc9da into main Apr 13, 2026
3 checks passed

Ki-Seki deleted the fix/subset branch April 13, 2026 14:20

Ki-Seki merged 2 commits into main from fix/subset
