fix(evaluator_storage): correct docstring on directory organization #925

kristol07 · 2025-11-12T10:00:22Z

AgentScope Version

commit: 5c3a770

I am working against the latest code on the main branch.

Description

fix(evaluator_storage): correct save path ordering in FileEvaluatorStorage

The docstring describes the directory structure as follows:

The files are organized in a directory structure:
    - save_dir/
        - evaluation_result.json
        - evaluation_meta.json
        - {task_id}/
            - {repeat_id}/
                - solution.json
                - evaluation/
                    - {metric_name}.json

But the implementation doesn't follow this structure.
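To make the mismatch concrete, here is a minimal sketch of the two orderings. It assumes, as the later discussion suggests, that the implementation nests the task directory under the repeat directory. The function names are hypothetical and are not part of the AgentScope API.

```python
from pathlib import Path


def documented_solution_path(save_dir: str, task_id: str, repeat_id: str) -> Path:
    """Layout as written in the docstring: task directory first, then repeat."""
    return Path(save_dir) / task_id / repeat_id / "solution.json"


def repeat_first_solution_path(save_dir: str, task_id: str, repeat_id: str) -> Path:
    """Assumed layout of the implementation: repeat directory first, then task."""
    return Path(save_dir) / repeat_id / task_id / "solution.json"


# The same (task, repeat) pair resolves to two different locations.
doc_path = documented_solution_path("results", "task_a", "0")
impl_path = repeat_first_solution_path("results", "task_a", "0")
print(doc_path.as_posix())
print(impl_path.as_posix())
```

Any tool that reads the saved files has to agree with whichever ordering the storage class actually writes, so the docstring and the code need to match.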

Checklist

Please check the following items before code is ready to be reviewed.

Code has been formatted with pre-commit run --all-files command
All tests are passing
Docstrings are in Google style
Related documentation has been updated (e.g. links, examples, etc.)
Code is ready for review

kristol07 · 2025-11-12T10:08:36Z

@qbc2016 @DavdGao Please take a look; this is a minor change.

DavdGao · 2025-11-17T04:08:45Z

@kristol07 Thanks for pointing out the issue, but it seems the typo is in the docstring rather than in the code implementation. Considering we are developing evaluation visualization in agentscope-studio with the current directory organization, maybe just fix the wrong description in the docstring instead?

kristol07 · 2025-11-17T04:26:41Z

@DavdGao I think the best approach depends on how you want to interpret or evaluate the results. In my situation, there are multiple distinct testing scenarios and I want to assess my agent's stability in each one, so I'm more interested in the outcomes of each repeated task within the same scenario. Grouping the results by task ID is therefore preferable in my case, which is why I thought it was a code error. On the other hand, if all the testing scenarios are of the same type, it makes more sense to group by repeat ID and review the overall results across all test scenarios; that may be the case for agentscope-studio.

Grouped by task (test case):

Grouped by repeatId:
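The two groupings can also be illustrated on flat result records. This is only a sketch: the records, the scores, and the helper `mean_score_by` are made up for illustration and do not come from AgentScope.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical evaluation records: (task_id, repeat_id, score).
records = [
    ("task_a", "0", 1.0),
    ("task_a", "1", 0.0),
    ("task_b", "0", 1.0),
    ("task_b", "1", 1.0),
]


def mean_score_by(rows, key_index):
    """Average the score over all rows that share the value at key_index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {key: mean(scores) for key, scores in groups.items()}


# Grouped by task: how stable is the agent within each scenario?
by_task = mean_score_by(records, 0)
# Grouped by repeat: how did each full run over all scenarios go?
by_repeat = mean_score_by(records, 1)
print(by_task)
print(by_repeat)
```

Both views are derived from the same records, so the directory ordering mainly affects which view is convenient to browse on disk.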

kristol07 · 2025-11-20T02:17:36Z

Hi @DavdGao, do you have any suggestions on how to offer developers this flexibility? As for your comment, the PR has already been updated.

cla-assistant · 2025-12-02T09:50:57Z

All committers have signed the CLA.

DavdGao

LGTM, and thanks for your contribution to the agentscope library

fix(evaluator_storage): correct save path ordering in FileEvaluatorSt…

c8b36f3

…orage

kristol07 closed this Nov 12, 2025

kristol07 reopened this Nov 12, 2025

DavdGao added the Documentation Improvements or additions to documentation label Nov 17, 2025

fix: docstring but not code impl

7cfb310

kristol07 changed the title ~~fix(evaluator_storage): correct save path ordering in FileEvaluatorSt…~~ fix(evaluator_storage): correct docstring on directory organization Nov 20, 2025

DavdGao approved these changes Dec 10, 2025
