Skip to content

Mitigate judge's order bias with swap mode#4

Merged
geoalgo merged 1 commit intoOpenEuroLLM:mainfrom
Ali-Elganzory:order_bias
Nov 17, 2025
Merged

Mitigate judge's order bias with swap mode#4
geoalgo merged 1 commit intoOpenEuroLLM:mainfrom
Ali-Elganzory:order_bias

Conversation

@Ali-Elganzory
Copy link
Copy Markdown
Member

No description provided.

@geoalgo geoalgo merged commit 609eaf3 into OpenEuroLLM:main Nov 17, 2025
1 check failed
judge_model=args.judge_model,
n_instructions=args.n_instructions,
provide_explanation=args.provide_explanation,
swap_mode=args.swap_mode,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point args.swap_mode is a string but we map it to a boolean, we should probably rather test args.swap_mode == "fixed" rather?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. This got lost in the change from correct_order_bias (bool) to swap_mode (str).

for annotation in annotations_reversed
]
)
prefs = (prefs + (1 - prefs_reversed)) / 2.0
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work in case of nans? Seems like return 2n battles would be cleaner?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does:

>>> pd.Series([1, 2, 3, None]) + pd.Series([1, 2, None, 4])
0    2.0
1    4.0
2    NaN
3    NaN
dtype: float64

Nonetheless, I think your suggestion is, generally, cleaner. How about the following?

prefs = pd.concat([prefs, (1 - prefs_reversed)]).reset_index(drop=True)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds good to me

@Ali-Elganzory Ali-Elganzory mentioned this pull request Nov 17, 2025
@Ali-Elganzory
Copy link
Copy Markdown
Member Author

I have created #6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants