Add security benchmark with ASTRA #361

Open
XZ-X wants to merge 5 commits into OpenHands:main from XZ-X:astra-dev

XZ-X commented Jan 26, 2026

We use ASTRA to generate a red-teaming dataset based on the security policy of the OpenHands coding agent. The dataset is publicly available here.

This PR contains code for downloading the ASTRA dataset, running inference on it, and reporting performance.

@juanmichelini juanmichelini self-requested a review January 29, 2026 16:23
@juanmichelini (Collaborator)

@XZ-X thanks for the PR! I understand it is a draft PR, but just so it is on your radar: could you add a README that includes example commands to run?

@juanmichelini juanmichelini removed their request for review March 13, 2026 21:14
XZ-X (Author) commented Apr 27, 2026

Hi @juanmichelini, sorry for the delay. I added the README and tested my script against the latest branches of the benchmark repo. Thank you for your suggestions.

@XZ-X XZ-X marked this pull request as ready for review April 27, 2026 07:42