Migrate language S3 to brainscore-storage bucket #374

Open
KartikP wants to merge 4 commits into main from migrate-s3-to-brainscore-storage

KartikP commented Feb 18, 2026

Summary

  • Updates brainscore_language/utils/s3.py to use brainscore-storage bucket with brainscore-language/ key prefix
  • Aligns with the same S3 pattern used by brainscore-vision (bucket path splitting via core's load_assembly_from_s3)
  • The old brainscore-language standalone bucket is no longer referenced

Test plan

  • Verify uploads land at s3://brainscore-storage/brainscore-language/{filename}
  • Verify downloads resolve from https://brainscore-storage.s3.amazonaws.com/brainscore-language/{filename}
  • Migrate existing objects from old bucket to new path
  • Run Pereira2018 linear benchmarks to confirm ceiling loading still works
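The path layout in the summary and test plan can be sketched as below. This is a minimal illustration only: the bucket name, key prefix, and URL shapes come from this PR description, while the helper names (`split_bucket_path`, `s3_key`, `upload_uri`, `download_url`) are hypothetical and are not the actual contents of brainscore_language/utils/s3.py or of core's load_assembly_from_s3.

```python
# Hypothetical helpers illustrating the bucket/prefix layout from this PR.
# Only the bucket name, prefix, and URL shapes are taken from the description.

BUCKET_PATH = "brainscore-storage/brainscore-language"


def split_bucket_path(bucket_path: str) -> tuple[str, str]:
    """Split 'bucket/prefix' into (bucket, prefix) at the first slash."""
    bucket, _, prefix = bucket_path.partition("/")
    return bucket, prefix


def s3_key(filename: str, bucket_path: str = BUCKET_PATH) -> str:
    """Object key under the shared bucket, e.g. 'brainscore-language/<filename>'."""
    _, prefix = split_bucket_path(bucket_path)
    return f"{prefix}/{filename}" if prefix else filename


def upload_uri(filename: str, bucket_path: str = BUCKET_PATH) -> str:
    """Where an upload is expected to land."""
    bucket, _ = split_bucket_path(bucket_path)
    return f"s3://{bucket}/{s3_key(filename, bucket_path)}"


def download_url(filename: str, bucket_path: str = BUCKET_PATH) -> str:
    """Public HTTPS URL a download is expected to resolve from."""
    bucket, _ = split_bucket_path(bucket_path)
    return f"https://{bucket}.s3.amazonaws.com/{s3_key(filename, bucket_path)}"
```

Calling `upload_uri` and `download_url` with the same filename gives the two locations the first two test-plan items are meant to check.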

Language was using a standalone "brainscore-language" bucket for uploads
and a hardcoded URL for downloads. This aligns with the vision pattern:
bucket=brainscore-storage with key prefix brainscore-language/.

Existing data on the old brainscore-language bucket will need to be
migrated separately for the linear ceilings to continue loading.

The brainscore-storage bucket does not have versioning enabled, so passing
version_id strings causes 400 errors. Make version_id optional (default None)
in load_from_s3 and remove all version_id values from data registrations,
ceiling kwargs, and packaging scripts. sha1 still provides integrity checks.
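
One way to picture the optional version_id change is the sketch below. It assumes a boto3-style `get_object` call, whose `Bucket`, `Key`, and `VersionId` parameters are real; the helper names `get_object_kwargs` and `verify_sha1` are hypothetical, and the actual load_from_s3 signature is not shown in this PR description.

```python
import hashlib
from typing import Optional


def get_object_kwargs(bucket: str, key: str, version_id: Optional[str] = None) -> dict:
    """Build keyword arguments for a boto3-style get_object call.

    VersionId is only included when a version is explicitly given, so a
    bucket without versioning never receives a version string.
    """
    kwargs = {"Bucket": bucket, "Key": key}
    if version_id is not None:
        kwargs["VersionId"] = version_id
    return kwargs


def verify_sha1(data: bytes, expected_sha1: str) -> None:
    """Raise ValueError if the downloaded bytes do not match the registered sha1."""
    actual = hashlib.sha1(data).hexdigest()
    if actual != expected_sha1:
        raise ValueError(f"sha1 mismatch: expected {expected_sha1}, got {actual}")
```

With version_id left at None, integrity still rests on the sha1 comparison, which matches the intent stated in the commit message above.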

The test was calling load_tuckute2024_5subj() which defaults to a
relative CSV path that only exists on the original author's machine.
Load via load_dataset('Tuckute2024.language') instead, validating the
same assembly properties against the S3-hosted data.

mike-ferguson added the submission_prepared label ("Attached to a PR is metadata and layer mapping is successful.") on Feb 18, 2026
