Migrate language S3 to brainscore-storage bucket #374
Open
Language was using a standalone "brainscore-language" bucket for uploads and a hardcoded URL for downloads. This aligns with the vision pattern: bucket=brainscore-storage with key prefix brainscore-language/. Existing data on the old brainscore-language bucket will need to be migrated separately for the linear ceilings to continue loading.
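The bucket and key-prefix layout described above can be sketched as follows. The URL shapes are taken from this PR's test plan; the constant and function names are hypothetical, not the actual contents of brainscore_language/utils/s3.py.

```python
# Hypothetical helper names; only the bucket, prefix, and URL shapes come from this PR.
BUCKET = "brainscore-storage"
KEY_PREFIX = "brainscore-language"


def s3_key(filename: str) -> str:
    """Object key inside the shared bucket: the language prefix plus the filename."""
    return f"{KEY_PREFIX}/{filename}"


def s3_uri(filename: str) -> str:
    """Upload destination in s3:// form."""
    return f"s3://{BUCKET}/{s3_key(filename)}"


def download_url(filename: str) -> str:
    """Public HTTPS download location for the same object."""
    return f"https://{BUCKET}.s3.amazonaws.com/{s3_key(filename)}"
```

Deriving both locations from one key function keeps uploads and downloads pointing at the same object, which is what replacing the hardcoded download URL is meant to achieve.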
The brainscore-storage bucket does not have versioning enabled, so passing version_id strings causes 400 errors. Make version_id optional (default None) in load_from_s3 and remove all version_id values from data registrations, ceiling kwargs, and packaging scripts. sha1 still provides integrity checks.
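A minimal sketch of the optional version_id idea, assuming a boto3-style request where VersionId is only included when a value is supplied. The function names here are hypothetical and this is not the actual load_from_s3 signature; the sha1 check stands in for the integrity guarantee mentioned above.

```python
import hashlib
from typing import Optional


def build_get_object_kwargs(bucket: str, key: str,
                            version_id: Optional[str] = None) -> dict:
    """Build request arguments, omitting VersionId entirely when none is given.

    Hypothetical helper: the keyword names mirror boto3's get_object parameters.
    """
    kwargs = {"Bucket": bucket, "Key": key}
    if version_id is not None:
        kwargs["VersionId"] = version_id
    return kwargs


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-1 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha1(path: str, expected_sha1: str) -> None:
    """Raise if the downloaded file does not match the registered sha1."""
    actual = sha1_of_file(path)
    if actual != expected_sha1:
        raise ValueError(f"sha1 mismatch for {path}: expected {expected_sha1}, got {actual}")
```

Because the digest is computed over the downloaded bytes, the check still catches a changed or corrupted object even though no specific object version is pinned.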
The test was calling load_tuckute2024_5subj() which defaults to a
relative CSV path that only exists on the original author's machine.
Load via load_dataset('Tuckute2024.language') instead, validating the
same assembly properties against the S3-hosted data.
Summary
- brainscore_language/utils/s3.py to use brainscore-storage bucket with brainscore-language/ key prefix
- load_assembly_from_s3)
- brainscore-language standalone bucket is no longer referenced

Test plan
- s3://brainscore-storage/brainscore-language/{filename}
- https://brainscore-storage.s3.amazonaws.com/brainscore-language/{filename}