bentsku
approved these changes
Feb 20, 2026
bentsku
left a comment
LGTM. I do not see a big issue with this: even if we end up not using it, it is behind an optional flag. We can also extend it if we find other cases.
Thanks for jumping on this, really pragmatic and clean approach 👌
Background
For our use case, we have decided to distinguish between 3 cases when it comes to Avro compatibility:
Problem
Avro supports default values, and they are often used within dataclasses.
Imagine the following code:
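The original snippet is not preserved here. A minimal sketch of what such a dataclass might look like, assuming a single `creation_time` field with a factory default; the helper `factory_default_micros` is hypothetical and only illustrates how a factory result could end up encoded as a `timestamp-micros` value:

```python
import dataclasses
import datetime


@dataclasses.dataclass
class PyType:
    # The default comes from a factory, so it is evaluated each time it is called,
    # not once when the class is defined.
    creation_time: datetime.datetime = dataclasses.field(
        default_factory=lambda: datetime.datetime.now(datetime.timezone.utc)
    )


def factory_default_micros(cls, field_name):
    """Hypothetical helper: call a field's default factory and encode the
    result as microseconds since the Unix epoch (Avro timestamp-micros)."""
    field = next(f for f in dataclasses.fields(cls) if f.name == field_name)
    value = field.default_factory()
    epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    return (value - epoch) // datetime.timedelta(microseconds=1)


if __name__ == "__main__":
    # Two calls made at different moments yield different integers,
    # even though the class definition is unchanged.
    print(factory_default_micros(PyType, "creation_time"))
    print(factory_default_micros(PyType, "creation_time"))
```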
The generated schema will look like this:
{
  "type": "record",
  "name": "PyType",
  "namespace": "tests.unit.persistence.avro.aws.test_store_schemas",
  "fields": [
    {
      "type": {
        "type": "long",
        "logicalType": "timestamp-micros"
      },
      "name": "creation_time",
      "default": 1771420337157151
    }
  ]
}
Given that we use default factories to set default values, the default will be different each time we regenerate the schema. The code did not change, so the schemas should be identical, yet they differ in the default.
Avro has a way to represent the schema called "canonical", which excludes the default attribute. Unfortunately, we can't use this option: because the defaults are stripped, any addition of a field with a default (which is a compatible change) would be marked as incompatible.
Solution
We introduce a new option that puts a placeholder in place of factory defaults. We do this for datetime and uuids, as they are the ones we found to change on every regeneration.
Usage
When saving Avro states, we'll keep the option disabled, as we want to properly record defaults in the schema saved along with the records. For the schemas we want to keep on disk, we'll turn the option on.
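The placeholder idea and the two usage modes can be sketched with the standard library alone. This is an illustration under assumptions, not the implementation in this PR: the `deterministic` flag, the `PLACEHOLDERS` values, and the helper names are all hypothetical.

```python
import dataclasses
import datetime
import uuid

# Hypothetical fixed placeholders; the values actually used by the option may differ.
PLACEHOLDERS = {
    datetime.datetime: 0,  # timestamp-micros at the Unix epoch
    uuid.UUID: "00000000-0000-0000-0000-000000000000",
}


def to_avro_default(value):
    """Encode a Python default as a JSON-friendly value (assumes aware datetimes)."""
    if isinstance(value, datetime.datetime):
        epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
        return (value - epoch) // datetime.timedelta(microseconds=1)
    if isinstance(value, uuid.UUID):
        return str(value)
    return value


def field_default(field, deterministic=False):
    """Return the encoded default of a dataclass field that has one."""
    if field.default is not dataclasses.MISSING:
        return to_avro_default(field.default)
    value = field.default_factory()
    if deterministic:
        # Swap volatile factory results for a stable placeholder.
        for py_type, placeholder in PLACEHOLDERS.items():
            if isinstance(value, py_type):
                return placeholder
    return to_avro_default(value)


def defaults(cls, deterministic=False):
    """Collect encoded defaults for every field of cls that declares one."""
    return {
        f.name: field_default(f, deterministic)
        for f in dataclasses.fields(cls)
        if f.default is not dataclasses.MISSING
        or f.default_factory is not dataclasses.MISSING
    }


@dataclasses.dataclass
class Record:
    creation_time: datetime.datetime = dataclasses.field(
        default_factory=lambda: datetime.datetime.now(datetime.timezone.utc)
    )
    record_id: uuid.UUID = dataclasses.field(default_factory=uuid.uuid4)
    retries: int = 3


if __name__ == "__main__":
    print(defaults(Record))                      # real defaults, as when saving states
    print(defaults(Record, deterministic=True))  # stable output, as for schemas kept on disk
```

With the flag off, the real factory results are recorded, which matches the case of saving a schema alongside records. With the flag on, repeated generation yields the same output, so a diff of the on-disk schema only shows genuine changes.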