Extra fields that are present in the Parquet data but absent from the protobuf schema will be ignored.
However, two other kinds of mismatch are possible:
- some fields in the protobuf schema are missing from the Parquet data
- a field name is the same in both, but the data type is different
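
The three cases (extra, missing, and type-mismatched fields) can be told apart with a simple comparison of the two schemas. The following is only a minimal sketch in Python, assuming each schema has already been flattened into a hypothetical `name -> type name` mapping; `compare_schemas` and the example field names are illustrative and are not part of Dagger.

```python
def compare_schemas(proto_fields, parquet_fields):
    """Classify differences between two flattened schemas.

    Both arguments are hypothetical dicts mapping field name -> type name.
    """
    proto_names = set(proto_fields)
    parquet_names = set(parquet_fields)
    return {
        # Present only in the Parquet data: these are ignored.
        "extra": sorted(parquet_names - proto_names),
        # Present only in the protobuf schema: behaviour still to be decided.
        "missing": sorted(proto_names - parquet_names),
        # Same name on both sides, but the declared types differ.
        "mismatched": sorted(
            name
            for name in proto_names & parquet_names
            if proto_fields[name] != parquet_fields[name]
        ),
    }


# Illustrative schemas (assumed, not taken from a real job).
proto_fields = {"order_id": "string", "amount": "int64", "is_paid": "bool"}
parquet_fields = {"order_id": "string", "amount": "double", "comment": "string"}

report = compare_schemas(proto_fields, parquet_fields)
print(report)
```
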
We need to answer the following questions and design a solution for them:
- Should the Parquet Data Source set default values for fields that are present in the schema but not found in the Parquet file? If yes, what should the default value be?
- If no defaults are to be set, should the Dagger job fail?
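
To make the two options concrete, here is a rough sketch of what each policy could look like. It is written in Python with plain dicts and does not reflect Dagger's actual implementation: `build_row`, `MissingFieldError`, and the `on_missing` parameter are hypothetical names. The default values used are the standard proto3 scalar defaults (zero for numeric types, empty string, `false`, empty bytes), which is one candidate answer to the first question, not a decision.

```python
# proto3 scalar defaults; used here only as one possible choice of defaults.
PROTO3_DEFAULTS = {
    "string": "",
    "bytes": b"",
    "bool": False,
    "int32": 0,
    "int64": 0,
    "float": 0.0,
    "double": 0.0,
}


class MissingFieldError(Exception):
    """Hypothetical error raised when a schema field is absent from the data."""


def build_row(proto_fields, parquet_row, on_missing="default"):
    """Build a row that follows the protobuf schema.

    proto_fields: dict of field name -> type name (assumed, simplified schema).
    parquet_row:  dict of field name -> value read from the Parquet file.
    on_missing:   "default" fills in a default value, "fail" raises an error.
    """
    if on_missing not in ("default", "fail"):
        raise ValueError(f"unknown policy: {on_missing}")
    row = {}
    # Iterating over the schema means extra Parquet fields are ignored.
    for name, type_name in proto_fields.items():
        if name in parquet_row:
            row[name] = parquet_row[name]
        elif on_missing == "default":
            row[name] = PROTO3_DEFAULTS[type_name]
        else:
            raise MissingFieldError(f"field '{name}' not found in Parquet data")
    return row


proto_fields = {"order_id": "string", "amount": "int64", "is_paid": "bool"}
parquet_row = {"order_id": "A-1", "amount": 250, "comment": "ignored"}

filled = build_row(proto_fields, parquet_row, on_missing="default")
print(filled)
```

With `on_missing="fail"`, the same call raises `MissingFieldError` for `is_paid` instead of filling in a value, which corresponds to failing the job.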