Add a JSON reader option to ignore type conflicts#7276
Conversation
```rust
for p in pos {
    if !matches!(tape.get(*p), TapeElement::Null) {
        return Err(tape.error(*p, "null"));
    }
}
```
NOTE: Indentation-only change
|
Have you run the benchmarks for this? |
Not yet... but https://github.com/apache/arrow-rs/blob/main/CONTRIBUTING.md makes it look very easy. Will do so and report back. |
|
@tustvold is
I think this one is probably what @tustvold is referring to: https://github.com/apache/arrow-rs/blob/a75da00eed762f8ab201c6cb4388921ad9b67e7e/arrow/benches/json_reader.rs#L30-L29 so something like `cargo bench --bench json_reader`
|
Hmm, the benchmark results are not stable from run to run. Even benchmarking the main branch against itself gives a random, changing set of regressions and improvements. I tried on two very different computers: a 2021 MacBook Pro (Apple M1 Max) and a 2019 Lenovo T490s (Intel Core i5-8365U). Different absolute numbers, same large jitter. Is there some trick for getting stable numbers? I tried increasing the measurement interval to 10s, but it didn't solve the problem.
I am not super familiar with the benchmarks. Maybe you could use a non-laptop machine (laptops sometimes vary based on thermal throttling, etc.)?
|
Finally got some reasonably stable benchmark results using an EC2 m6i.8xlarge instance, rustc 1.85.0. It uncovered one issue with the
I don't know why my changes should have caused a speedup, but at least there's no slowdown.

Benchmarking commands used:

```shell
# 5 runs against upstream main branch
git checkout 82c2d5f4c
for i in $(seq 5); do cargo bench --bench json_reader -- --save-baseline main$i; done

# 5 runs against this PR
git switch json-ignore-type-conflicts-option
for i in $(seq 5); do cargo bench --bench json_reader -- --save-baseline feature$i; done

# compare the run results
for i in $(seq 5); do cargo bench --bench json_reader -- --load-baseline feature$i --baseline main$i; done
```

(see https://bheisler.github.io/criterion.rs/book/user_guide/command_line_options.html#baselines)
|
@tustvold -- does the above work? Or are there other benchmarks to double check? |
|
bump? |
|
Hey @tustvold, have you had a chance to look at this? :) It would be very useful for our use-case as well 🤞 |
|
@scovich and @Blizzara -- would this PR be superseded by that other PR? BTW I am pretty sure this use case is what @mwylde described in his blog from Arroyo last year: https://www.arroyo.dev/blog/fast-arrow-json-decoding
Potentially? But I'd be very interested to see how one could actually achieve the same result with that other PR in practice. I suspect it would require a pretty annoying exercise to replicate each decoder type, just to inject the null handling. Maybe each custom decoder could be a thin wrapper around the default one... except I don't think any of the decoders are public today (see e.g. https://docs.rs/arrow-json/55.0.0/arrow_json/all.html)? |
@cht42 can confirm from our side, but I believe we'd need (or like) both; this PR handles type conflicts for most types, while the other (#7442) allows us to handle the specific case for strings. Using #7442 for all conflicts may be possible but likely requires us to have more custom code (as we then need to override all decoders rather than just one, I think). |
This and similar PRs for customizing JSON parsing have stalled on the architectural/API question of whether we should publicly widen the JSON parser interface with more configs, rely on variant for fancy tricks, or just expose the tape decoder and let people do whatever they want. I don't have a good answer to that, though now that we actually have variant support, I suspect that going through variant would only address some of the motivations for customizing a JSON parser.
|
@scovich Thanks for providing that added context. Given I've got some outstanding PRs that also expand the capabilities of the JSON parser, and thus the available config options, it would be good to tie it together into one conversation. Where is that discussion being held? |
|
Sorry -- I have been overwhelmed for a few weeks on DataFusion releases and other things. I hope to get back to this at some point. |
alamb left a comment
Thank you @scovich
I just went through this PR carefully (and had codex help me). I think it looks good.
Some things I think we should do before merging:
- Update the comments to explain the user facing behavior in more detail (I left some suggestions)
- Resolve the merge conflicts
arrow-json/src/reader/mod.rs
Outdated
```rust
/// Sets whether the decoder should produce NULL instead of returning an error if it encounters
/// a type conflict on a nullable column (effectively treating it as a non-existent column).
///
/// NOTE: The inferred NULL on type conflict will still produce errors for non-nullable columns,
/// the same as any other NULL or missing value.
```
Can we please define what a "type conflict" means more specifically? Perhaps with an example.

I think it means something like:

The JSON decoder encounters a value that cannot be parsed into the specified column type. For example, if the type is declared to be a nullable [DataType::Int32] but the reader encounters a string value "foo":

- If `with_ignore_type_conflicts` is set to false (the default), the reader will return an error.
- If `with_ignore_type_conflicts` is set to true, the reader will fill in a `NULL` value for that array element.
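For concreteness, here is a hedged usage sketch of the proposed option (it only compiles against this PR's branch; the column name `v` and the input rows are invented for illustration):

```rust
use std::io::Cursor;
use std::sync::Arc;
use arrow_json::ReaderBuilder;
use arrow_schema::{DataType, Field, Schema};

// Nullable Int32 column; the second row holds a string, i.e. a type conflict.
let schema = Arc::new(Schema::new(vec![Field::new("v", DataType::Int32, true)]));
let input = "{\"v\": 1}\n{\"v\": \"foo\"}\n{\"v\": 3}\n";

let mut reader = ReaderBuilder::new(schema)
    .with_ignore_type_conflicts(true) // the option proposed by this PR
    .build(Cursor::new(input))
    .unwrap();

// With the option enabled, the conflicting row decodes as NULL instead of
// failing the whole batch; with the default (false), next() returns an error.
let batch = reader.next().unwrap().unwrap();
```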
```rust
#[test]
fn test_type_conflict_non_nullable() {
    let fields = [
        Field::new("bool", DataType::Boolean, false),
```
|
The initial benchmarks show some slowdown: #7276 (comment). I am rerunning to see if it is reproducible.
|
The benchmarks appear to show 10-15% slowdown for some queries
|
Need to look into potential performance regression
|
@alamb -- I tweaked the inner loop logic in a way that seems to help considerably on my laptop. Can you re-spin the benchmarks? |
|
New benchmarks look much better to me (no regressions). However, the tests are now failing 🤔
🎉
🤦 Found and fixed an unfaithful bit of the refactoring. I don't think it would impact performance, just match-arm wrangling.
|
run benchmark json_reader |
|
🤖 Arrow criterion benchmark running (GKE). Comparing json-ignore-type-conflicts-option (0e0d257) to 3b61796 (merge-base).
|
🤖 Arrow criterion benchmark completed (GKE), comparing json-ignore-type-conflicts-option (0e0d257) to 3b61796 (merge-base).
|
👌 |
|
This PR looks like it has some conflicts but otherwise is ready to go |
|
Thank you @scovich |
Which issue does this PR close?
Rationale for this change
JSON data is notoriously non-homogeneous, but the JSON parser today is very strict: it requires a concrete schema, and parsing fails if any field of any row encounters a type conflict. In such cases, it can be preferable for incompatible fields to parse as NULL instead of producing a hard error.
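The rationale above can be illustrated with a standalone toy sketch (plain Rust, not the actual arrow-json decoder; the function name is made up): a value that fails to parse as the target type yields `None` (NULL) rather than aborting the whole batch.

```rust
// Standalone sketch of "NULL on type conflict", not the actual arrow-json code.
// A value that fails to parse as the target type becomes None (NULL) instead of
// failing the whole batch with an error.
fn parse_i32_lenient(raw: &str) -> Option<i32> {
    raw.trim().parse::<i32>().ok()
}

fn main() {
    // The middle "row" is a JSON string where an Int32 was expected.
    let rows = ["1", "\"foo\"", "3"];
    let parsed: Vec<Option<i32>> = rows.iter().map(|r| parse_i32_lenient(r)).collect();
    assert_eq!(parsed, vec![Some(1), None, Some(3)]);
    println!("{:?}", parsed);
}
```

The strict behavior would instead return `Err` on the first conflicting value, discarding the two good rows along with it.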
What changes are included in this PR?
- Adds a new method, `arrow_json::reader::ReaderBuilder::with_ignore_type_conflicts`, which overrides the default behavior of erroring on a type conflict and returns NULL values instead.
- Plumbs that flag through to all ten decoders so they honor it: Null, Boolean, Primitive, Decimal, Timestamp, String, StringView, List, Map, Struct.
- Adds both positive and negative unit tests for each decoder type, to ensure the plumbing worked.
Are there any user-facing changes?
New API method, see above.