feat: support more primitives by freshtonic · Pull Request #72 · cipherstash/ore.rs

freshtonic · 2026-05-04T23:50:26Z

Summary

Introduces a top-level ToOrderableBytes trait in packages/orderable-bytes/src/lib.rs that replaces the per-type free to_orderable_bytes functions with a single trait carrying const ENCODED_LEN: usize, type Bytes: AsRef<[u8]>, and fn to_orderable_bytes(&self) -> Self::Bytes. Existing chrono and decimal modules and the in-tree ore-rs consumer are migrated to the trait API.
Adds a new primitive module (packages/orderable-bytes/src/primitive.rs) with 14 ToOrderableBytes impls covering bool, char, every native unsigned and signed integer (u8–u128, i8–i128), and both IEEE 754 floats (f32, f64).
Wires every primitive into ore-rs end-to-end: OreEncrypt now has impls for all 14 primitives in packages/ore-rs/src/encrypt.rs, all routing through ToOrderableBytes. The previously-bespoke u32/u64/f64 OreEncrypt impls are migrated to the same path so the file is uniform with chrono.rs / decimal.rs. The legacy ToOrderedInteger / FromOrderedInteger bridge in packages/ore-rs/src/convert.rs becomes dead code and is removed.
Each primitive impl emits the type's native byte width — no padding. Consumers that need a fixed wider encoding (e.g. an ORE construction with [u8; 8] plaintext blocks) can zero-extend upstream of the encrypter; widening is monotonic on lex order so the encoding's guarantees are preserved.
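
As a rough illustration of the trait shape described above, here is a minimal sketch. The three item names (`ENCODED_LEN`, `Bytes`, `to_orderable_bytes`) come from the summary; the doc comments and the `u32` impl body are assumptions and may differ from the code in `packages/orderable-bytes/src/lib.rs`.

```rust
/// Sketch of the trait described in the summary. Item names are taken
/// from the PR text; everything else here is an assumption.
pub trait ToOrderableBytes {
    /// Number of bytes in the encoding.
    const ENCODED_LEN: usize;
    /// Concrete byte container, for example `[u8; 4]`.
    type Bytes: AsRef<[u8]>;
    /// Encode `self` so that byte-wise lexicographic order matches the
    /// type's own ordering.
    fn to_orderable_bytes(&self) -> Self::Bytes;
}

/// Assumed impl for an unsigned integer: native big-endian bytes are
/// already in lexicographic order, so no transformation is needed.
impl ToOrderableBytes for u32 {
    const ENCODED_LEN: usize = 4;
    type Bytes = [u8; 4];

    fn to_orderable_bytes(&self) -> Self::Bytes {
        self.to_be_bytes()
    }
}
```

With this shape a caller can read the width from `<u32 as ToOrderableBytes>::ENCODED_LEN` and compare the returned arrays directly, since Rust arrays of `u8` compare lexicographically.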

Encoding strategies

bool — false → 0x00, true → 0x01 (already in lex order)
Unsigned integers — native big-endian (already in lex order)
Signed integers — XOR the sign bit at the native width, then big-endian. Moves negatives below positives and preserves order within each sign class.
char — big-endian bytes of *self as u32. Rust's Ord for char is by code point and surrogates are not representable, so the native u32 lex order is exactly the order we need.
f32 / f64 — branchless monotonic mapping: negatives flip every bit, positives flip just the sign bit. -0.0 is canonicalised to +0.0 before encoding so byte equality matches -0.0 == 0.0. NaN handling is unspecified — the trait's order/equality contract only applies to non-NaN inputs.
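
The non-float strategies above can be sketched as free functions. The function names are hypothetical and chosen only for illustration; the PR implements these as `ToOrderableBytes` impls in `primitive.rs`, whose exact bodies are not shown here.

```rust
/// bool: `false` maps to 0x00 and `true` to 0x01.
pub fn encode_bool(b: bool) -> [u8; 1] {
    [b as u8]
}

/// Unsigned integer: native big-endian bytes are already lex-ordered.
pub fn encode_u16(x: u16) -> [u8; 2] {
    x.to_be_bytes()
}

/// Signed integer (8-bit): XOR the sign bit at native width.
pub fn encode_i8(x: i8) -> [u8; 1] {
    [(x as u8) ^ 0x80]
}

/// Signed integer (32-bit): XOR the sign bit at native width, then
/// serialise big-endian. Negatives land below non-negatives and order
/// within each sign class is unchanged.
pub fn encode_i32(x: i32) -> [u8; 4] {
    ((x as u32) ^ (1u32 << 31)).to_be_bytes()
}

/// char: big-endian bytes of the Unicode scalar value.
pub fn encode_char(c: char) -> [u8; 4] {
    (c as u32).to_be_bytes()
}
```

For example, `encode_i32(-1)` yields `[0x7f, 0xff, 0xff, 0xff]` and `encode_i32(0)` yields `[0x80, 0x00, 0x00, 0x00]`, so the byte order agrees with the numeric order across the sign boundary.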

Compatibility

The new f64 byte encoding is bit-for-bit identical to the legacy ToOrderedInteger::map_to::<u64>().to_be_bytes() output (same sign-bit canonicalisation, same XOR mask, same -0.0 → +0.0 collapse), so existing f64 ciphertexts on disk remain comparable against ciphertexts produced by this branch.
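
For readers unfamiliar with the mapping being preserved, the following is a sketch of the usual IEEE 754 monotonic transform as the PR describes it (all bits flipped for negatives, sign bit only for non-negatives, `-0.0` collapsed onto `+0.0`). It is not the legacy `ToOrderedInteger` code or the new impl, and the function names are hypothetical. The zero canonicalisation is written as an explicit comparison for clarity, whereas the PR describes its own mapping as branchless.

```rust
/// Map an f64 to a u64 whose unsigned order matches the float order
/// for non-NaN inputs. NaN inputs are outside the contract.
pub fn f64_to_ordered_u64(x: f64) -> u64 {
    // Collapse -0.0 onto +0.0 so both zeros share one encoding.
    let x = if x == 0.0 { 0.0 } else { x };
    let bits = x.to_bits();
    // Arithmetic shift replicates the sign bit: all ones for negatives,
    // zero otherwise. OR-ing in the sign bit gives the XOR mask.
    let mask = (((bits as i64) >> 63) as u64) | (1u64 << 63);
    bits ^ mask
}

/// Big-endian serialisation of the ordered integer.
pub fn encode_f64(x: f64) -> [u8; 8] {
    f64_to_ordered_u64(x).to_be_bytes()
}
```

Under this sketch `encode_f64(0.0)` is `[0x80, 0, 0, 0, 0, 0, 0, 0]`, and any negative value encodes below it.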

Test plan

cargo test -p orderable-bytes — 32 unit tests in primitive.rs (known-anchor + ascending-order pairs per type, plus 0.0/-0.0/subnormal coverage for both floats); --all-features brings the total to 54 with chrono and decimal quickcheck/property tests passing under the migrated trait API
cargo test -p ore-rs --all-features — 53 unit tests + 5 doc-tests pass, including the signed_zeros_compare_equal regression test that pins (-0.0).encrypt(&ore) == 0.0.encrypt(&ore)
cargo fmt --check --all — clean
In-tree ore-rs consumers (chrono.rs, decimal.rs, encrypt.rs) all call .to_orderable_bytes() via the trait; no per-type bespoke encoding paths remain (verified: zero to_be_bytes / map_to callers in those files)
char ordering test spans ASCII, BMP, and supplementary planes (crosses the surrogate gap from U+D7FF to U+E000)
f32 / f64 subnormal tests confirm strict ordering: 0.0 < f*::from_bits(1) < f*::MIN_POSITIVE
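
The ordering checks listed above might look roughly like the following. This is a sketch only: `strictly_ascending`, `encode_char`, and `encode_f32` are hypothetical stand-ins, not the test helpers or impls in `primitive.rs`.

```rust
/// Hypothetical char encoder: big-endian bytes of the scalar value.
pub fn encode_char(c: char) -> [u8; 4] {
    (c as u32).to_be_bytes()
}

/// Hypothetical f32 encoder: the same monotonic mapping as f64,
/// narrowed to 32 bits. NaN inputs are outside the contract.
pub fn encode_f32(x: f32) -> [u8; 4] {
    // Collapse -0.0 onto +0.0.
    let x = if x == 0.0 { 0.0 } else { x };
    let bits = x.to_bits();
    // All ones for negatives, zero otherwise, plus the sign bit.
    let mask = (((bits as i32) >> 31) as u32) | (1u32 << 31);
    (bits ^ mask).to_be_bytes()
}

/// True when every encoding is strictly less than its successor.
pub fn strictly_ascending<const N: usize>(encoded: &[[u8; N]]) -> bool {
    encoded.windows(2).all(|pair| pair[0] < pair[1])
}

/// Encodings of chars spanning ASCII, the BMP on both sides of the
/// surrogate gap, and the supplementary planes.
pub fn char_samples() -> Vec<[u8; 4]> {
    ['A', '\u{D7FF}', '\u{E000}', '\u{FFFF}', '\u{10000}', '\u{10FFFF}']
        .iter()
        .map(|&c| encode_char(c))
        .collect()
}

/// Encodings of zero, the smallest subnormal, and the smallest normal.
pub fn f32_subnormal_samples() -> Vec<[u8; 4]> {
    [0.0f32, f32::from_bits(1), f32::MIN_POSITIVE]
        .iter()
        .map(|&x| encode_f32(x))
        .collect()
}
```

A test would then assert `strictly_ascending(&char_samples())` and `strictly_ascending(&f32_subnormal_samples())`, plus equality of the two zero encodings.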

auxesis

@freshtonic walked me through this on a call, and it looks good.

Replace the per-type free `to_orderable_bytes` functions with impls of a new top-level `ToOrderableBytes` trait that exposes the encoded length as an associated `const ENCODED_LEN` and the byte array as an associated `type Bytes: AsRef<[u8]>`. Migrate the in-tree `ore-rs` consumer to the trait API.

Adds a `numeric` module with `ToOrderableBytes` impls for the signed integer primitives `i16`, `i32`, `i64` and the IEEE 754 double `f64`, each emitting the type's native byte width:

- Integers: sign-flip the top bit, then big-endian. Moves negatives below positives in lex order while preserving order within each sign class.
- f64: standard IEEE 754 monotonic mapping (flip all bits for negatives, sign bit only for positives), with `-0.0` canonicalised to `+0.0` so the two share an encoding. NaN handling is unspecified (NaN is unordered under `PartialOrd`).

Mirrors the `IntoOrePlaintext<u64>` impls in cipherstash-suite::ope_indexer::conversion, but at native widths rather than always widening to u64.

Match the `IntoOrePlaintext<u64>` widening used by the cipherstash-suite ORE indexer: sign-flip at native width, then zero-extend to `u64` before BE serialisation. All four primitive impls now return `[u8; 8]` so they share the same downstream ORE ciphertext shape.

Adds a `bool` impl in the `numeric` module, padded to `[u8; 8]` to match the other primitive impls. `false` encodes as `[0; 8]` and `true` as `[0, 0, 0, 0, 0, 0, 0, 1]`, mirroring the `IntoOrePlaintext<u64>` impl in cipherstash-suite::ope_indexer (`OrePlaintext(*x as u64)`).

The module now hosts a `bool` impl alongside the integer and float impls; `primitive` describes the contents more accurately than `numeric`.

Extends the `primitive` module with four more impls:

- `u8` → `[u8; 8]`, zero-extended to `u64` BE.
- `i8` → `[u8; 8]`, sign-flipped at `u8` width then zero-extended.
- `u128` → `[u8; 16]`, native BE (already lex-ordered, no sign-flip).
- `i128` → `[u8; 16]`, sign-flipped at `u128` width then native BE.

The 8-bit pair shares the `[u8; 8]` width with `bool`/`i16`/`i32`/`i64`/`f64` so they all route through the same downstream ORE ciphertext shape. The 128-bit pair uses native width since there's no wider standard integer type to pad to.

Padding to a fixed `[u8; 8]` is the consumer's concern, not the encoding's: ORE constructions that need a uniform 8-byte plaintext should zero-extend upstream of the encrypter (widening is monotonic on lex order and preserves the encoding's guarantees), while OPE schemes can consume the native width directly.

Reverts the earlier widen-to-`[u8; 8]` decision for `bool`, `u8`, `i8`, `i16`, `i32`. New widths:

- `bool`, `u8`, `i8` → `[u8; 1]`
- `i16` → `[u8; 2]`
- `i32` → `[u8; 4]`

`i64`, `u128`, `i128`, `f64` were already at native width.
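
A consumer-side zero-extension of the kind described in this commit could be sketched as follows. `zero_extend_to_8` and `encode_i16` are hypothetical names used only for illustration; neither is claimed to exist in `ore-rs` or `orderable-bytes`. The sketch assumes all values being compared come from the same source type, so every input has the same native width.

```rust
/// Hypothetical helper: left-pad a native-width encoding with zero
/// bytes to a fixed 8-byte block. Panics if the input is wider than 8.
pub fn zero_extend_to_8(src: &[u8]) -> [u8; 8] {
    assert!(src.len() <= 8, "encoding wider than 8 bytes");
    let mut out = [0u8; 8];
    // Copy into the low-order (rightmost) bytes; the prefix stays zero.
    out[8 - src.len()..].copy_from_slice(src);
    out
}

/// Hypothetical native-width i16 encoder: sign-flip, then big-endian.
pub fn encode_i16(x: i16) -> [u8; 2] {
    ((x as u16) ^ 0x8000).to_be_bytes()
}
```

Because every padded value gains the same zero prefix, comparing the 8-byte blocks gives the same result as comparing the 2-byte encodings: `zero_extend_to_8(&encode_i16(-1))` is `[0, 0, 0, 0, 0, 0, 0x7f, 0xff]`, which sorts below the block for `0`.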

Adds the remaining native unsigned integer widths so the trait covers every primitive listed in the module doc. Each impl is the no-op big-endian path used for u8/u128 (already in lex order). Also adds a dedicated `bool` subsection to the module doc for symmetry with the other per-type encoding-strategy paragraphs.

`char` encodes as the big-endian bytes of its `u32` Unicode scalar value — Rust's `Ord` for `char` is by code point and surrogates aren't representable, so the native u32 lex order is exactly what we need. `f32` reuses the f64 monotonic mapping (sign-bit flip for positives, all-bits flip for negatives, -0.0 canonicalised to +0.0) narrowed to u32. Module doc gains a `char` section and folds the float docs into a single IEEE 754 section covering both widths.

Migrates the existing `u32`/`u64`/`f64` `OreEncrypt` impls to call `orderable_bytes::ToOrderableBytes::to_orderable_bytes()` and adds impls for the remaining 11 primitives covered by the trait: `bool`, `u8`/`u16`/`u128`, `i8`/`i16`/`i32`/`i64`/`i128`, `char`, and `f32`. The new `f64` byte path is bit-for-bit identical to the previous `ToOrderedInteger::map_to::<u64>` mapping (same sign-bit canonicalisation, same XOR mask, same `-0.0 → +0.0` collapse), so on-disk ciphertexts remain compatible. Each impl follows the same trait-driven shape used by the chrono and decimal consumers, with encoded lengths lifted into module-level `const`s because stable Rust won't accept `<Self as ToOrderableBytes>::ENCODED_LEN` directly in const-generic position.
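
The module-level `const` pattern mentioned in this commit can be sketched as below. The real `OreEncrypt` signature is not shown in the PR text, so `Block` and `to_block_u16` are hypothetical stand-ins for a const-generic consumer, and `U16_LEN` is an assumed name.

```rust
/// Trait shape as described in the PR summary.
pub trait ToOrderableBytes {
    const ENCODED_LEN: usize;
    type Bytes: AsRef<[u8]>;
    fn to_orderable_bytes(&self) -> Self::Bytes;
}

impl ToOrderableBytes for u16 {
    const ENCODED_LEN: usize = 2;
    type Bytes = [u8; 2];

    fn to_orderable_bytes(&self) -> Self::Bytes {
        self.to_be_bytes()
    }
}

/// The encoded length lifted into a module-level const so that it can
/// be named wherever a const-generic argument is expected.
pub const U16_LEN: usize = <u16 as ToOrderableBytes>::ENCODED_LEN;

/// Hypothetical const-generic plaintext block, standing in for whatever
/// the encrypter is parameterised over.
pub struct Block<const N: usize>(pub [u8; N]);

/// Hypothetical conversion that names the lifted const in its
/// signature instead of a hard-coded width.
pub fn to_block_u16(x: u16) -> Block<U16_LEN> {
    Block(x.to_orderable_bytes())
}
```

Keeping the width tied to `ENCODED_LEN` means a change to the trait impl shows up as a type error at the consumer rather than a silent mismatch.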

`ToOrderedInteger` / `FromOrderedInteger` were the legacy bridge from `f64` to a lex-orderable `u64` plaintext. Now that `OreEncrypt for f64` goes through `orderable_bytes::ToOrderableBytes`, the trait and its sole impl are unused. Remove the module and its `mod convert;` entry.

freshtonic changed the title ~~feat: support more primities~~ feat: support more primitives May 4, 2026

freshtonic marked this pull request as draft May 4, 2026 23:55

freshtonic marked this pull request as ready for review May 5, 2026 00:39

auxesis approved these changes May 5, 2026

freshtonic added 11 commits May 5, 2026 11:08

refactor(orderable-bytes): rename numeric module to primitive

c5fea8c

freshtonic force-pushed the feat/support-more-primitives branch from 1b067cf to 3d27b74 Compare May 5, 2026 01:09

freshtonic merged commit 246b82d into main May 5, 2026
2 checks passed

freshtonic deleted the feat/support-more-primitives branch May 5, 2026 01:17

freshtonic merged 11 commits into main from feat/support-more-primitives
