Merged
6 changes: 6 additions & 0 deletions CLAUDE.md
@@ -71,6 +71,12 @@ Several modules are compiled and registered in the `seqpro` PyO3 module (`lib.rs
- All sequence arrays use `|S1` (single-byte ASCII) as the canonical in-memory format for string sequences, and `uint8` for OHE.
- Axis arguments (`length_axis`, `ohe_axis`) are always integers referring to the NumPy axis index; negative indexing is supported.
- The `transforms/` module wraps functional ops into callable objects with `__call__` for use in data pipelines.
- **Ragged operations: free function is canonical, method delegates.** Public
Ragged/sequence operations live as free functions in the relevant `_ops`-style
module (the implementation home, e.g. `rag/_ops.py`); the corresponding
`Ragged` method is a thin one-line delegator to it (e.g. `Ragged.hash` →
`rag._ops.hash`). Add new operations this way. This matches
`reverse_complement` and `concatenate`.
- Conventional commits are enforced — use `feat:`, `fix:`, `ci:`, `bump:`, `refactor:`, `docs:`, etc. prefixes.
- **Validation is opt-in and front-loaded.** Add fast-fail/input validation via a `validate=` flag (or equivalent single opt-in), not per-feature `error` modes. There must be one obvious way to ask "is this input clean?" — don't duplicate the check across parameters.
- **No naive NumPy in hot paths.** Never use raw Python loops or naive NumPy (e.g. per-segment `np.concatenate`, Python `for` over sequences) where a Numba kernel is faster and leaner — unless the NumPy version is *verifiably* comparable in time and memory. When Numba is a poor fit (graph algorithms like k-shuffle), use the Rust/PyO3 extension (`src/`).
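
The "free function is canonical, method delegates" rule added above can be sketched roughly as follows. This is a minimal illustration, not the seqpro implementation: `ToyRagged` and `hash_segments` are hypothetical stand-ins for `Ragged` and the function in `rag/_ops.py`, and the pure-Python hashing is there only to show the delegation shape (a real hot path would follow the Numba/Rust guidance in the list above).

```python
import hashlib
from dataclasses import dataclass


# Hypothetical free function: stands in for the canonical implementation
# that would live in an `_ops`-style module.
def hash_segments(data: bytes, offsets: list[int]) -> list[str]:
    """Return one SHA-256 hex digest per segment data[offsets[i]:offsets[i + 1]]."""
    return [
        hashlib.sha256(data[start:end]).hexdigest()
        for start, end in zip(offsets[:-1], offsets[1:])
    ]


# Hypothetical container: stands in for the real `Ragged` class.
@dataclass
class ToyRagged:
    data: bytes
    offsets: list[int]

    def hash(self) -> list[str]:
        # Thin one-line delegator: no logic of its own.
        return hash_segments(self.data, self.offsets)


rag = ToyRagged(data=b"ACGTAC", offsets=[0, 4, 6])
print(rag.hash() == hash_segments(rag.data, rag.offsets))
```

Keeping the logic in the free function means it can be tested and reused without constructing the container, and the method cannot drift from it.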
94 changes: 94 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 4 additions & 0 deletions Cargo.toml
@@ -18,7 +18,11 @@ anyhow = "1.0.99"
ndarray = { version = "0.17", features = ["rayon"] }
numpy = "0.28.0"
rand = { version = "0.8.5", features = ["small_rng"] }
digest = "0.10"
md-5 = "0.10"
rapidhash = "4"
rayon = "1.11.0"
sha2 = "0.10"
thiserror = "1.0.69"
xxhash-rust = { version = "0.8.15", features = ["xxh3"] }
