Releases: eric9n/Kun-peng

v0.7.12

04 Nov 08:05

add hash capacity

v0.7.11

01 Nov 06:34

  • Fixed FASTQ tail trimming so quality lines retain valid @ characters, preventing accidental base masking.
  • Enhanced interleaved FASTQ detection to recognize both /1 /2 and Illumina-style 1:N:0:1 headers.
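
The header check described above can be sketched roughly as follows. This is an illustration only, not the project's detection code: the function names mate_info and looks_interleaved are hypothetical, and the sketch assumes that a mate number appears either as a /1 or /2 suffix on the read name or as a leading "1:" or "2:" in the second whitespace-separated field of the header.

```rust
/// Extracts (read name, mate number) from a FASTQ header line.
/// Hypothetical helper: recognizes a "/1" or "/2" suffix on the name, or an
/// Illumina-style second field that starts with "1:" or "2:".
fn mate_info(header: &str) -> Option<(&str, u8)> {
    let h = header.strip_prefix('@').unwrap_or(header);
    let mut parts = h.split_whitespace();
    let name = parts.next()?;

    // Older convention: the mate number is a suffix of the read name.
    if let Some(base) = name.strip_suffix("/1") {
        return Some((base, 1));
    }
    if let Some(base) = name.strip_suffix("/2") {
        return Some((base, 2));
    }

    // Illumina convention: the mate number leads the second field.
    let comment = parts.next()?;
    if comment.starts_with("1:") {
        Some((name, 1))
    } else if comment.starts_with("2:") {
        Some((name, 2))
    } else {
        None
    }
}

/// Returns true when two consecutive headers look like mates 1 and 2
/// of the same read, which suggests an interleaved file.
fn looks_interleaved(first: &str, second: &str) -> bool {
    match (mate_info(first), mate_info(second)) {
        (Some((a, 1)), Some((b, 2))) => a == b,
        _ => false,
    }
}
```

A real detector would probably sample several record pairs before deciding, since a single matching pair could be a coincidence.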

v0.7.10

01 Nov 03:44

This release fixes critical bugs in the FastaReader and BufferFastaReader that caused incorrect sequence parsing and data loss. Both readers now use a new, robust parsing engine that correctly handles multi-line sequences, comments, and whitespace. FASTA file reading is now stable, accurate, and reliable.
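
For readers unfamiliar with the edge cases involved, here is a minimal sketch of a FASTA parser that tolerates multi-line sequences, blank lines, and stray whitespace. It is not the FastaReader implementation: parse_fasta is a hypothetical name, the sketch loads all records into memory, and it assumes that "comments" means lines starting with ';', which the release note does not specify.

```rust
use std::io::{self, BufRead};

/// Parses FASTA input into (header, sequence) pairs.
/// Assumptions: lines starting with ';' are comments, and any sequence
/// data that appears before the first '>' header is ignored.
fn parse_fasta<R: BufRead>(reader: R) -> io::Result<Vec<(String, String)>> {
    let mut records: Vec<(String, String)> = Vec::new();
    for line in reader.lines() {
        let line = line?;
        let line = line.trim();
        if line.is_empty() || line.starts_with(';') {
            continue; // skip blank lines and comment lines
        }
        if let Some(header) = line.strip_prefix('>') {
            // A new record starts; its sequence is filled in by later lines.
            records.push((header.trim().to_string(), String::new()));
        } else if let Some((_, seq)) = records.last_mut() {
            // Continuation line: append it with inner whitespace removed.
            seq.extend(line.chars().filter(|c| !c.is_whitespace()));
        }
    }
    Ok(records)
}
```

Calling parse_fasta on a byte slice such as ">id\nAC\nGT\n".as_bytes() yields a single record whose sequence is the two lines joined together.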

v0.7.8

31 Oct 08:53

  • This release introduces a major database construction workflow improvement with the new add_library command, allowing users to safely add local FASTA files while automatically preventing duplicates.

  • The database build commands have been refactored into build (for the full process) and build_db (for building hashes only) to make their intent clearer. Additionally, build_db commands now support a new -c flag, allowing advanced users to specify an exact hash table capacity to skip the time-consuming estimation step.

v0.7.7

30 Oct 07:48

fix(build): Resolve critical data loss and UB in parallel data processing

This commit addresses two critical, independent bugs that together caused
non-deterministic data loss and corrupted data reads during the
database build process.

1. Fixed Critical Data Loss in set_page_cell (The Logic Bug)

  • Problem: The set_page_cell function had a fatal logic error in its
    loop exit condition (if result.is_ok() || idx == first_idx).
  • Consequence: When par_iter caused a key conflict (a non-matching
    key in an occupied slot), the result was Err but idx == first_idx
    was true. The loop would immediately break, silently discarding
    the data instead of performing linear probing. This led to massive,
    random data loss depending on thread race conditions.
  • Fix: The loop logic was rewritten to correctly match on the
    fetch_update result. It now only breaks on Ok (success) or after
    a full circular probe, and correctly increments idx on Err to
    perform linear probing as intended.
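
The corrected loop shape can be illustrated with a small, self-contained sketch. It is not the actual set_page_cell code: the function name set_cell, the 16/16-bit key/value packing, the use of 0 as the empty marker, and the overwrite-on-same-key policy are all assumptions made for the example. Only the control flow (return on Ok, advance idx on Err, stop after one full circle) follows the description above.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical cell layout: high 16 bits = key, low 16 bits = value, 0 = empty.
const VALUE_BITS: u32 = 16;
const VALUE_MASK: u32 = (1 << VALUE_BITS) - 1;

/// Stores (key, value) with linear probing starting at `first_idx`.
/// Returns the slot that was written, or None if every slot holds a
/// different key. Assumes `key` is nonzero and fits in 16 bits.
fn set_cell(table: &[AtomicU32], first_idx: usize, key: u32, value: u32) -> Option<usize> {
    if table.is_empty() {
        return None;
    }
    let packed = (key << VALUE_BITS) | (value & VALUE_MASK);
    let start = first_idx % table.len();
    let mut idx = start;
    loop {
        let result = table[idx].fetch_update(Ordering::SeqCst, Ordering::SeqCst, |cur| {
            if cur == 0 || (cur >> VALUE_BITS) == key {
                Some(packed) // empty slot or same key: write the cell
            } else {
                None // occupied by a different key: report a conflict
            }
        });
        match result {
            Ok(_) => return Some(idx), // success is the only early exit
            Err(_) => {
                idx = (idx + 1) % table.len(); // conflict: probe the next slot
                if idx == start {
                    return None; // full circle, table is full
                }
            }
        }
    }
}
```

The wrap-around check runs only after idx has been advanced, so a conflict in the very first slot leads to probing rather than an immediate exit.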

2. Fixed Data Corruption (UB) in read_first_block_from_file (The I/O Bug)

  • Problem: The function previously allocated a Vec<u8> (1-byte
    alignment) and unsafely cast its pointer to *const u32 (4-byte
    alignment), causing Undefined Behavior (UB).
  • Consequence: This alignment mismatch resulted in non-deterministic,
    corrupted data being read from the file, which was the root cause of
    seeing different values (e.g., 3995202615 vs 2133185527) at the
    same index on different runs.
  • Fix: The implementation now first allocates a Vec<u32>
    (guaranteeing alignment) and then uses bytemuck to safely cast it
    to &mut [u8]. This enables high-performance, zero-copy reads
    directly into the buffer while remaining memory-safe.
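
The direction of the cast is what matters here: viewing a u32 buffer as bytes only relaxes the alignment requirement, whereas viewing a byte buffer as u32 tightens it. The sketch below shows the same idea using only the standard library, so it swaps the bytemuck call for a small unsafe block. The function name read_u32_block is hypothetical and this is not the project's read_first_block_from_file. With bytemuck available, the unsafe block could presumably be replaced by bytemuck::cast_slice_mut(&mut words).

```rust
use std::io::{self, Read};

/// Reads exactly `n_words` native-endian u32 values from `reader`
/// directly into a 4-byte-aligned buffer, without an intermediate copy.
fn read_u32_block<R: Read>(reader: &mut R, n_words: usize) -> io::Result<Vec<u32>> {
    // Allocating as u32 guarantees 4-byte alignment of the storage.
    let mut words = vec![0u32; n_words];

    // SAFETY: the buffer is initialized and owned by `words`, the byte
    // length matches the allocation exactly, u8 has alignment 1, and
    // every bit pattern is valid for both u8 and u32.
    let bytes: &mut [u8] = unsafe {
        std::slice::from_raw_parts_mut(
            words.as_mut_ptr() as *mut u8,
            n_words * std::mem::size_of::<u32>(),
        )
    };

    // Fill the buffer in place; fails if the input is too short.
    reader.read_exact(bytes)?;
    Ok(words)
}
```

Because the values are interpreted in native byte order, a file written on a machine with different endianness would need an explicit conversion step after reading.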

These two fixes together ensure the database build is now correct,
deterministic, and robust against race conditions, fully resolving
the data loss and corruption issues.

v0.7.6

29 Oct 16:09

Add support for auto-detecting interleaved FASTQ

v0.7.5

29 Jan 10:07

multiple gzip members

v0.7.4

29 Sep 10:37

  • input files

v0.7.3

29 Sep 01:38

  • input files

v0.7.2

24 Sep 11:33

  • jemalloc