dicts: Add option to use the *_byReference variants at creation time #60
Open
coxley wants to merge 4 commits into valyala:master
coxley commented Apr 18, 2024
Author
@valyala: You seem to have been busy with bigger things, so sorry to bug you with yet another PR! Have you thought about roping in another maintainer to help field changes? Do you still want contributions for missing surface area with the main
Author
@valyala: Is there interest in this kind of contribution? I'd prefer not to maintain an internal fork, but there are some capabilities missing here compared to the upstream lib. Happy to change the PR however you see fit if you have other implementation preferences.
GrigoryEvko added a commit to GrigoryEvko/gozstd that referenced this pull request on Aug 2, 2025
…ng codebase

Integrated community contributions:
- PR valyala#49: CGO wrapper improvements for 5-7% performance gain on large buffers
  - Use void* instead of uintptr_t to avoid memory allocations
  - Direct Go slice usage via reflect.SliceHeader
- PR valyala#25: Advanced Compression API with checksum support
  - Added CCtx type for advanced compression contexts
  - Added SetParameter/GetParameter methods
  - Added Reset and Compress2 methods
  - Full support for all ZSTD compression parameters
- PR valyala#63: Exposed CompressDictLevel as public API
  - Allows fine-grained control over dictionary compression levels
- PR valyala#66: RISC-V 64-bit architecture support
  - Updated Zig builder to 0.13.0
  - Added linux_riscv64 target
- PR valyala#60: Memory-optimized dictionary functions
  - Added NewCDictByRef/NewDDictByRef to avoid data copying
  - Reduces memory usage for large dictionaries

Infrastructure improvements:
- Created modern Dockerfile with Alpine Linux and latest Zig
- Fixed build process issues with clean target
- Updated minimum Go version to 1.24

Code organization:
- Moved Docker configs to build/docker/
- Moved scripts to scripts/
- Moved upstream zstd to contrib/
- Moved test data to test/
- Created comprehensive examples in examples/
- Kept all Go source files in root for package compatibility

Testing enhancements:
- Added Silesia Corpus compression tests with speed measurements
- Created 33 aggressive fuzz tests targeting known vulnerabilities
- Added comprehensive tests for Advanced API
- Added benchmarks comparing raw zstd vs wrapper performance

The wrapper now shows 6-10% performance improvements for compression while maintaining identical compression ratios.
Summary
Related issue: #59
This adds two additional functions to create a CDict or DDict:
- NewCDictByRef(dict []byte) (*CDict, error) (and NewCDictLevelByRef)
- NewDDictByRef(dict []byte) (*DDict, error)

My particular use case is having hundreds (~500-2000) of dictionaries cached locally in memory, and not wanting to potentially have three copies of each (the original bytes plus one copy each inside CDict and DDict).

To mitigate the risk of the input bytes being garbage collected or moved, they are pinned, and unpinned once the gozstd.CDict and gozstd.DDict are released. There's still a risk of users mutating the input, but this is advised against in the documentation. Users of this feature implicitly accept the risks of the optimization.

Test Plan
I've added two types of tests:
- gozstd_timing_test benchmarks
- TestCompressDecompressDistinctConcurrentDictsByRef

I'm happy to add more if you think it's reasonable, but I think this stresses the right things: making sure that the same underlying bytes can be used concurrently with CDict and DDict.

Command Output
Full Benchmark Output
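For readers unfamiliar with the pin/unpin idea mentioned in the Summary, here is a minimal, standard-library-only sketch of the pattern. It is not the code from this PR: refDict, newRefDict and Release are hypothetical names, and the use of runtime.Pinner (Go 1.21+) is an assumption about how the pinning could be done. It only shows that a by-reference holder shares storage with the caller's slice, and why mutating the input afterwards is risky.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// refDict is a hypothetical stand-in for a by-reference dictionary.
// It keeps the caller's slice instead of copying it.
type refDict struct {
	data   []byte
	pinner runtime.Pinner // assumption: pinning via runtime.Pinner (Go 1.21+)
}

// newRefDict pins the backing array of dict so its address stays valid
// while non-Go code could be holding a pointer to it.
func newRefDict(dict []byte) (*refDict, error) {
	if len(dict) == 0 {
		return nil, errors.New("dict cannot be empty")
	}
	d := &refDict{data: dict}
	d.pinner.Pin(&dict[0]) // pins the whole underlying array
	return d, nil
}

// Release unpins the input bytes and drops the reference to them.
func (d *refDict) Release() {
	d.pinner.Unpin()
	d.data = nil
}

func main() {
	raw := []byte("sample dictionary bytes")

	d, err := newRefDict(raw)
	if err != nil {
		panic(err)
	}
	defer d.Release()

	// No copy was made: both slices start at the same address.
	fmt.Println("shares storage:", &d.data[0] == &raw[0])

	// The flip side: a later mutation of the input is visible through the
	// dictionary, which is why callers are told not to modify it.
	raw[0] = 'S'
	fmt.Println("dict sees mutation:", d.data[0] == 'S')
}
```

The trade-off is the one the Summary describes: memory is saved because nothing is copied, but the caller has to keep the bytes unchanged until Release is called.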