UPSTREAM PR #1277: feat: add OptionFlags for ArgOptions; complete SDGenerationParams to_json/from_json conversion #54

Open
loci-dev wants to merge 1 commit into main from loci/pr-1277-options_json_conv

@loci-dev

Note

Source pull request: leejet/stable-diffusion.cpp#1277

This PR adds the functionality needed to convert generic option parameter types to and from JSON using ArgOptions. To achieve this, it adds several features:

  • Adds OptionFlags to all of the Option types. The flags let the parser mark an option as assigned during command-line parsing, which makes it easier to iterate over the options the caller set explicitly. This also adds a no-network flag for options that are not meant to be transferred over the network.
  • Building on this, I added options_to_json and options_from_json. These are designed to be generic: each takes an option parameter class and its corresponding parsed ArgOptions. The to_json variant uses the assigned flag to serialize only the parameters the caller explicitly set. These functions handle all of the basic ArgOptions types dynamically, converting each long_name into a JSON-friendly key that matches the old to_json_str formatting (underscores instead of dashes). To use these functions, an option parameter class must implement manual_options_to_json(std::vector<ManualOptions> const&, json&) and manual_options_from_json(std::vector<ManualOptions>&, json const&) to finalize the conversion for the manual options.
  • Finally, I added the manual option conversions to SDGenerationParams, as these are the parameters most relevant for network transfer between the client and the server. I also added json to_json(ArgOptions const&) and void from_json(json const&) convenience methods.
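
The flag-driven conversion described in the list above can be sketched roughly as follows. This is an illustrative approximation, not the PR's code: the flag names (`OPTION_FLAG_ASSIGNED`, `OPTION_FLAG_NO_NETWORK`), the simplified `IntOption` struct, the `FakeJson` alias (a `std::map` standing in for the real json type), and the `*_sketch` function names are all hypothetical, and the sketch assumes that no-network options are skipped in both directions.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical flag bits; the PR's actual OptionFlags may be defined differently.
enum OptionFlags : std::uint32_t {
    OPTION_FLAG_NONE       = 0,
    OPTION_FLAG_ASSIGNED   = 1u << 0,  // set once the parser has written the option
    OPTION_FLAG_NO_NETWORK = 1u << 1,  // option must not be sent over the network
};

// Simplified stand-in for one of the basic ArgOptions option types.
struct IntOption {
    std::string   long_name;  // e.g. "--sample-steps"
    int*          target;     // where the parsed value lives
    std::uint32_t flags;
};

// Stand-in for a JSON object; the real code uses a json type instead.
using FakeJson = std::map<std::string, int>;

// Strip leading dashes and turn the remaining dashes into underscores,
// e.g. "--sample-steps" becomes "sample_steps".
std::string json_key(const std::string& long_name) {
    const std::size_t start = long_name.find_first_not_of('-');
    std::string key = (start == std::string::npos) ? "" : long_name.substr(start);
    for (char& c : key) {
        if (c == '-') c = '_';
    }
    return key;
}

// Serialize only options that were explicitly assigned and may be transferred.
FakeJson options_to_json_sketch(const std::vector<IntOption>& options) {
    FakeJson out;
    for (const IntOption& opt : options) {
        if (!(opt.flags & OPTION_FLAG_ASSIGNED)) continue;
        if (opt.flags & OPTION_FLAG_NO_NETWORK) continue;
        out[json_key(opt.long_name)] = *opt.target;
    }
    return out;
}

// Apply incoming values and mark each touched option as assigned.
// Assumption: no-network options are ignored on input as well.
void options_from_json_sketch(std::vector<IntOption>& options, const FakeJson& in) {
    for (IntOption& opt : options) {
        if (opt.flags & OPTION_FLAG_NO_NETWORK) continue;
        const auto it = in.find(json_key(opt.long_name));
        if (it == in.end()) continue;
        *opt.target = it->second;
        opt.flags |= OPTION_FLAG_ASSIGNED;
    }
}
```

Because the key is derived from `long_name`, adding a new basic option in a scheme like this needs no extra serialization code; only the manual options require hand-written conversion hooks.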

@loci-dev temporarily deployed to stable-diffusion-cpp-prod on February 14, 2026 at 04:15 with GitHub Actions (inactive)
@loci-review

loci-review bot commented Feb 14, 2026

Overview

Analysis of 48,438 functions across stable-diffusion.cpp reveals mixed performance impacts from a single architectural commit introducing OptionFlags and refactoring JSON serialization. Modified: 204 functions; New: 131; Removed: 45; Unchanged: 48,058.

Binaries analyzed:

  • build.bin.sd-server: 515,491 nJ → 519,436 nJ (+0.765%)
  • build.bin.sd-cli: 480,110 nJ → 483,732 nJ (+0.754%)

Energy consumption increases by less than 1% for both binaries, so overall energy efficiency is essentially unchanged.
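
The percentages above follow from the usual relative-change formula. A minimal helper for re-checking them, with `percent_change` being a hypothetical name used only for illustration:

```cpp
// Relative change from `before` to `after`, expressed in percent.
// Positive values mean an increase, negative values a decrease.
double percent_change(double before, double after) {
    return (after - before) / before * 100.0;
}
```

Calling `percent_change(515491.0, 519436.0)` gives roughly 0.765 for sd-server, and `percent_change(480110.0, 483732.0)` gives roughly 0.754 for sd-cli, matching the reported figures.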

Function Analysis

SDGenerationParams::from_json_str (build.bin.sd-server) shows exceptional improvement: response time decreased 1,522,872 ns (34.4% faster: 4,428,253 ns → 2,905,380 ns), throughput time decreased 507 ns (77.7% faster: 652 ns → 145 ns). The refactoring replaced ~95 lines of manual JSON field deserialization with template-based options_from_json(), directly benefiting server request processing latency.

std::_Hashtable::end for CacheDitConditionState (build.bin.sd-server) regressed significantly: response time increased 162 ns (138.5%: 117 ns → 280 ns), throughput time increased 162 ns (194.6%: 83 ns → 245 ns). Because this function has no source changes, the regression most likely comes from a difference in compiler code generation rather than from the commit's logic. The function is called during DiT condition caching in inference paths, though the absolute impact remains minimal (<0.001% of typical inference time).

Standard library improvements: Multiple iterator functions (std::__advance variants) improved consistently by 43.2% (44.57 ns → 25.32 ns). Vector allocators showed divergent patterns: std::vector<BoolOption>::_S_max_size improved 49-55% while std::vector<ManualOption>::_S_max_size regressed 52-71%, correlating with struct size changes from added OptionFlags members.
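
To illustrate why a per-type helper such as `_S_max_size` is sensitive to struct layout, the sketch below compares two hypothetical option structs, one without and one with an added flags member. `OptionBefore`, `OptionAfter`, and `element_limit` are invented for this example and do not mirror the real option types. It shows only that the element size, and the limit derived from it, differs between the two instantiations; it does not by itself explain the measured timing changes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical option layout before the change.
struct OptionBefore {
    std::string name;
    std::string desc;
    bool*       target;
};

// Same layout with an added flags member, as OptionFlags would introduce.
struct OptionAfter {
    std::string   name;
    std::string   desc;
    bool*         target;
    std::uint32_t flags;
};

// Largest element count a std::vector<T> can theoretically hold.
// Standard libraries derive this from sizeof(T), so every element type
// gets its own instantiation with its own constants.
template <typename T>
std::size_t element_limit() {
    return std::vector<T>().max_size();
}
```

On typical 64-bit platforms `sizeof(OptionAfter)` exceeds `sizeof(OptionBefore)` by one pointer-sized slot (the flags member plus padding), and `element_limit<OptionAfter>()` is correspondingly smaller.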

Other analyzed functions (make_move_iterator, json::create, shared_ptr operations) showed compiler-driven changes in standard library code with minimal real-world impact on initialization paths.

Additional Findings

The commit focused on architectural improvements (eliminating code duplication, network-aware serialization) rather than performance optimization. No GPU operations or ML inference kernels were modified. The JSON parsing improvement directly benefits server deployments, while cache hashtable regression has negligible impact given its sub-microsecond absolute time relative to multi-second inference operations. Compiler version differences (likely GCC 13) drove most performance variations in standard library template instantiations.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.
