UPSTREAM PR #1217: feat(server): add generation metadata to png images by loci-dev · Pull Request #41 · auroralabs-loci/stable-diffusion.cpp

loci-dev · 2026-02-02T10:47:16Z

Note

Source pull request: leejet/stable-diffusion.cpp#1217

loci-review · 2026-02-02T11:52:23Z

No summary available at this time. Visit Loci Inspector to review detailed analysis.

loci-review · 2026-02-21T05:06:55Z

Overview

Analysis of 48,320 functions across two binaries reveals minimal performance impact. Modified functions: 111 (0.23%), new: 11, removed: 6, unchanged: 48,192 (99.73%).

Binaries analyzed:

build.bin.sd-cli: +0.708% power consumption (+3,398.65 nJ)
build.bin.sd-server: +0.721% power consumption (+3,717.22 nJ)

Changes stem from PNG metadata embedding feature additions across 5 files. Performance impacts are concentrated in C++ standard library functions rather than application code, likely due to compiler optimization differences between builds.
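For readers unfamiliar with how text metadata is usually carried in a PNG, the following is a minimal sketch of building a `tEXt` chunk as laid out in the PNG specification (4-byte big-endian length, chunk type, keyword, NUL separator, text, CRC-32 over type and data). It is not taken from the PR: `crc32_png`, `append_be32`, and `make_text_chunk` are hypothetical helper names, and the `parameters` keyword mentioned in the note below is an assumed convention rather than something this page confirms.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320), the checksum PNG uses for chunks.
std::uint32_t crc32_png(const std::uint8_t* data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // The mask is all ones when the low bit is set, zero otherwise.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Append a 32-bit value in big-endian (network) byte order.
void append_be32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 24));
    out.push_back(static_cast<std::uint8_t>(v >> 16));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v));
}

// Hypothetical helper (not from the PR): serialize one tEXt chunk.
// The keyword is expected to be 1-79 Latin-1 bytes; that is not validated here.
std::vector<std::uint8_t> make_text_chunk(const std::string& keyword,
                                          const std::string& text) {
    // "body" is chunk type + chunk data, which is exactly the span the CRC covers.
    std::vector<std::uint8_t> body = {'t', 'E', 'X', 't'};
    body.insert(body.end(), keyword.begin(), keyword.end());
    body.push_back(0);  // NUL separator between keyword and text
    body.insert(body.end(), text.begin(), text.end());

    std::vector<std::uint8_t> chunk;
    // The length field counts only the data bytes, not the 4-byte type.
    append_be32(chunk, static_cast<std::uint32_t>(body.size() - 4));
    chunk.insert(chunk.end(), body.begin(), body.end());
    append_be32(chunk, crc32_png(body.data(), body.size()));
    return chunk;
}
```

A caller would typically splice the returned bytes into the file after the `IHDR` chunk and before `IEND`, for example `make_text_chunk("parameters", "seed: 42")`. The bitwise CRC keeps the sketch short; a table-driven CRC would be the usual choice if many chunks were written.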

Function Analysis

Significant regressions (51-316% throughput increases):

__iter_equals_val (sd-cli): +316.56% throughput (+184.66ns), +233.86% response (+184.65ns). Used in std::find operations during tokenization and parameter validation. No source changes; STL implementation affected by compiler differences.
std::_Rb_tree::end/begin (both binaries, 3 instances): +289-307% throughput (+182-183ns), +222-228% response. Used in std::map iterations for configuration, embeddings, and parameter lookups. No source changes; red-black tree accessor functions affected by inlining decisions.
std::vector::end for MountPointEntry (sd-server): +306.60% throughput (+183.29ns), +227.57% response. Used in HTTP file request handling. Likely lost inlining optimization.
__val_comp_iter (sd-server): +260.22% throughput (+221.99ns), +186.75% response. Compiler-generated comparator for HTTP range coalescing. No source changes.
_M_bucket_index (sd-cli): +54.48% throughput (+40.52ns), +20.86% response. Hash table operations for CacheDitConditionState::cache_diffs.
make_shared<Conv2d> (sd-cli): +51.56% throughput (+44.10ns), +1.92% response. Affects model initialization, not inference.

Significant improvements:

std::vector<std::thread>::end (sd-cli): -75.41% throughput (-183.30ns), -69.13% response. Improves thread synchronization during model loading.
make_move_iterator (sd-server): -68.40% throughput (-168.52ns), -58.61% response. Better move semantics optimization.
Iterator operator+ for LoraModel (sd-server): -48.19% throughput (-69.31ns), -42.12% response. Improves LoRA weight patching.

Other analyzed functions showed negligible changes.

Additional Findings

All affected functions are in initialization, configuration, or post-processing paths, not in the critical ML inference loop. Core GPU operations (GGML tensor computations, diffusion steps, VAE decoding) remain unaffected. Cumulative worst-case overhead across all regressions, counting each regressed function once, is ~1 µs, negligible compared to typical inference time (2-10 seconds). The 0.7% power increase is acceptable for the added PNG metadata embedding functionality. The trade-off is justified: the change enables reproducibility features without impacting inference quality or speed.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

loci-dev temporarily deployed to stable-diffusion-cpp-prod February 2, 2026 10:47 — with GitHub Actions Inactive

loci-dev force-pushed the main branch from c0dc6dd to 473a170 on February 2, 2026 11:20

loci-dev force-pushed the main branch 27 times, most recently from 68f62a5 to 342c73d on February 9, 2026 04:49

loci-dev force-pushed the main branch from 342c73d to 8c51734 on February 10, 2026 04:52

wbruna added 2 commits February 10, 2026 14:49

feat(server): add generation metadata to png images

2b19c83

feat: add flag to disable the embedding of generation metadata

be6f95b

loci-dev force-pushed the main branch 2 times, most recently from 3ad80c4 to 74d69ae on February 12, 2026 04:47

loci-dev force-pushed the main branch from 74d69ae to 10ea7dd on February 20, 2026 04:16

loci-dev force-pushed the loci/pr-1217-sd_server_png_metadata branch from 9533c5e to be6f95b on February 21, 2026 04:12

loci-dev temporarily deployed to stable-diffusion-cpp-prod February 21, 2026 04:12 — with GitHub Actions Inactive

loci-dev wants to merge 2 commits into main from loci/pr-1217-sd_server_png_metadata
