Mest/feat/report generation behavior#20
Merged
Konohana0608 merged 2 commits on May 20, 2026
JavierGOrdonnez pushed a commit that referenced this pull request on May 21, 2026
Hunk 2 (export-button enabled check): restore Melanie's design that enables the button when workflow_results_json exists, not when a PDF glob matches.

Hunk 3 (workflow CLI args): remove the --report* args accidentally added to the workflow subprocess; the report is generated only on Export click (separate report_cli.py call), per PR #20.

The 4 spec-driven changes (min_inscribed_square_mm 50, partial update_images, criteria display ≤ ±25, update_images partial method) are untouched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Konohana0608 added a commit that referenced this pull request on May 22, 2026
… behavior, legend size (#21)
* updated the voila notebook to reflect new git path and to fix the matplotlib backend error.
* added makefile which can be used in the osparc service to maintain the service easily.
* Added the no-data-transparent png to LFS
* excluded the assets folder from the png ignoring setting.
* updated path to the no data transparent png in the voila since it is now part of the repo.
* updated readme with osparc workflow.
* feat: add auto-provisioning to voila notebook
- ensure_git_lfs(): auto-download git-lfs binary if missing
- ensure_project_repo(): clone repo + LFS pull if not present
- run_command(), build_subprocess_env(), read_project_version()
- run_sarsample() uses local PROJECT_ROOT instead of remote spec
- PACKAGE_VERSION read from pyproject.toml
* refactor: replace voila Python provisioning with Makefile targets
- Add install-repo target to Makefile (clone if absent, pull if present, warn-and-continue on network failure)
- lfs-pull also now warns-and-continues instead of hard-failing
- Add REPO_URL variable to Makefile
- Simplify voila Cell 4: drop ensure_git_lfs/ensure_project_repo/run_command in favour of a minimal bootstrap clone + make install-repo install-git-lfs lfs-pull
* simplify: assume Makefile is pre-deployed at workspace top level
- Notebook: remove Python bootstrap clone; call 'make' directly from WORKSPACE_ROOT (no path to Makefile needed)
- README: collapse first-time setup to a single 'make setup' step; document that Makefile ships with the oSPARC service; update setup target description to reflect install-repo behaviour
* [FEAT] further simplifications
* [FEAT] further simplify the voila
* [FEAT] clarify setup in README
* Revert "Additional Fixes"
* minor random change to trigger pull request popup on github.
* [FEAT] basic playwright testing in jupyter-math + voila wired and green * [FEAT] e2e test suite implemented - skipping features not-yet cherrypicked * [CI] CI runs only on PR * dummy push * [CI] disable slow tests * [FIX] Scan for buttons grid + make both CI stages parallel (#5) * [FIX] Filter toggle button grid: fix PROJECT_ROOT discovery + un-skip 3 tests Root cause: the notebook hardcoded PROJECT_ROOT = WORKSPACE_ROOT / "SAR-Pattern-Validation" but both the pytest fixture (conftest.py voila_server symlink) and the container harness (scripts/run_in_jupyter_math.sh bind mount) name the directory "sar-pattern-validation" (lowercase). SimulatedFilesDB.create_simulated_files_db() called Path.glob("**/*.csv") on the non-existent path, got zero files, and the RadioButtonGrid rendered with zero toggle buttons — the UI showed no filter options. Fix: replace the hardcoded name with a two-candidate discovery that checks "SAR-Pattern-Validation" first (osparc production), then "sar-pattern-validation" (test harness + container), then falls back to the first candidate as a best guess. The 161 CSV files in data/database/ now load correctly in all environments. Tests: remove @pytest.mark.skip from test_filter_toggle_buttons_are_visible, test_clicking_filter_button_activates_it, and test_run_button_enables_after_upload_and_unique_filter -- all three were blocked only by the missing toggle buttons in the DOM. 
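The two-candidate PROJECT_ROOT discovery described in the fix above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name `discover_project_root` and the constant `CANDIDATE_NAMES` are hypothetical; only the two directory names and their priority order come from the commit message.

```python
from pathlib import Path

# Directory names in priority order, as listed in the commit message:
# oSPARC production first, then the test harness / container name.
CANDIDATE_NAMES = ("SAR-Pattern-Validation", "sar-pattern-validation")


def discover_project_root(workspace_root: Path) -> Path:
    """Return the first candidate that exists under workspace_root.

    If neither exists, fall back to the first candidate as a best guess,
    so later code fails with a recognisable path instead of silently
    globbing a wrong directory.
    """
    candidates = [workspace_root / name for name in CANDIDATE_NAMES]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return candidates[0]
```

With this shape, a call such as `discover_project_root(WORKSPACE_ROOT)` would replace the hardcoded path, and the CSV glob would run against whichever directory is actually present.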
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * [FIX] longer wait for button to become active * [FEAT] improve typing PSSARRowValues Co-authored-by: Copilot <copilot@github.com> * [CI] make both CI stages parallel --------- Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com> Co-authored-by: Copilot <copilot@github.com> * [FEAT] Task 6.1 - reverse registration direction (gamma in measured frame) (#6) Per MGD 2026-04-24 feedback: register simulated sSAR onto measured (was the reverse), keep measured in measurement coordinates (no centering), and compute gamma in the measured frame so failing regions are visible in original measurement coordinates. Cherry-picked from rename branch (commit 9af6d42), with conflicts resolved to exclude the unrelated _select_registration_mask refactor. Backend - workflows.py: Rigid2DRegistration uses fixed=measured, moving=reference; registration overlay built against measured_db; GammaMapEvaluator receives reference_to_measured_transform. - gamma_eval.py: rename measured_to_reference_transform -> reference_to_measured_transform; resample reference onto measured grid; evaluation_mask = measured_mask ∩ resampled(reference_mask) on measured grid. - image_loader.py: drop peak-centering of measured plot axes; rename "Measured, After Registration" panel to "Simulated, After Registration". - plotting.py: swap overlay legend (red=Measured, blue=Reference); axis labels back to non-primed measured coords (x_e, y_e). Tests (42 pass, 1 skip — green) - test_gamma_map_evaluator + workflow tests updated for renamed kwarg. - test_tutorial_validation regenerated artifacts (pass_rate stays 100%, evaluated_pixel_count 13884 -> 11452 because gamma now runs on the smaller measured grid). Tutorial notebook - tutorial_gamma_pattern_validation_notebook.ipynb: registration cell flipped; kwarg + variable renames; markdown updated. 
voila.ipynb: no changes needed (does not reference the old plot panel names directly; uses workflow at higher level). Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * [FEAT] Task 6.4 - Feedback banners (#7) * [FEAT] Task 6.1 - reverse registration direction (gamma in measured frame) Per MGD 2026-04-24 feedback: register simulated sSAR onto measured (was the reverse), keep measured in measurement coordinates (no centering), and compute gamma in the measured frame so failing regions are visible in original measurement coordinates. Cherry-picked from rename branch (commit 9af6d42), with conflicts resolved to exclude the unrelated _select_registration_mask refactor. Backend - workflows.py: Rigid2DRegistration uses fixed=measured, moving=reference; registration overlay built against measured_db; GammaMapEvaluator receives reference_to_measured_transform. - gamma_eval.py: rename measured_to_reference_transform -> reference_to_measured_transform; resample reference onto measured grid; evaluation_mask = measured_mask ∩ resampled(reference_mask) on measured grid. - image_loader.py: drop peak-centering of measured plot axes; rename "Measured, After Registration" panel to "Simulated, After Registration". - plotting.py: swap overlay legend (red=Measured, blue=Reference); axis labels back to non-primed measured coords (x_e, y_e). Tests (42 pass, 1 skip — green) - test_gamma_map_evaluator + workflow tests updated for renamed kwarg. - test_tutorial_validation regenerated artifacts (pass_rate stays 100%, evaluated_pixel_count 13884 -> 11452 because gamma now runs on the smaller measured grid). Tutorial notebook - tutorial_gamma_pattern_validation_notebook.ipynb: registration cell flipped; kwarg + variable renames; markdown updated. voila.ipynb: no changes needed (does not reference the old plot panel names directly; uses workflow at higher level). 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * [FEAT] few improvements * [FEAT] add feedback banners * [CI] save playwright videos * [CI] record final screenshot + video for every e2e test Local (make test-voila-e2e): - scripts/run_in_jupyter_math.sh: add ffmpeg to apt-get (video encoding); export PLAYWRIGHT_ARTIFACTS_DIR pointing into the bind-mounted repo dir so artifacts survive container exit at test-artifacts/playwright/ on host. - Keep --video on + --tracing retain-on-failure flags. Screenshots (both paths): - tests/test_voila_e2e.py: _capture_final_screenshot autouse fixture calls page.screenshot() after every test (pass and fail). Reads PLAYWRIGHT_ARTIFACTS_DIR env var; falls back to test-artifacts/playwright/. Named after the test function for unambiguous review before committing. CI (.github/workflows/ci.yml): - Change --screenshot only-on-failure to --screenshot on (consistent with explicit fixture; passes for every test so review is always possible). .gitignore: exclude test-artifacts/. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [FEAT] avoid re-running if same files
---------
Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* [FEAT] Vectorize gamma, add --output-dir, lock deps, add lint+type CI job
- Vectorize _gamma_2d_peak_normalized: replace per-offset Python loop with (N, H, W) NumPy broadcast + np.min, eliminating repeated element-wise iterations while keeping identical numerical output
- Add --output-dir CLI flag: writes results.json, gamma_map.npy, gamma_map.png, and failure_map.png for pipeline/batch use
- Commit uv.lock and switch CI syncs to --frozen for reproducible builds
- Add .github/dependabot.yml with weekly pip + github-actions groups
- Add parallel CI job "Lint & type check" running ruff check (blocking) and ty check (non-blocking) against src/; add [tool.ty] to pyproject.toml
* [FEAT] make ty pass
* [CI] Divide tests into standard (fast), slow, and/or validation. Git LFS pull for the latest.
* [CI] further trying to fix CI
* [CI] move test which needs LFS to "validation" set; include slow test artifact uploading
* [CI] move LFS pull after checkout
* [CI] try fixing Git LFS pulling
* [CI] safe installation of Git LFS
* [FEAT] Task 6.2 - measurement-area inputs with bounded validation
Per MGD 2026-04-24 feedback (slide 3): expose user-controlled measurement-area dimensions (x, y in mm) with hard bounds. When set, the plot canvas is forced to a centered square of side max(x, y) so the rectangular measurement region is inscribed.
Config
- WorkflowConfig (dataclass): new measurement_area_x_mm and measurement_area_y_mm fields (Optional[float], default None for backward compatibility).
- WorkflowConfigSchema (pydantic): Field constraints `gt=22, le=600` on x and `gt=22, le=400` on y. Lower bound is exclusive: a 22 mm × 22 mm 10 g cube face must fit strictly inside the area (ties into Task 6.5).
- Both must be set together; model_validator raises if only one is provided. - When both set, model_validator derives plotting.window_mm to (-side/2, side/2, -side/2, side/2) with side = max(x, y), centered on origin. CLI - workflows.py: new --measurement_area_x_mm and --measurement_area_y_mm args. Tests - 9 new tests covering: out-of-range upper/lower bounds on both axes, unpaired-set rejection, square-window derivation when x>y and y>x, and unchanged default window when neither is set. Voila UI integration (text-box + history) is MEST scope; Task 6.6 will route validation errors through the warning channel to the UI banner. Cherry-picked manually from donor 6bafff7 (jgo/feedback-changes-clean) — donor diff dragged in unrelated Pydantic conversion of WorkflowResult, so the changes were ported by hand to preserve the dataclass+schema split. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * [CI] further try fix GitLFS in CI * [CI] try fixing Git LFS pull yet again * [CI] add Coverage to validation tests + proper skip in voila tests * [CHORE] update gitignore * [FEAT] fix CI typo * [CI] pull example data also in Voila E2E tests * [CI] minor CI edits * [FEAT] add measurement area inputs & workflow * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test All 12 voila e2e tests now pass (was 7 failing after table port). 
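The Task 6.2 measurement-area rules above (exclusive lower bound, inclusive upper bound, both-or-neither, square window of side max(x, y)) can be sketched in plain Python. This is an illustrative stand-in for the pydantic `Field` constraints and `model_validator` the commit describes, not the project's schema; `derive_window_mm` and the two bounds constants are hypothetical names.

```python
from typing import Optional, Tuple

# (exclusive lower bound, inclusive upper bound) in mm, mirroring
# gt=22, le=600 on x and gt=22, le=400 on y from the commit message.
X_BOUNDS_MM = (22.0, 600.0)
Y_BOUNDS_MM = (22.0, 400.0)

Window = Tuple[float, float, float, float]


def derive_window_mm(x_mm: Optional[float], y_mm: Optional[float]) -> Optional[Window]:
    """Validate the measurement area and derive the centered square window.

    Returns None when neither dimension is set (default window unchanged).
    """
    if x_mm is None and y_mm is None:
        return None
    if x_mm is None or y_mm is None:
        raise ValueError("measurement_area_x_mm and measurement_area_y_mm must be set together")
    for name, value, (low, high) in (("x", x_mm, X_BOUNDS_MM), ("y", y_mm, Y_BOUNDS_MM)):
        if not (low < value <= high):
            raise ValueError(f"measurement_area_{name}_mm={value} must satisfy {low} < value <= {high}")
    side = max(x_mm, y_mm)
    return (-side / 2, side / 2, -side / 2, side / 2)
```

For example, a 100 mm × 60 mm area yields a 100 mm square window centered on the origin, so the rectangular region is inscribed in the plot canvas.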
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] port two-table results layout from jgo/feedback-changes-clean (M6 T5) Replace single-row sSAR table with two colour-coded tables: - Table 1 (psSAR): Result badge, Measured@power, Measured@30dBm, Reference@30dBm, Scaling Error [%], Criteria [%] - Table 2 (pattern match): Result badge, Pass rate [%], Criteria Add _TH/_TD style constants; replace ResultTableRow/Column enums. Pass badge = #0090D0 (blue), Fail badge = #9B2423 (red). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] update e2e testing dependencies * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test All 12 voila e2e tests now pass (was 7 failing after table port). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "[TEST] update e2e assertions for two-table results layout" This reverts commit cad825cf8b9393cba70048b6e0d207abb71eeb81. 
* [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test * Fix table headers to match test expectations: add comma after Reference/Measured * Fix table headers to match test expectations: add comma after Reference/Measured * [CI] trying to fix voila E2E * Stabilize Voila E2E workflow-cycle wait on CI * [CI] Git LFS pull the validation database file * [CI] identifying right database sample * [CI] another fix to e2e voila * [CHORE] remove extra configs of ty tool * [TEST] update measurement-validation test artifacts to match the registration direction change * [CI] increase timeout voila tests * [TEST] fix e2e tests * [TEST] update measurement-validation test artifacts to match the registration direction change (#16) Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss> * [FEAT] Vectorize gamma, add --output-dir, lock deps, add lint+type CI job, add GitLFS to get E2E tests to pass in the CI (#8) * [FEAT] Vectorize gamma, add --output-dir, lock deps, add lint+type CI job - Vectorize _gamma_2d_peak_normalized: replace per-offset Python loop with (N, H, W) NumPy broadcast + np.min, eliminating repeated element-wise iterations while keeping identical numerical output - Add --output-dir CLI flag: writes results.json, gamma_map.npy, gamma_map.png, and failure_map.png for pipeline/batch use - Commit uv.lock and switch CI syncs to --frozen for reproducible builds - Add .github/dependabot.yml with weekly pip + github-actions groups - Add parallel CI job "Lint & type check" running ruff check (blocking) and ty check (non-blocking) against src/; add [tool.ty] to 
pyproject.toml
* [FEAT] make ty pass
* [CI] Divide tests into standard (fast), slow, and/or validation. Git LFS pull for the latest.
* [CI] further trying to fix CI
* [CI] move test which needs LFS to "validation" set; include slow test artifact uploading
* [CI] move LFS pull after checkout
* [CI] try fixing Git LFS pulling
* [CI] safe installation of Git LFS
* [CI] further try fix GitLFS in CI
* [CI] try fixing Git LFS pull yet again
* [CI] add Coverage to validation tests + proper skip in voila tests
* [CHORE] update gitignore
* [FEAT] fix CI typo
* [CI] pull example data also in Voila E2E tests
* [CI] minor CI edits
* [TEST] update e2e assertions for two-table results layout
- Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order
- Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition)
- Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test
All 12 voila e2e tests now pass (was 7 failing after table port).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Revert "[TEST] update e2e assertions for two-table results layout"
This reverts commit cad825cf8b9393cba70048b6e0d207abb71eeb81.
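The gamma vectorization listed above ("per-offset Python loop" replaced by an (N, H, W) broadcast plus a minimum over offsets) can be illustrated with a simplified peak-normalized gamma. This is only a sketch under assumptions: the real `_gamma_2d_peak_normalized` is not shown in this log, so the function names, the default criteria (`dd_fraction`, `dta_mm`, `radius_px`), and the use of `np.roll` (which wraps at the borders) are all hypothetical simplifications.

```python
import numpy as np


def _offsets(radius_px):
    """All integer (dy, dx) pixel offsets in a square search window."""
    span = range(-radius_px, radius_px + 1)
    return [(dy, dx) for dy in span for dx in span]


def gamma_loop(measured, reference, spacing_mm, dd_fraction=0.1, dta_mm=2.0, radius_px=3):
    """Baseline: one full-image pass per offset, keeping a running minimum."""
    peak = reference.max()
    best = np.full(measured.shape, np.inf)
    for dy, dx in _offsets(radius_px):
        shifted = np.roll(reference, (dy, dx), axis=(0, 1))  # wraps at edges (simplification)
        dose_term = ((measured - shifted) / (dd_fraction * peak)) ** 2
        dist_term = ((dy * spacing_mm) ** 2 + (dx * spacing_mm) ** 2) / dta_mm**2
        best = np.minimum(best, np.sqrt(dose_term + dist_term))
    return best


def gamma_vectorized(measured, reference, spacing_mm, dd_fraction=0.1, dta_mm=2.0, radius_px=3):
    """Same result: stack the N shifted references, broadcast once, reduce with min."""
    peak = reference.max()
    offsets = _offsets(radius_px)
    stack = np.stack([np.roll(reference, off, axis=(0, 1)) for off in offsets])  # (N, H, W)
    dist_sq = np.array([(dy * spacing_mm) ** 2 + (dx * spacing_mm) ** 2 for dy, dx in offsets])
    dose_term = ((measured[None, :, :] - stack) / (dd_fraction * peak)) ** 2
    gamma_all = np.sqrt(dose_term + dist_sq[:, None, None] / dta_mm**2)  # (N, H, W)
    return gamma_all.min(axis=0)
```

The vectorized form trades memory (N copies of the image) for fewer Python-level iterations; for large search windows that trade-off may need chunking over the offset axis.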
* [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test * Fix table headers to match test expectations: add comma after Reference/Measured * Fix table headers to match test expectations: add comma after Reference/Measured * [CI] trying to fix voila E2E * Stabilize Voila E2E workflow-cycle wait on CI * [CI] Git LFS pull the validation database file * [CI] identifying right database sample * [CI] another fix to e2e voila * [CHORE] remove extra configs of ty tool * [TEST] update measurement-validation test artifacts to match the registration direction change --------- Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * 6.5 input mask min size (#13) * [FEAT] Vectorize gamma, add --output-dir, lock deps, add lint+type CI job - Vectorize _gamma_2d_peak_normalized: replace per-offset Python loop with (N, H, W) NumPy broadcast + np.min, eliminating repeated element-wise iterations while keeping identical numerical output - Add --output-dir CLI flag: writes results.json, gamma_map.npy, gamma_map.png, and failure_map.png for pipeline/batch use - Commit uv.lock and switch CI syncs to --frozen for reproducible builds - Add .github/dependabot.yml with weekly pip + github-actions groups - Add parallel CI job "Lint & type check" running ruff check (blocking) and ty check (non-blocking) against src/; add [tool.ty] to pyproject.toml * [FEAT] make ty pass * [CI] Divide tests into standard (fast), slow, and/or validation. Git LFS pull for the latest. 
* [CI] further trying to fix CI
* [CI] move test which needs LFS to "validation" set; include slow test artifact uploading
* [CI] move LFS pull after checkout
* [CI] try fixing Git LFS pulling
* [CI] safe installation of Git LFS
* [CI] further try fix GitLFS in CI
* [CI] try fixing Git LFS pull yet again
* [CI] add Coverage to validation tests + proper skip in voila tests
* [CHORE] update gitignore
* [FEAT] fix CI typo
* [CI] pull example data also in Voila E2E tests
* [CI] minor CI edits
* [TEST] update e2e assertions for two-table results layout
- Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order
- Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition)
- Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test
All 12 voila e2e tests now pass (was 7 failing after table port).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [FEAT] Task 6.5 - inscribed 22x22 mm square mask validity check
Per MGD 2026-04-24 feedback (slide 7): the gamma comparison is only valid when an axis-aligned 22 mm × 22 mm square (the face of the 10 g averaging cube) fits entirely inside the post-registration, post-noise-filter mask, without rotation. Per-axis bounding-box checks are insufficient (an L-shaped mask whose bounding box is 30×30 mm can pass per-axis but admit no inscribed 22×22 mm square).
Source changes
- gamma_eval.py: new GammaMapEvaluator.evaluation_mask_fits_axis_aligned_square_mm helper plus a free function _mask_fits_axis_aligned_square_mm that uses binary_erosion with a rectangular structuring element sized so its physical extent is at least side_mm on each axis.
- workflow_config.py: new DEFAULT_MIN_INSCRIBED_SQUARE_MM = 22.0 constant and configurable WorkflowConfig.min_inscribed_square_mm field.
- workflow_schema.py: Pydantic Field(gt=0) constraint on the new config field. - workflows.py: after evaluator.compute(), check whether the configured inscribed square fits inside evaluation_mask, log a WARNING when it does not, and surface the result on WorkflowResult as mask_fits_min_inscribed_square (boolean) plus min_inscribed_square_mm (the threshold actually used). UI/error-channel surfacing is Task 6.6. - workflows.py: new --min_inscribed_square_mm CLI arg. Tests (3 new in test_gamma_map_evaluator.py) - 22×22 mm square mask at 1 mm spacing → passes the 22 mm check - 21×21 mm square mask at 1 mm spacing → fails the 22 mm check - L-shape (30×10 mm horizontal arm + 10×30 mm vertical arm) whose bounding box passes per-axis checks → fails the 22 mm inscribed check, but a 10 mm inscribed check still passes All fast (58) and workflow + CLI slow (25) tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Revert "[TEST] update e2e assertions for two-table results layout" This reverts commit cad825cf8b9393cba70048b6e0d207abb71eeb81. 
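The Task 6.5 inscribed-square check can be sketched as below. Note the swapped-in technique: the commit says the project uses `binary_erosion` with a rectangular structuring element, whereas this sketch uses a summed-area table in plain NumPy, which should give the same yes/no answer for a rectangular window. The function name and signature here are hypothetical, chosen only to echo the commit's wording.

```python
import math

import numpy as np


def mask_fits_axis_aligned_square_mm(mask, spacing_mm, side_mm):
    """Return True if an axis-aligned square of side_mm fits inside the True region.

    spacing_mm is (row_spacing, col_spacing). The window size in pixels is
    rounded up so its physical extent is at least side_mm on each axis.
    """
    if side_mm <= 0:
        raise ValueError("side_mm must be positive")
    mask = np.asarray(mask, dtype=bool)
    n_rows = math.ceil(side_mm / spacing_mm[0])
    n_cols = math.ceil(side_mm / spacing_mm[1])
    if n_rows > mask.shape[0] or n_cols > mask.shape[1]:
        return False
    # Summed-area table with a leading zero row and column.
    sat = np.zeros((mask.shape[0] + 1, mask.shape[1] + 1), dtype=np.int64)
    sat[1:, 1:] = mask.cumsum(axis=0).cumsum(axis=1)
    # Number of True pixels in every n_rows x n_cols window.
    window_sums = (
        sat[n_rows:, n_cols:]
        - sat[:-n_rows, n_cols:]
        - sat[n_rows:, :-n_cols]
        + sat[:-n_rows, :-n_cols]
    )
    return bool((window_sums == n_rows * n_cols).any())
```

The three test cases named in the commit (22×22 passes, 21×21 fails, L-shape with a 30×30 mm bounding box fails at 22 mm but passes at 10 mm) map directly onto this function at 1 mm spacing.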
* [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test * Fix table headers to match test expectations: add comma after Reference/Measured * Fix table headers to match test expectations: add comma after Reference/Measured * [CI] trying to fix voila E2E * Stabilize Voila E2E workflow-cycle wait on CI * [CI] Git LFS pull the validation database file * [CI] identifying right database sample * [CI] another fix to e2e voila * [CHORE] remove extra configs of ty tool * [TEST] update measurement-validation test artifacts to match the registration direction change * [FIX] wrong merged schema fields --------- Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Bump actions/upload-artifact from 4 to 7 (#17) Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 7. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/v4...v7) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump actions/checkout from 4 to 6 (#18) Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6. 
- [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/checkout/compare/v4...v6) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [FEAT] add user-configurable noise floor input (0 ≤ noise_floor ≤ 0.1 W/kg) (#15) * [FEAT] Vectorize gamma, add --output-dir, lock deps, add lint+type CI job - Vectorize _gamma_2d_peak_normalized: replace per-offset Python loop with (N, H, W) NumPy broadcast + np.min, eliminating repeated element-wise iterations while keeping identical numerical output - Add --output-dir CLI flag: writes results.json, gamma_map.npy, gamma_map.png, and failure_map.png for pipeline/batch use - Commit uv.lock and switch CI syncs to --frozen for reproducible builds - Add .github/dependabot.yml with weekly pip + github-actions groups - Add parallel CI job "Lint & type check" running ruff check (blocking) and ty check (non-blocking) against src/; add [tool.ty] to pyproject.toml * [FEAT] make ty pass * [CI] Divide tests into standard (fast), slow, and/or validation. Git LFS pull for the latest. 
* [CI] further trying to fix CI
* [CI] move test which needs LFS to "validation" set; include slow test artifact uploading
* [CI] move LFS pull after checkout
* [CI] try fixing Git LFS pulling
* [CI] safe installation of Git LFS
* [CI] further try fix GitLFS in CI
* [CI] try fixing Git LFS pull yet again
* [CI] add Coverage to validation tests + proper skip in voila tests
* [CHORE] update gitignore
* [FEAT] fix CI typo
* [CI] pull example data also in Voila E2E tests
* [CI] minor CI edits
* [TEST] update e2e assertions for two-table results layout
- Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order
- Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition)
- Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test
All 12 voila e2e tests now pass (was 7 failing after table port).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Revert "[TEST] update e2e assertions for two-table results layout"
This reverts commit cad825cf8b9393cba70048b6e0d207abb71eeb81.
* feat: add user-configurable noise floor input (0 ≤ noise_floor ≤ 0.1 W/kg) - Add NOISE_FLOOR_MAX = 0.1 constant to workflow_config.py - Add BoundedFloatText widget (min=0, max=0.1, step=0.001) to voila UI - Wire noise_floor into run-key cache invalidation, subprocess --noise_floor arg, state JSON persistence (_on_noise_floor_change + save_workflow_state), and restore - Add 4 Playwright E2E tests: visible, default value, clamp at max, persist on reload Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test * Fix table headers to match test expectations: add comma after Reference/Measured * Fix table headers to match test expectations: add comma after Reference/Measured * [CI] trying to fix voila E2E * Stabilize Voila E2E workflow-cycle wait on CI * [CI] Git LFS pull the validation database file * [CI] identifying right database sample * [CI] another fix to e2e voila * [CHORE] remove extra configs of ty tool * [TEST] update measurement-validation test artifacts to match the registration direction change * [CI] fix voila tests --------- Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * [FIX] Fix NameError in _update_analytical_results - Fixed undefined 'pass_rate' variable on line 1013: Changed {pass_rate:.1f} to {sarsample_results.pass_rate_percent:.1f} - Added missing result_cell() local function definition - Removed dead code block with undefined variables (result_badge, values, table_html) This fixes the E2E test failures that were 
introduced after merging main-melanie branch. The bug was caused by incomplete refactoring during merge conflict resolution. * [FEAT] Task 6.6 - ValidationIssue channel: MASK_TOO_SMALL + CSV_FORMAT_ERROR emit sites - Add `ValidationIssue` dataclass (severity/code/message/details) to errors.py - `WorkflowExecutionError` now carries an optional `.issue` for fatal errors - `WorkflowResult` gains `issues: list[ValidationIssue]` field - `_complete_workflow` emits MASK_TOO_SMALL warning issue instead of just logging - `CsvFormatError` is caught specifically and wrapped as CSV_FORMAT_ERROR issue - `workflow_cli.py` error JSON payload includes issue dict when present - `voila.ipynb` WorkflowResults model gets issues/mask fields; success banner checks result.issues and shows warning/error banners for non-fatal issues - Two new tests covering MASK_TOO_SMALL and CSV_FORMAT_ERROR issue codes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] Fix banner error path and add MASK_TOO_SMALL E2E test - Fix run_sarsample to always read stdout (CLI emits JSON there on both success and error, not stderr) - Guard against missing 'result' key when workflow status is 'error'; extract curated message from structured issue when available - Add 'Warning:' as a terminal state in _wait_for_workflow_cycle - Add test_mask_too_small_shows_warning_banner: generates a tiny 15 mm Gaussian CSV inline, uploads it, and asserts the MASK_TOO_SMALL warning banner appears Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.1+§B.2 + §V.1+§V.2: empty fixed mask crashes ITK B1/V1: when noise_floor ≥ measured peak, the registration fixed mask is all-zero. SimpleITK crashes with "VirtualSampledPointSet must have 1 or more points" — a raw ITK traceback in the Voila banner. Fix: guard in _complete_workflow after make_metric_masks(); raises ValidationIssue(EMPTY_MEASURED_MASK) with actionable message before registration is attempted. 
Add measured_raw_peak attribute to SARImageLoader so the guard can report the actual peak value. B2/V2: the generic `except Exception` handler in _complete_workflow re-wrapped any WorkflowExecutionError raised inside the try block, discarding the structured .issue payload. Fix: add `except WorkflowExecutionError: raise` as first handler. Add SPEC.md with §I/§V/§B sections. Add test test_complete_workflow_v1_empty_measured_mask_raises_issue. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] fix widget notation issue that did not allow voila to start * [SPEC] add §M merge log for squash-merge tracking Records every branch merged into main-melanie with tip commit hash, date, and content summary. Needed because squash-merges rewrite tip hashes, making the original branch tip the only reliable provenance anchor. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] add debugging tooling * backprop §B.3 + §V.3: voila widget TraitError at kernel startup caused E2E timeout widgets.Layout(align_items="flex_start") — CSS underscore instead of hyphen — caused voila to fail at startup; all Playwright tests timed out with no useful signal. §V3 enforces that the notebook must execute in a Jupyter kernel without exception before Playwright starts, surfaced via a new notebook_smoke pytest step in the e2e-tests CI job. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.3 + §V.3: MASK_TOO_SMALL pre-registration check missing V3: measured_mask_u8 after make_metric_masks() now checked against min_inscribed_square_mm before registration runs. Fires independently of (and in addition to) the existing post-registration check on evaluator.evaluation_mask. Test updated: 1000 mm threshold now expects 2 issues (both checkpoints fire). New test_complete_workflow_v3_pre_registration_mask_too_small uses a large grid with a narrow Gaussian (σ=4 mm, noise_floor=0.05) to drive the pre-registration check with a realistic 22 mm threshold. 
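The §B.1/§B.2 fixes above (a structured issue raised before registration, and an `except WorkflowExecutionError: raise` handler placed first so the `.issue` payload survives) can be sketched as follows. The field names follow the commit's description of `ValidationIssue` (severity/code/message/details); the function `complete_workflow`, its arguments, and the message text are hypothetical stand-ins for the real `_complete_workflow`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ValidationIssue:
    severity: str  # e.g. "warning" or "error"
    code: str      # e.g. "EMPTY_MEASURED_MASK"
    message: str
    details: Dict[str, Any] = field(default_factory=dict)


class WorkflowExecutionError(RuntimeError):
    """Fatal workflow error that may carry a structured issue."""

    def __init__(self, message: str, issue: Optional[ValidationIssue] = None):
        super().__init__(message)
        self.issue = issue


def complete_workflow(noise_floor: float, measured_peak: float) -> str:
    """Toy workflow body showing the guard and the handler ordering."""
    try:
        if noise_floor >= measured_peak:
            # Guard: the fixed mask would be all-zero, so stop before registration.
            issue = ValidationIssue(
                severity="error",
                code="EMPTY_MEASURED_MASK",
                message="Noise floor is at or above the measured peak; lower the noise floor.",
                details={"noise_floor": noise_floor, "measured_peak": measured_peak},
            )
            raise WorkflowExecutionError(issue.message, issue=issue)
        return "ok"
    except WorkflowExecutionError:
        raise  # must come first: re-wrapping below would drop .issue
    except Exception as exc:
        raise WorkflowExecutionError(f"Workflow failed: {exc}") from exc
```

Without the first handler, the generic `except Exception` branch would catch the structured error and replace it with one whose `.issue` is `None`, which is the behaviour the commit describes fixing.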
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.4 + §V.3 correction: MASK_TOO_SMALL must be a hard error Both pre-registration (measured_mask_u8) and post-registration (evaluator.evaluation_mask) checks now raise WorkflowExecutionError with severity="error" instead of appending a warning to issues. Workflow stops at the first failing check; the 22 mm inscribed-square rule is a hard validity gate, not an advisory. Tests updated to use pytest.raises(WorkflowExecutionError). E2E test updated: banner is now "Error:" not "Warning:". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] Gamma eval mask excludes sub-cutoff (noise-filtered) pixels * [FEAT] Regenerate measurement validation artifacts after noise-mask fix (Task 6.4) Evaluated pixel count drops ~65% across all 9 cases (sub-cutoff pixels correctly excluded). Pass rate remains 100% on all cases. Cites V3. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(smoke-test): drop nbconvert --execute, run notebook cells as Python script nbconvert --execute fails with "No such kernel named python-maths" because the jupyter-math container registers the kernel under /home/jovyan/ which is not visible when GitHub Actions runs the container as root. Instead, extract the notebook's code cells and execute them as a plain Python subprocess — no kernel infrastructure required, same class of errors caught (TraitError, ImportError, SyntaxError). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T1: Create jgo/ui-adjustments branch from main-melanie HEAD Branch point: 228a89f (post jgo/6.6 merge). Ready for T2 cherry-picks. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] adaptive voila port * [FEAT] plotting: rename 'Simulated' to 'Reference' in registration plot title Aligns the third top-row plot title with the rest of the UI (which already uses 'Reference' rather than 'Simulated' for the same data). 
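The structured-issue channel described in the Task 6.6 and backprop §B.2/§B.4 commits above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the field names (severity/code/message/details), the `.issue` attribute, and the `except WorkflowExecutionError: raise` ordering come from the commit messages, while `check_mask_size`, `complete_workflow`, and the message wording are hypothetical stand-ins.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ValidationIssue:
    """Structured issue; field names follow the commit message."""
    severity: str  # "warning" or "error"
    code: str      # e.g. "MASK_TOO_SMALL", "CSV_FORMAT_ERROR"
    message: str
    details: dict[str, Any] = field(default_factory=dict)


class WorkflowExecutionError(Exception):
    """Fatal workflow error that may carry a structured issue."""

    def __init__(self, message: str, issue: Optional[ValidationIssue] = None):
        super().__init__(message)
        self.issue = issue


def check_mask_size(inscribed_square_mm: float,
                    min_inscribed_square_mm: float = 22.0) -> None:
    """Hypothetical helper: treat the inscribed-square rule as a hard gate."""
    if inscribed_square_mm < min_inscribed_square_mm:
        issue = ValidationIssue(
            severity="error",
            code="MASK_TOO_SMALL",
            message=(f"Mask fits only a {inscribed_square_mm:g} mm square; "
                     f"at least {min_inscribed_square_mm:g} mm is required."),
            details={"inscribed_square_mm": inscribed_square_mm},
        )
        raise WorkflowExecutionError(issue.message, issue=issue)


def complete_workflow(inscribed_square_mm: float) -> str:
    """Hypothetical stand-in for the real workflow entry point."""
    try:
        check_mask_size(inscribed_square_mm)
        return "ok"
    except WorkflowExecutionError:
        # Listed first so the structured .issue payload is not re-wrapped.
        raise
    except Exception as exc:
        raise WorkflowExecutionError(f"Workflow failed: {exc}") from exc
```

Placing the specific handler before the generic one is what keeps `.issue` intact for the caller, which can then serialize it into the error JSON payload.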
* T2: Port plotting overlays from develop (cropped-area + noise-floor) - workflow_config.py: DEFAULT_PLOT_{NOT_EVALUATED,CROPPED_DATA,NOISE_FLOOR}_COLOR constants; PlottingConfig adds not_evaluated_color, cropped_data_color, noise_floor_color, measurement_area_x/y_mm, center_x/y_mm fields. - plotting.py: replace with develop final — adds _overlay_measurement_limit_mask, _compute_cropped_data_mask, _overlay_cropped_measurement_data, _overlay_noise_floor, _apply_overlay_legend; updates show_registration_overlay / plot_loaded_images / plot_sar_image / plot_gamma_results to accept noise_floor_mask + support_mask. - image_loader.py: cache _measured/_reference_noise_floor_mask in make_metric_masks(); thread support_mask + noise_floor_mask through plot_loaded / plot_aligned; add reference_plotting_config (centre=0,0) override; import dataclasses.replace. - gamma_eval.py: add noise_floor_mask param to show(). - workflows.py: pass loader._measured_noise_floor_mask to show_registration_overlay and evaluator.show(). Cites: C2, V5. 49 tests pass, ty clean on src/. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T3: notebook layout + center plots (d774c11, aed839e, 264c2d6) - workflows.py: center plot window on measured data centroid; uses PlottingConfig.measurement_area_{x,y}_mm if set, else 10%-padded span - workflows.py: fast-path rescale when only power level changes - notebooks/voila.ipynb: result table below images (d774c11) - notebooks/voila.ipynb: inline feedback banner next to run button, SAR pattern match LEFT / psSAR RIGHT, drop redundant Pass/Fail indicator button and pass_rate_label (aed839e) - notebooks/voila.ipynb: radio button grid wrapped in scrollable Box with min_height=400px + flex 1 1 auto; left column stretches to match right column height (264c2d6) Cites: C1, V5 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.4: restore 5 noise_floor lines dropped in 84ae861 merge The jgo/m6-results-table branch merge silently lost the entire noise_floor feature wiring while keeping only the widget instantiation and observe() call: - def _on_noise_floor_change: AttributeError at UI init (caught by §V3 smoke test) - self.noise_floor.value in _run_key: cache not invalidated on floor change - noise_floor read + set in restore_state: value lost across page reloads - flex_item(self.noise_floor) in top_row: widget never visible in the UI Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T4: boxed log widget from 86d7889 (6.3-noise-floor) - OutputWidgetHandler replaced with HTML-rendering approach: bounded _lines list (max 200), single display_data output that replaces itself on each emit() — thread-safe, height capped at 300px - Add _MAX_LOG_LINES = 200 constant - Remove clear_logs() (unused); simplify show_logs() - All py3.9 compatible (list[str] annotation valid since 3.9) Cites: C1, V5 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T5: fix window_mm auto-center; all tests pass Auto-centering now only fires when window_mm is at its DEFAULT value. 
An explicit window_mm in PlottingConfig is preserved unchanged. This restores the test_complete_workflow_passes_shared_plotting_config assertion (window_mm == user-supplied value). 49 tests pass, 26 skipped (measurement-validation artifacts). ty check: All checks passed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * 6.6 - Explicitly show issues to User (#20) * [FEAT] Task 6.6 - ValidationIssue channel: MASK_TOO_SMALL + CSV_FORMAT_ERROR emit sites - Add `ValidationIssue` dataclass (severity/code/message/details) to errors.py - `WorkflowExecutionError` now carries an optional `.issue` for fatal errors - `WorkflowResult` gains `issues: list[ValidationIssue]` field - `_complete_workflow` emits MASK_TOO_SMALL warning issue instead of just logging - `CsvFormatError` is caught specifically and wrapped as CSV_FORMAT_ERROR issue - `workflow_cli.py` error JSON payload includes issue dict when present - `voila.ipynb` WorkflowResults model gets issues/mask fields; success banner checks result.issues and shows warning/error banners for non-fatal issues - Two new tests covering MASK_TOO_SMALL and CSV_FORMAT_ERROR issue codes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] Fix banner error path and add MASK_TOO_SMALL E2E test - Fix run_sarsample to always read stdout (CLI emits JSON there on both success and error, not stderr) - Guard against missing 'result' key when workflow status is 'error'; extract curated message from structured issue when available - Add 'Warning:' as a terminal state in _wait_for_workflow_cycle - Add test_mask_too_small_shows_warning_banner: generates a tiny 15 mm Gaussian CSV inline, uploads it, and asserts the MASK_TOO_SMALL warning banner appears Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.1+§B.2 + §V.1+§V.2: empty fixed mask crashes ITK B1/V1: when noise_floor ≥ measured peak, the registration fixed mask is all-zero. 
SimpleITK crashes with "VirtualSampledPointSet must have 1 or more points" — a raw ITK traceback in the Voila banner. Fix: guard in _complete_workflow after make_metric_masks(); raises ValidationIssue(EMPTY_MEASURED_MASK) with actionable message before registration is attempted. Add measured_raw_peak attribute to SARImageLoader so the guard can report the actual peak value. B2/V2: the generic `except Exception` handler in _complete_workflow re-wrapped any WorkflowExecutionError raised inside the try block, discarding the structured .issue payload. Fix: add `except WorkflowExecutionError: raise` as first handler. Add SPEC.md with §I/§V/§B sections. Add test test_complete_workflow_v1_empty_measured_mask_raises_issue. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.3 + §V.3: MASK_TOO_SMALL pre-registration check missing V3: measured_mask_u8 after make_metric_masks() now checked against min_inscribed_square_mm before registration runs. Fires independently of (and in addition to) the existing post-registration check on evaluator.evaluation_mask. Test updated: 1000 mm threshold now expects 2 issues (both checkpoints fire). New test_complete_workflow_v3_pre_registration_mask_too_small uses a large grid with a narrow Gaussian (σ=4 mm, noise_floor=0.05) to drive the pre-registration check with a realistic 22 mm threshold. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.4 + §V.3 correction: MASK_TOO_SMALL must be a hard error Both pre-registration (measured_mask_u8) and post-registration (evaluator.evaluation_mask) checks now raise WorkflowExecutionError with severity="error" instead of appending a warning to issues. Workflow stops at the first failing check; the 22 mm inscribed-square rule is a hard validity gate, not an advisory. Tests updated to use pytest.raises(WorkflowExecutionError). E2E test updated: banner is now "Error:" not "Warning:". 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * Results Table (#14) * [FEAT] Vectorize gamma, add --output-dir, lock deps, add lint+type CI job - Vectorize _gamma_2d_peak_normalized: replace per-offset Python loop with (N, H, W) NumPy broadcast + np.min, eliminating repeated element-wise iterations while keeping identical numerical output - Add --output-dir CLI flag: writes results.json, gamma_map.npy, gamma_map.png, and failure_map.png for pipeline/batch use - Commit uv.lock and switch CI syncs to --frozen for reproducible builds - Add .github/dependabot.yml with weekly pip + github-actions groups - Add parallel CI job "Lint & type check" running ruff check (blocking) and ty check (non-blocking) against src/; add [tool.ty] to pyproject.toml * [FEAT] make ty pass * [CI] Divide tests into standard (fast), slow, and/or validation. Git LFS pull for the latest. 
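The "(N, H, W) NumPy broadcast + np.min" idea named in the vectorization commit above can be illustrated with a small sketch. It assumes a peak-normalized gamma of the usual form (dose difference scaled by a fraction of the reference peak, distance scaled by a distance-to-agreement). The function name, parameter names, default values, and edge-padding choice are all assumptions for illustration and are not taken from `_gamma_2d_peak_normalized` itself.

```python
import numpy as np


def gamma_2d_vectorized(measured, reference, pixel_mm=1.0,
                        dta_mm=2.0, dose_frac=0.1, radius_px=2):
    """Sketch of a peak-normalized 2D gamma map via an (N, H, W) stack."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    peak = float(reference.max())
    h, w = reference.shape

    # All integer (dy, dx) offsets inside a square search window (N of them).
    offsets = [(dy, dx)
               for dy in range(-radius_px, radius_px + 1)
               for dx in range(-radius_px, radius_px + 1)]

    # Edge-pad once so every shifted view stays in bounds (no wrap-around).
    padded = np.pad(reference, radius_px, mode="edge")
    shifted = np.stack([
        padded[radius_px + dy:radius_px + dy + h,
               radius_px + dx:radius_px + dx + w]
        for dy, dx in offsets
    ])  # shape (N, H, W)

    # Physical distance of each offset, shape (N,).
    dist_mm = np.array([np.hypot(dy, dx) * pixel_mm for dy, dx in offsets])

    # Broadcast: (N, H, W) dose term plus (N, 1, 1) distance term.
    dose_term = (shifted - measured[None, :, :]) / (dose_frac * peak)
    dist_term = (dist_mm / dta_mm)[:, None, None]
    gamma_sq = dose_term ** 2 + dist_term ** 2

    # Minimum over the offset axis replaces the per-offset Python loop.
    return np.sqrt(np.min(gamma_sq, axis=0))
```

The only remaining Python-level iteration builds the stack of shifted views; the per-pixel arithmetic and the reduction run inside NumPy. Memory grows with the number of offsets, so a large search radius on a large grid may need chunking.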
* [CI] further trying to fix CI * [CI] move test which needs LFS to "validation" set; include slow test artifact uploading * [CI] move LFS pull after checkout * [CI] try fixing Git LFS pulling * [CI] safe installation of Git LFS * [CI] further try fix GitLFS in CI * [CI] try fixing Git LFS pull yet again * [CI] add Coverage to validation tests + proper skip in voila tests * [CHORE] update gitignore * [FEAT] fix CI typo * [CI] pull example data also in Voila E2E tests * [CI] minor CI edits * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test All 12 voila e2e tests now pass (was 7 failing after table port). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] port two-table results layout from jgo/feedback-changes-clean (M6 T5) Replace single-row sSAR table with two colour-coded tables: - Table 1 (psSAR): Result badge, Measured@power, Measured@30dBm, Reference@30dBm, Scaling Error [%], Criteria [%] - Table 2 (pattern match): Result badge, Pass rate [%], Criteria Add _TH/_TD style constants; replace ResultTableRow/Column enums. Pass badge = #0090D0 (blue), Fail badge = #9B2423 (red). 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] update e2e testing dependencies * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test All 12 voila e2e tests now pass (was 7 failing after table port). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "[TEST] update e2e assertions for two-table results layout" This reverts commit cad825cf8b9393cba70048b6e0d207abb71eeb81. * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test * Fix table headers to match test expectations: add comma after Reference/Measured * Fix table headers to match test expectations: add comma after Reference/Measured * [CI] trying to fix voila E2E * Stabilize Voila E2E workflow-cycle wait on CI * [CI] Git LFS pull the validation database file * [CI] identifying right database sample * [CI] another fix to e2e voila * [CHORE] remove extra configs of ty tool * [TEST] update measurement-validation test artifacts to match the registration direction change * [CI] increase timeout voila tests * [FIX] Fix NameError in _update_analytical_results - Fixed undefined 'pass_rate' variable on line 1013: Changed {pass_rate:.1f} to 
{sarsample_results.pass_rate_percent:.1f} - Added missing result_cell() local function definition - Removed dead code block with undefined variables (result_badge, values, table_html) This fixes the E2E test failures that were introduced after merging main-melanie branch. The bug was caused by incomplete refactoring during merge conflict resolution. * [FEAT] fix widget notation issue that did not allow voila to start * [FEAT] add debugging tooling --------- Co-authored-by: Javier Garcia Ordonez <ordonez@zmt.swiss> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * T5: fix test regressions — hatchling build backend + remove power fast-path Switch pyproject.toml from setuptools to hatchling: avoids the root-owned build/ directory that blocked uvx wheel builds (test_cli_via_uvx_like_frontend). Remove the power-level-only fast-path from handle_button_click: the fast-path skipped button cycling, which broke test_same_session_rerun_updates_results_after_power_change (the E2E suite uses button disable→enable as the only reliable cycle signal). All make ci stages now pass: lint, typecheck, fast, slow+validation, E2E (17/17). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T5: fast-path for power-level-only change (SPEC V6) + E2E test update Re-add the power-level fast-path to handle_button_click: when only power_level changes (same measured file hash, reference path, noise_floor) and a prior WorkflowResult is cached, skip registration and rescale psSAR immediately with a "Power level updated" banner (no button cycle). Update test_same_session_rerun_updates_results_after_power_change to detect fast-path completion via the unique banner text instead of button cycling; banner is cleared at every click start so cannot be a false positive from stale DOM. Cited in SPEC as V6; V6→V7 renumber for the artifact-regeneration invariant. All make ci stages pass: lint, typecheck, fast, slow+validation, E2E (17/17). 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.5-7 + §V.4: nth() locators + measurement_area_row dropped in merges §B5/V4: _set_meas_area and two upper-bound tests used positional nth(1/2) locators; adding noise_floor to top_row (§B4 fix) shifted DOM order causing inputs to resolve to the wrong widget. All measurement-area inputs now use label-anchored .widget-text selectors. §B6: test_workflow_produces_square_plots unpacked voila_server as 2-tuple but fixture yields 3-tuple; fixed to _, workspace_root, _. §B7: 84ae861 merge dropped measurement_area_row from left_setup_section in create_ui(); x/y widgets were defined but never added to the DOM so Playwright locators timed out. Restored the row. All 25 E2E tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T7-T9: Add multi-band measurements, expanded test suite, HTML report generator T7: Add 130 measurement CSVs for 900/1950/5GHz bands (LFS-tracked) from prior measurement campaign. 9 dipole_2450MHz files remain as baseline. T8: Expand test_measurement_validation.py to auto-discover all measurements, group by frequency/power, generate case IDs with frequency+power in name. Adds BASELINE_CASES, ROBUSTNESS_CASES, DISCOVERED_CASES; dynamically creates per-group test functions. T9: Add generate_measurement_validation_report_html.py with filterable HTML dashboard (combined pass/fail verdict: gamma + scaling error thresholds). Add tests/test_measurement_validation_report.py to verify dashboard logic. Artifacts require regeneration with REGENERATE_MEASUREMENT_VALIDATION_ARTIFACTS=1. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] Add 36 D2450 HSL measurements; no-colorbar flag for test artifacts; SPEC §MV + V10 + C7 - Add `PlottingConfig.save_colorbars` flag (default True); gate all three colorbar-save sites in plotting.py behind it - Pass `PlottingConfig(save_colorbars=False)` in test _compute_case so regen plots are compact (no separate colorbar PNGs) - Stage 36 new D2450_Flat HSL power-sweep CSVs (0–17 dBm, 1g/10g) - SPEC: add §MV measurement-validation overview, C7 adaptive noise floor (planned), V10 (planned), T12, flip T7-T9 → x, T10 → ~ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] T10/T12: Regenerate artifacts; adaptive noise floor; V11; LFS for npz Regen HEAD: $HEAD REGENERATE_MEASUREMENT_VALIDATION_ARTIFACTS=1 SAVE_MEASUREMENT_VALIDATION_PLOTS=1 Result: 112 passed, 54 failed (plots on disk, not committed per .gitignore) Adaptive noise floor (C7/V10): LOW_POWER_THRESHOLD_DBM=9 0.01 W/kg for power_level_dbm ≤ 9, else 0.05 W/kg V11: 100% gamma pass rate is the hard criterion (zero failed pixels) Remaining 54 genuine gamma failures (plots inspectable on disk): 900MHz 10dBm: 19, 5GHz 1dBm: 15, 5GHz 10dBm: 12, 1950MHz 10dBm: 2, 2450MHz 10g 0-2dBm: 3, robustness: 3 - .gitattributes: add *.npz → LFS - .gitignore: PNGs stay excluded; add log/ exception for debug logs - 110 passing artifact npz (LFS) + 110 metrics.json committed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] T11: Generate measurement validation HTML dashboard (110 passing cases) - Generate per-band HTML reports (2450mhz, 1950mhz, 5800mhz, 900mhz) - Generate combined HTML report (all bands) - Generate summary dashboard with band-level stats - Delete stale top-level 2450_10mm_1g_*_metrics.json artifacts (pre-new-format) - SPEC §MV: measurement validation overview section - SPEC V11: 100% gamma pass rate is the only pass criterion Artifacts generated under main-melanie HEAD 8e54b78. 
Pass criterion: failed_pixel_count == 0 (V11). 110 passing / 54 genuine failures. Combined verdict: gamma_pass_rate == 100% AND |scaling_error| ≤ 10%. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Jgo/UI adjustments (#22) * [FEAT] Vectorize gamma, add --output-dir, lock deps, add lint+type CI job - Vectorize _gamma_2d_peak_normalized: replace per-offset Python loop with (N, H, W) NumPy broadcast + np.min, eliminating repeated element-wise iterations while keeping identical numerical output - Add --output-dir CLI flag: writes results.json, gamma_map.npy, gamma_map.png, and failure_map.png for pipeline/batch use - Commit uv.lock and switch CI syncs to --frozen for reproducible builds - Add .github/dependabot.yml with weekly pip + github-actions groups - Add parallel CI job "Lint & type check" running ruff check (blocking) and ty check (non-blocking) against src/; add [tool.ty] to pyproject.toml * [FEAT] make ty pass * [CI] Divide tests into standard (fast), slow, and/or validation. Git LFS pull for the latest. 
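The adaptive noise floor (C7/V10) and the combined verdict (V11) stated in the T10/T12 and T11 commit messages reduce to two small rules, sketched below. The threshold and floor values, the `failed_pixel_count == 0` criterion, and the 10% scaling-error bound are taken from those messages. The function names and signatures are hypothetical, and the sketch assumes the scaling error is expressed in percent.

```python
LOW_POWER_THRESHOLD_DBM = 9.0  # value quoted in the T10/T12 commit message


def adaptive_noise_floor_wkg(power_level_dbm: float) -> float:
    """Return the noise floor in W/kg for a given power level.

    0.01 W/kg at or below the low-power threshold, otherwise 0.05 W/kg.
    """
    if power_level_dbm <= LOW_POWER_THRESHOLD_DBM:
        return 0.01
    return 0.05


def combined_verdict(failed_pixel_count: int,
                     scaling_error_percent: float,
                     max_abs_scaling_error_percent: float = 10.0) -> bool:
    """Pass only if no gamma pixel failed AND |scaling error| is within bounds."""
    gamma_ok = failed_pixel_count == 0  # 100% gamma pass rate (V11)
    scaling_ok = abs(scaling_error_percent) <= max_abs_scaling_error_percent
    return gamma_ok and scaling_ok
```

Expressing the gamma criterion as a zero failed-pixel count rather than a floating-point pass rate avoids rounding ambiguity near 100%.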
* [CI] further trying to fix CI * [CI] move test which needs LFS to "validation" set; include slow test artifact uploading * [CI] move LFS pull after checkout * [CI] try fixing Git LFS pulling * [CI] safe installation of Git LFS * [CI] further try fix GitLFS in CI * [CI] try fixing Git LFS pull yet again * [CI] add Coverage to validation tests + proper skip in voila tests * [CHORE] update gitignore * [FEAT] fix CI typo * [CI] pull example data also in Voila E2E tests * [CI] minor CI edits * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test All 12 voila e2e tests now pass (was 7 failing after table port). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] port two-table results layout from jgo/feedback-changes-clean (M6 T5) Replace single-row sSAR table with two colour-coded tables: - Table 1 (psSAR): Result badge, Measured@power, Measured@30dBm, Reference@30dBm, Scaling Error [%], Criteria [%] - Table 2 (pattern match): Result badge, Pass rate [%], Criteria Add _TH/_TD style constants; replace ResultTableRow/Column enums. Pass badge = #0090D0 (blue), Fail badge = #9B2423 (red). 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] update e2e testing dependencies * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test All 12 voila e2e tests now pass (was 7 failing after table port). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "[TEST] update e2e assertions for two-table results layout" This reverts commit cad825cf8b9393cba70048b6e0d207abb71eeb81. * [TEST] update e2e assertions for two-table results layout - Update _extract_pssar_row_values regex: anchor on 'Reference, 30 dBm' header, skip result badge + measured@power cells, extract measured@30dBm / reference@30dBm / scaling_error from new column order - Replace all 'Reference 30 dBm' string checks with 'Reference, 30 dBm' (Python asserts, JS wait_for_function, and DOM wait condition) - Replace 'sSAR' absence check with 'psSAR' in upload-clears-results test * Fix table headers to match test expectations: add comma after Reference/Measured * Fix table headers to match test expectations: add comma after Reference/Measured * [CI] trying to fix voila E2E * Stabilize Voila E2E workflow-cycle wait on CI * [CI] Git LFS pull the validation database file * [CI] identifying right database sample * [CI] another fix to e2e voila * [CHORE] remove extra configs of ty tool * [TEST] update measurement-validation test artifacts to match the registration direction change * [CI] increase timeout voila tests * [FIX] Fix NameError in _update_analytical_results - Fixed undefined 'pass_rate' variable on line 1013: Changed {pass_rate:.1f} to 
{sarsample_results.pass_rate_percent:.1f} - Added missing result_cell() local function definition - Removed dead code block with undefined variables (result_badge, values, table_html) This fixes the E2E test failures that were introduced after merging main-melanie branch. The bug was caused by incomplete refactoring during merge conflict resolution. * [FEAT] Task 6.6 - ValidationIssue channel: MASK_TOO_SMALL + CSV_FORMAT_ERROR emit sites - Add `ValidationIssue` dataclass (severity/code/message/details) to errors.py - `WorkflowExecutionError` now carries an optional `.issue` for fatal errors - `WorkflowResult` gains `issues: list[ValidationIssue]` field - `_complete_workflow` emits MASK_TOO_SMALL warning issue instead of just logging - `CsvFormatError` is caught specifically and wrapped as CSV_FORMAT_ERROR issue - `workflow_cli.py` error JSON payload includes issue dict when present - `voila.ipynb` WorkflowResults model gets issues/mask fields; success banner checks result.issues and shows warning/error banners for non-fatal issues - Two new tests covering MASK_TOO_SMALL and CSV_FORMAT_ERROR issue codes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] Fix banner error path and add MASK_TOO_SMALL E2E test - Fix run_sarsample to always read stdout (CLI emits JSON there on both success and error, not stderr) - Guard against missing 'result' key when workflow status is 'error'; extract curated message from structured issue when available - Add 'Warning:' as a terminal state in _wait_for_workflow_cycle - Add test_mask_too_small_shows_warning_banner: generates a tiny 15 mm Gaussian CSV inline, uploads it, and asserts the MASK_TOO_SMALL warning banner appears Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.1+§B.2 + §V.1+§V.2: empty fixed mask crashes ITK B1/V1: when noise_floor ≥ measured peak, the registration fixed mask is all-zero. 
SimpleITK crashes with "VirtualSampledPointSet must have 1 or more points" — a raw ITK traceback in the Voila banner. Fix: guard in _complete_workflow after make_metric_masks(); raises ValidationIssue(EMPTY_MEASURED_MASK) with actionable message before registration is attempted. Add measured_raw_peak attribute to SARImageLoader so the guard can report the actual peak value. B2/V2: the generic `except Exception` handler in _complete_workflow re-wrapped any WorkflowExecutionError raised inside the try block, discarding the structured .issue payload. Fix: add `except WorkflowExecutionError: raise` as first handler. Add SPEC.md with §I/§V/§B sections. Add test test_complete_workflow_v1_empty_measured_mask_raises_issue. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] fix widget notation issue that did not allow voila to start * [SPEC] add §M merge log for squash-merge tracking Records every branch merged into main-melanie with tip commit hash, date, and content summary. Needed because squash-merges rewrite tip hashes, making the original branch tip the only reliable provenance anchor. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] add debugging tooling * backprop §B.3 + §V.3: MASK_TOO_SMALL pre-registration check missing V3: measured_mask_u8 after make_metric_masks() now checked against min_inscribed_square_mm before registration runs. Fires independently of (and in addition to) the existing post-registration check on evaluator.evaluation_mask. Test updated: 1000 mm threshold now expects 2 issues (both checkpoints fire). New test_complete_workflow_v3_pre_registration_mask_too_small uses a large grid with a narrow Gaussian (σ=4 mm, noise_floor=0.05) to drive the pre-registration check with a realistic 22 mm threshold. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * backprop §B.4 + §V.3 correction: MASK_TOO_SMALL must be a hard error Both pre-registration (measured_mask_u8) and post-registration (evaluator.evaluation_mask) checks now raise WorkflowExecutionError with severity="error" instead of appending a warning to issues. Workflow stops at the first failing check; the 22 mm inscribed-square rule is a hard validity gate, not an advisory. Tests updated to use pytest.raises(WorkflowExecutionError). E2E test updated: banner is now "Error:" not "Warning:". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] Gamma eval mask excludes sub-cutoff (noise-filtered) pixels * [FEAT] Regenerate measurement validation artifacts after noise-mask fix (Task 6.4) Evaluated pixel count drops ~65% across all 9 cases (sub-cutoff pixels correctly excluded). Pass rate remains 100% on all cases. Cites V3. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T1: Create jgo/ui-adjustments branch from main-melanie HEAD Branch point: 228a89f (post jgo/6.6 merge). Ready for T2 cherry-picks. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [FEAT] adaptive voila port * [FEAT] plotting: rename 'Simulated' to 'Reference' in registration plot title Aligns the third top-row plot title with the rest of the UI (which already uses 'Reference' rather than 'Simulated' for the same data). * T2: Port plotting overlays from develop (cropped-area + noise-floor) - workflow_config.py: DEFAULT_PLOT_{NOT_EVALUATED,CROPPED_DATA,NOISE_FLOOR}_COLOR constants; PlottingConfig adds not_evaluated_color, cropped_data_color, noise_floor_color, measurement_area_x/y_mm, center_x/y_mm fields. 
- plotting.py: replace with develop final — adds _overlay_measurement_limit_mask, _compute_cropped_data_mask, _overlay_cropped_measurement_data, _overlay_noise_floor, _apply_overlay_legend; updates show_registration_overlay / plot_loaded_images / plot_sar_image / plot_gamma_results to accept noise_floor_mask + support_mask. - image_loader.py: cache _measured/_reference_noise_floor_mask in make_metric_masks(); thread support_mask + noise_floor_mask through plot_loaded / plot_aligned; add reference_plotting_config (centre=0,0) override; import dataclasses.replace. - gamma_eval.py: add noise_floor_mask param to show(). - workflows.py: pass loader._measured_noise_floor_mask to show_registration_overlay and evaluator.show(). Cites: C2, V5. 49 tests pass, ty clean on src/. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T3: notebook layout + center plots (d774c11, aed839e, 264c2d6) - workflows.py: center plot window on measured data centroid; uses PlottingConfig.measurement_area_{x,y}_mm if set, else 10%-padded span - workflows.py: fast-path rescale when only power level changes - notebooks/voila.ipynb: result table below images (d774c11) - notebooks/voila.ipynb: inline feedback banner next to run button, SAR pattern match LEFT / psSAR RIGHT, drop redundant Pass/Fail indicator button and pass_rate_label (aed839e) - notebooks/voila.ipynb: radio button grid wrapped in scrollable Box with min_height=400px + flex 1 1 auto; left column stretches to match right column height (264c2d6) Cites: C1, V5 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T4: boxed log widget from 86d7889 (6.3-noise-floor) - OutputWidgetHandler replaced with HTML-rendering approach: bounded _lines list (max 200), single display_data output that replaces itself on each emit() — thread-safe, height capped at 300px - Add _MAX_LOG_LINES = 200 constant - Remove clear_logs() (unused); simplify show_logs() - All py3.9 compatible (list[str] annotation valid since 3.9) Cites: C1, V5 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T5: fix window_mm auto-center; all tests pass Auto-centering now only fires when window_mm is at its DEFAULT value. An explicit window_mm in PlottingConfig is preserved unchanged. This restores the test_complete_workflow_passes_shared_plotting_config assertion (window_mm == user-supplied value). 49 tests pass, 26 skipped (measurement-validation artifacts). ty check: All checks passed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T5: fix test regressions — hatchling build backend + remove power fast-path Switch pyproject.toml from setuptools to hatchling: avoids the root-owned build/ directory that blocked uvx wheel builds (test_cli_via_uvx_like_frontend). Remove the power-level-only fast-path from handle_button_click: the fast-path skipped button cycling, which broke test_same_session_rerun_updates_results_after_power_change (the E2E suite uses button disable→enable as the only reliable cycle signal). All make ci stages now pass: lint, typecheck, fast, slow+validation, E2E (17/17). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * T5: fast-path for power-level-only change (SPEC V6) + E2E test update Re-add the power-level fast-path to handle_button_click: when only power_level changes (same measured file hash, reference path, noise_floor) and a prior WorkflowResult is cached, skip registration and rescale psSAR immediately with a "Power level updated" banner (no button cycle). Update test_same_session_rerun_updates_results_after_power_change to detect fast-path completion via the unique banner text instead of button cycling; banner is cleared at every click start so cannot be a false positive from stale DOM. Cited in SPEC as V6; V6→V7 renumber for the artifact-regeneration invariant. All make ci s…
No description provided.