Skip to content

fix(postprocessing): correct boolean parsing in MemoryFileIterator CSV reader#417

Open
awanawana wants to merge 1 commit into
Toni-SM:developfrom
awanawana:fix/csv-bool-parsing-and-postprocessing-tests
Open

fix(postprocessing): correct boolean parsing in MemoryFileIterator CSV reader#417
awanawana wants to merge 1 commit into
Toni-SM:developfrom
awanawana:fix/csv-bool-parsing-and-postprocessing-tests

Conversation

@awanawana

Copy link
Copy Markdown

Summary

This PR fixes a real bug in MemoryFileIterator._format_csv() and adds the first test coverage for the skrl.utils.postprocessing module.

Bug Fix

File: skrl/utils/postprocessing.py

In Python, bool("False") evaluates to True because any non-empty string is truthy. This caused the CSV memory reader to silently corrupt boolean data — every "False" value in an exported memory CSV was read back as True.

# Before (buggy)
float(item) if item not in ["True", "False"] else bool(item)
# bool("False") == True  ← silent data corruption

# After (fixed)
float(item) if item not in ["True", "False"] else item == "True"
# item == "True" correctly maps "True" → True, "False" → False

This affects any user who exports a memory with boolean tensors (e.g. terminated, truncated) to CSV format and then reads it back with MemoryFileIterator.

Tests Added

File: tests/utils/test_postprocessing.py (new — 17 tests)

The skrl.utils.postprocessing module previously had zero test coverage. This PR adds comprehensive tests for MemoryFileIterator covering:

  • NumPy (.npz) format: single file, multiple files (sorted order), multiple tensors
  • PyTorch (.pt) format: single file, multiple files (sorted order)
  • CSV format: float values, boolean True/False parsing (including regression tests for the bug), mixed booleans, invalid header, empty data rows
  • Edge cases: empty glob, unsupported file extension, iterator exhaustion, for-loop usage, mixed-format directories

All 17 tests pass after the fix. The two boolean regression tests fail against the original code, confirming the bug.

…V reader

In Python, bool("False") evaluates to True because any non-empty string
is truthy. This caused MemoryFileIterator._format_csv() to incorrectly
parse "False" values as True when reading exported memory CSV files.

Fix: replace bool(item) with item == "True" so that only the string
"True" maps to True and "False" maps to False.

Also add tests/utils/test_postprocessing.py with 17 tests covering:
- MemoryFileIterator with NumPy (.npz), PyTorch (.pt), and CSV formats
- Boolean True/False parsing (including the bug regression tests)
- Empty glob, unsupported format, iterator protocol, and edge cases
- The postprocessing module previously had zero test coverage
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant