refactor(iast): prevent out-of-bounds read in slice aspect with negative step#18129
Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 2 commits intoMay 19, 2026
Conversation
BenchmarksBenchmark execution time: 2026-05-18 15:21:27 Comparing candidate commit 12c71cd in PR branch Found 0 performance improvements and 1 performance regressions! Performance is the same for 79 metrics, 0 unstable metrics. scenario:iastaspectsospath-ospathbasename_aspect
|
63fa06a to
36f686a
Compare
Codeowners resolved as |
avara1986
approved these changes
May 19, 2026
Member
avara1986
left a comment
There was a problem hiding this comment.
LGTM, thanks a lot for your findings and for improving the code 😉
Member
|
/merge |
|
View all feedbacks in Devflow UI.
The expected merge time in
|
vlad-scherbich
pushed a commit
that referenced
this pull request
May 20, 2026
…ive step (#18129) ## Description `build_index_range_map` executes a loop: ```cpp for (auto i = start_int; i < stop_int; i += step_int) { index_range_map_result.emplace_back(index_range_map[i]); } ``` ... where `start_int`, `stop_int`, and `step_int` are all signed longs read directly from caller-controlled `PyLong` objects (via `PyLong_AsLong`) with no normalization through `PySlice_AdjustIndices`. When the application code uses a negative-step slice on a tainted string (e.g. `tainted[::-1]`), the AST visitor at `ddtrace/appsec/_iast/_ast/visitor.py` rewrites the expression to `slice_aspect(s, None, None, -1)`. Inside `build_index_range_map`, `start_int=0`, `stop_int=length_text`, `step_int=-1`. The first iteration uses `i=0` (valid); the second iteration uses `i=-1`, then `-2`, etc. Each `index_range_map[i]` call invokes `std::vector<TaintRangePtr>::operator[](size_t)`, which converts the negative long to `(size_t)-1 = SIZE_MAX`, reading out of bounds. The result is then `emplace_back`'ed into the result vector and used to build a `shared_ptr<TaintRange>` by dereferencing the obtained pointer. Reproducer that crashes pre-fix/ <details> ```py #!/usr/bin/env python3 """ Reproducer for: out-of-bounds read in slice aspect with negative step. With the buggy code, build_index_range_map looped: for (auto i = start_int; i < stop_int; i += step_int) With step=-1: i = 0, -1, -2, ... all satisfy i < stop_int, so the loop never terminates and reads past the beginning of the vector (UB / crash). Run with: DD_IAST_ENABLED=1 python repro.py """ import os os.environ.setdefault("DD_IAST_ENABLED", "1") from ddtrace.appsec._iast._taint_tracking import OriginType from ddtrace.appsec._iast._taint_tracking import initialize_native_state from ddtrace.appsec._iast._taint_tracking._taint_objects import taint_pyobject from ddtrace.appsec._iast._taint_tracking._taint_objects_base import get_tainted_ranges from tests.appsec.iast.iast_utils import ( _end_iast_context_and_oce, _iast_patched_module, _start_iast_context_and_oce, ) # initialize_native_state allocates the C++ TaintEngineContext singleton; # without it start_request_context() always returns None. initialize_native_state() # The module must be loaded patched so that slicing goes through the aspect. mod = _iast_patched_module("benchmarks.bm.iast_fixtures.str_methods") def main() -> None: _start_iast_context_and_oce() try: s: str = taint_pyobject( pyobject="hello", source_name="input", source_value="hello", source_origin=OriginType.PARAMETER, ) # All three cases use a negative step → triggered the OOB read. # do_slice(text, first, second, third) maps to text[first:second:third]; # passing None uses the default (same as omitting the index). cases: list[tuple[str, object, object, int]] = [ ("s[::-1]", None, None, -1), ("s[::-2]", None, None, -2), ("s[4:0:-1]", 4, 0, -1), ] for label, first, second, step in cases: print(f"trying {label} ...", flush=True) result = mod.do_slice(s, first, second, step) ranges = get_tainted_ranges(result) print(f" result={result!r}, tainted_ranges={ranges}") assert len(ranges) == 1, f"expected 1 tainted range, got {ranges}" print("OK - all slices returned correct tainted ranges") finally: _end_iast_context_and_oce() if __name__ == "__main__": main() ``` </details> Co-authored-by: alberto.vara <alberto.vara@datadoghq.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
build_index_range_mapexecutes a loop:... where
start_int,stop_int, andstep_intare all signed longs read directly from caller-controlledPyLongobjects (viaPyLong_AsLong) with no normalization throughPySlice_AdjustIndices. When the application code uses a negative-step slice on a tainted string (e.g.tainted[::-1]), the AST visitor atddtrace/appsec/_iast/_ast/visitor.pyrewrites the expression toslice_aspect(s, None, None, -1).Inside
build_index_range_map,start_int=0,stop_int=length_text,step_int=-1. The first iteration usesi=0(valid); the second iteration usesi=-1, then-2, etc.Each
index_range_map[i]call invokesstd::vector<TaintRangePtr>::operator[](size_t), which converts the negative long to(size_t)-1 = SIZE_MAX, reading out of bounds. The result is thenemplace_back'ed into the result vector and used to build ashared_ptr<TaintRange>by dereferencing the obtained pointer.Reproducer that crashes pre-fix/
Details