Skip to content

[uss_qualifier/reports] Add test artifacts obfuscator#1477

Open
BenjaminPelletier wants to merge 7 commits into
interuss:mainfrom
BenjaminPelletier:report-obfuscator
Open

[uss_qualifier/reports] Add test artifacts obfuscator#1477
BenjaminPelletier wants to merge 7 commits into
interuss:mainfrom
BenjaminPelletier:report-obfuscator

Conversation

@BenjaminPelletier
Copy link
Copy Markdown
Member

Some users may be hesitant to include full test artifacts when reporting issues or requesting help because those artifacts may contain semi-sensitive information. This PR adds a tool that attempts to ease this problem somewhat by obfuscating sensitive information while still retaining nearly-full diagnostic value for the artifacts.

Written mostly with Gemini.

@BenjaminPelletier BenjaminPelletier marked this pull request as ready for review June 4, 2026 04:01
Detailed information on command-line options and toggles can be retrieved using the `--help` flag:

```bash
PYTHONPATH=. uv run --index https://pypi.org/simple python monitoring/uss_qualifier/reports/obfuscate.py --help
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This could be simplified to PYTHONPATH=. uv run monitoring/uss_qualifier/reports/obfuscate.py --help

(python is not needed, --index is environment-specific)

Comment thread monitoring/uss_qualifier/reports/obfuscate.md Outdated
elif k == "manager" and isinstance(v, str):
participant_ids.add(v)
elif isinstance(v, str):
for url in find_urls(v):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reuse scan_text bellow? it do the same loop



def test_find_urls():
text = "Check http://dss1.uss1.localutm/dss/v1, or go to https://github.com/interuss. Also see (http://localhost:8082/status)."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good.

# 2. Obfuscate hostnames
if config.obfuscate_hostnames:
for h, mapped_h in hostname_map.items():
s = re.sub(rf"\b{re.escape(h)}\b", mapped_h, s)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That an edge case, but if we have url in uppercase for some reason, urlparse normalize it to lowercase and it won't be replaced there.

 urlparse("http://TEST").hostname
'test'
Suggested change
s = re.sub(rf"\b{re.escape(h)}\b", mapped_h, s)
s = re.sub(rf"\b{re.escape(h)}\b", mapped_h, s, flags=re.IGNORECASE)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added failing test and fixed; nice esoteric catch and I learned something new


# 1. Obfuscate tokens
if config.obfuscate_tokens:
s = re.sub(r"(?i)\bBearer\s+\S+", "Bearer REDACTED", s)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s = re.sub(r"(?i)\bBearer\s+\S+", "Bearer REDACTED", s)
s = re.sub(r"(?i)\bBearer\s+\S+", "Bearer REDACTED", s, flags=re.IGNORECASE)

To handle cases like bEarER that may be valid

# Generate maps
participant_map = {}
for idx, pid in enumerate(
sorted(sorted(participant_ids), key=len, reverse=True), start=1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the inner sort useful?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; that's where length-equal keys are alphabetized

scan_json(v, participant_ids, hostnames)
elif isinstance(obj, list):
for item in obj:
scan_json(item, participant_ids, hostnames)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should scan_json do something on strings ?

There is an isinstance(v, str) in dict branch, but not in list, meaning str in dict and list are not handled the same way

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That isinstance is checking whether the dict contains {"Authorization": "Bearer xxx"}. A list cannot contain this key-value pair.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That the other loop, that one is collecting participants and urls

And it will only collect urls that are single items of dict.

With a file like that (tested with test_obfuscate_directory):

        report_data = {
            "participant_id": "uss1_core",
            "participants": ["uss1_core", "uss2_core"],
            "url": "http://dss1.uss3.localutm/dss/v1",
            "urls": ["http://dss1.uss4.localutm/dss/v1"],
            "report": {
                "queries": [
                    {
                        "request": {
                            "url": "http://dss1.uss1.localutm/dss/v1",
                            "urls": ["http://dss1.uss2.localutm/dss/v1"],
                            "headers": {"Authorization": "Bearer mytoken"},
                        }
                    }
                ]
            },
        }

dss1.uss2.localutm and dss1.uss4.localutm won't be detected.

I don't know if it's a possibility (and I would guess in a report the probably to see the hostname in a simple location is high), but that probably better to handle this case no?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants