feat: perry install — secure wrapper around bun/npm with offline malware scan #738

Open: proggeramlug wants to merge 1 commit into main from perry-install
Summary

perry install wraps bun install --ignore-scripts (or npm install --ignore-scripts as a fallback) so that no package code executes during install. It then scans the inert node_modules/ with bundled offline rules and refuses to proceed on any P0 hit. Lifecycle scripts run only for packages on a curated trust allowlist (~40 well-known native-binding packages: esbuild, sharp, the prisma family, swc, biome, lightningcss, @next/swc-*, @esbuild/*, @napi-rs/*, etc.) or for packages the user explicitly opts in via package.json -> perry.allowScripts or --run-scripts <pkg>.

Works on any standard npm project — no Perry-specific config required.

Why

After-the-fact scanners (Socket, GuardDog, npm audit) run too late for the postinstall-exfil vector that the 2024–2026 supply-chain wave (Shai-Hulud, SANDWORM_MODE, QIX maintainer compromise) relies on. Placing the gate between extraction and script execution catches the attack before any package code runs.

Architecture

perry install
  ├─ detect installer        (bun preferred, npm fallback)
  ├─ <installer> install --ignore-scripts   (resolver/fetch/extract — battle-tested)
  ├─ scan node_modules/      (4 rule modules, offline, ~ms)
  │    └─ block here if any P0 hit; node_modules/ on disk but inert
  └─ run lifecycle scripts   (only for trust-gate packages)

Scan rules (P0 — block by default)

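| Module | Catches |
| --- | --- |
| scripts.rs | Exfil shapes in lifecycle-script bodies: curl \| sh, eval+atob, credential-file reads (~/.ssh, ~/.aws, ~/.npmrc), *_TOKEN/*_KEY/*_SECRET env reads, Discord / Telegram / OAST channels, bare-IP HTTP hosts |
| patterns.rs | Same exfil shapes + hardcoded webhook URLs in declared entry-point JS (main / module / bin / exports leaves) |
| typosquat.rs | Levenshtein ≤ 2 with length-diff ≤ 1 against ~300 popular npm names (expres / epxress / reactnative flagged; expressjs not) |
| obfuscation.rs | Dropper shape: small (< 50 KB) non-minified entry file with a line longer than 5 K chars, OR a base64 blob of ≥ 1 K chars in any entry file |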
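
The typosquat thresholds above can be sketched as follows. This is a minimal illustration, not the code in typosquat.rs: the function names `levenshtein` and `typosquat_of` are hypothetical, the popular-name list is passed in by the caller, and the 5-character minimum is taken from the commit message. It assumes an exact match is the popular package itself and is therefore never flagged.

```rust
/// Classic Levenshtein edit distance (insert, delete, substitute),
/// computed with a two-row dynamic-programming table.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost).min(prev[j + 1] + 1).min(curr[j] + 1);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Returns the popular name that `candidate` appears to imitate, if any.
/// Thresholds follow the PR text: distance <= 2, length difference <= 1,
/// and both names at least 5 characters long.
fn typosquat_of<'a>(candidate: &str, popular: &[&'a str]) -> Option<&'a str> {
    let clen = candidate.chars().count();
    if clen < 5 {
        return None;
    }
    // Assumption: an exact match is the popular package itself.
    if popular.iter().any(|p| *p == candidate) {
        return None;
    }
    popular.iter().copied().find(|p| {
        let plen = p.chars().count();
        plen >= 5 && clen.abs_diff(plen) <= 1 && levenshtein(candidate, p) <= 2
    })
}
```

With a list containing `express`, this flags `expres` (distance 1) and `epxress` (two substitutions, since plain Levenshtein has no transposition step), while `expressjs` passes because its length differs by 2.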

CLI surface

perry install [pkg...]                  # install from package.json or named packages
perry install --installer=bun|npm       # force backend (auto: bun → npm)
perry install --allow-risky <pkg>       # bypass scan for one package
perry install --allow-risky-all         # CI/emergency bypass
perry install --run-scripts <pkg>       # allow scripts beyond bundled allowlist
perry install --run-scripts-all         # npm-equivalent (unsafe)
perry install --skip-scan               # just wrap installer
perry install --json                    # machine-readable

End-to-end validation

  • Clean install of lodash + chalk → verdict clean, exit 0.
  • --installer=npm fallback → works, npm output passes through.
  • Inject lodass (typosquat of lodash) with malicious postinstall into node_modules/ → exit 1, two P0 findings cited (lifecycle-shell-pipe-exec + typosquat-close-to-popular), no scripts ran.
  • Same with --allow-risky lodass → exit 0, findings annotated overridden, scripts still skipped because not on allowlist.
  • Real-world: project depending on esbuild@^0.21 → bun-install + scan-clean + allowlist runs esbuild's postinstall → 9.4 MB platform binary lands in node_modules/esbuild/bin/esbuild.
  • --skip-scan → installer runs, scanner skipped, no scripts (because we still pass --ignore-scripts to the underlying installer).
  • .perry/install-report.json written on every run with timestamp, package count, findings array (each with overridden: bool), verdict.
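
The block-then-override behaviour in the smokes above can be sketched as below. The types `Finding` and `Verdict` and the functions `apply_overrides` and `exit_code` are hypothetical names chosen for illustration; only the `overridden` flag, the two rule ids, and the exit codes 0 and 1 come from this PR description.

```rust
use std::collections::HashSet;

/// One scanner hit. Field names other than `overridden` are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    package: String,
    rule: String,
    overridden: bool,
}

/// Hypothetical verdict type; the real report may use different labels.
#[derive(Debug, PartialEq)]
enum Verdict {
    Clean,
    Overridden,
    Blocked,
}

/// Marks findings covered by --allow-risky <pkg> or --allow-risky-all
/// and derives the overall verdict from what remains.
fn apply_overrides(
    findings: &mut [Finding],
    allow_risky: &HashSet<String>,
    allow_all: bool,
) -> Verdict {
    for f in findings.iter_mut() {
        f.overridden = allow_all || allow_risky.contains(&f.package);
    }
    if findings.is_empty() {
        Verdict::Clean
    } else if findings.iter().all(|f| f.overridden) {
        Verdict::Overridden
    } else {
        Verdict::Blocked
    }
}

/// Exit 1 only when an un-overridden P0 finding remains.
fn exit_code(verdict: &Verdict) -> i32 {
    match verdict {
        Verdict::Blocked => 1,
        Verdict::Clean | Verdict::Overridden => 0,
    }
}
```

Overriding a finding changes only the verdict here. Whether scripts run is a separate decision made by the trust gate, which is why the `--allow-risky lodass` smoke still skips the package's scripts.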

Test plan

  • 65 unit tests under crates/perry/src/commands/install/ (installer detection, package walker, each scan rule with positive + negative fixtures, allow-risky matcher, report writer, trust gate, real-shell lifecycle execution)
  • Workspace cargo build --release -p perry clean (44 warnings, all pre-existing)
  • End-to-end smokes above all pass against the release binary
  • CI: lint, cargo-test, parity, compile-smoke, api-docs-drift, security-audit

Out of scope (filed as follow-ups)

  • Sandboxed lifecycle execution (macOS sandbox-exec / Linux bubblewrap+seccomp / Windows AppContainer with redacted env + filesystem confinement + network allowlist). v1 runs allowlisted scripts un-sandboxed; v2 will add the jail. This is the strongest reason to ship v2 soon — a compromised allowlisted package's postinstall still runs with full privileges today.
  • Scoped-package typosquat detection.
  • P1 freshness / maintainer-drift / SLSA-provenance checks (CLI flag --check-freshness reserved, not implemented).
  • Private registries / .npmrc auth / yarn / pnpm backends.

Design rationale

The full plan and the alternatives considered (full installer from scratch vs. wrapper) live in the linked design doc.

Adds the `Install` subcommand that wraps `bun install --ignore-scripts`
(or `npm install --ignore-scripts` as fallback) so no package code
executes during install, statically scans `node_modules/` with bundled
offline rules, refuses to proceed on any P0 hit, and only then runs
lifecycle scripts — and only for packages on a curated trust
allowlist or the user's explicit opt-in.

Works on any standard npm project; the only Perry-specific behavior
is reading `package.json -> perry.allowScripts` when present.

## Architecture

```
perry install [pkg...]
  ├─ detect installer            (bun preferred, npm fallback)
  ├─ <installer> install --ignore-scripts   (extract, no scripts run)
  ├─ scan node_modules/          (4 rule modules, offline, ~ms)
  │    └─ block if any P0; node_modules/ on disk but inert
  └─ run lifecycle scripts       (only trust-gate packages)
```
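
The first step of the pipeline (bun preferred, npm fallback, `--installer=` override) could look roughly like this. It is a sketch under assumptions, not the contents of `detect.rs`: `Installer`, `detect_installer`, `on_path`, and `base_args` are hypothetical names, the PATH probe ignores Windows executable extensions, and a forced installer is not checked for presence.

```rust
use std::env;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Installer {
    Bun,
    Npm,
}

/// Picks the backend. `forced` models `--installer=bun|npm`; `probe`
/// answers "is this binary available?" and is injected so the logic
/// can be exercised without touching the real PATH.
fn detect_installer(
    forced: Option<&str>,
    probe: impl Fn(&str) -> bool,
) -> Result<Installer, String> {
    match forced {
        Some("bun") => Ok(Installer::Bun),
        Some("npm") => Ok(Installer::Npm),
        Some(other) => Err(format!("unknown installer: {}", other)),
        None if probe("bun") => Ok(Installer::Bun),
        None if probe("npm") => Ok(Installer::Npm),
        None => Err("neither bun nor npm found on PATH".to_string()),
    }
}

/// Simplified PATH lookup (assumption: no .exe/.cmd handling).
fn on_path(name: &str) -> bool {
    env::var_os("PATH")
        .map(|paths| env::split_paths(&paths).any(|dir| dir.join(name).is_file()))
        .unwrap_or(false)
}

/// Both backends are always invoked with scripts disabled.
fn base_args(installer: Installer) -> (&'static str, Vec<&'static str>) {
    let program = match installer {
        Installer::Bun => "bun",
        Installer::Npm => "npm",
    };
    (program, vec!["install", "--ignore-scripts"])
}
```

In real use the call would be `detect_installer(None, on_path)`. Injecting the probe keeps the preference order testable in isolation.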

Five modules under `crates/perry/src/commands/install/`:
- `detect.rs` — `which bun` → bun, else npm; `--installer=` override
- `runner.rs` — shells out with `--ignore-scripts`, translates flags
- `scanner/` — walks node_modules (handles scoped, nested, AND bun's
  `.bun/<pkg>@<ver>/node_modules/<pkg>/` isolated-mode + `.pnpm/`),
  runs rules, writes `.perry/install-report.json`
- `allowlist.rs` — ~40 well-known native-binding packages
  (esbuild, sharp, prisma family, swc, biome, lightningcss,
  @next/swc-*, @esbuild/*, @napi-rs/*, ...)
- `lifecycle.rs` — runs preinstall/install/postinstall via sh -c
  (or cmd /C on Windows), with PATH augmented and standard npm_* env
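
A minimal sketch of the trust gate that decides whether a package's lifecycle scripts may run. The names `matches_pattern` and `may_run_scripts` are hypothetical, and treating a trailing `*` as a prefix wildcard (for entries such as `@esbuild/*` and `@next/swc-*`) is an assumption about how the allowlist is matched. User entries stand in for `perry.allowScripts` and `--run-scripts <pkg>`.

```rust
/// Matches an allowlist entry against a package name.
/// Assumption: a trailing `*` means "any name with this prefix";
/// every other entry must match exactly.
fn matches_pattern(pattern: &str, name: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => name.starts_with(prefix),
        None => pattern == name,
    }
}

/// Scripts run only if the package is on the bundled allowlist, was
/// explicitly allowed by the user, or `--run-scripts-all` was passed.
fn may_run_scripts(
    name: &str,
    bundled: &[&str],
    user_allowed: &[&str],
    run_all: bool,
) -> bool {
    run_all
        || bundled
            .iter()
            .chain(user_allowed)
            .any(|pattern| matches_pattern(pattern, name))
}
```

Exact matching for non-wildcard entries matters: under this sketch `esbuild` on the allowlist does not admit a package named `esbuild-evil`.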

## Scan rules (P0 — block by default)

- `scripts.rs` — 12 sub-rules on lifecycle-script bodies: curl|sh,
  eval+atob / Function+Buffer.from(base64), ~/.ssh / ~/.aws / ~/.npmrc
  / ~/.config/gh reads, *_TOKEN/*_KEY/*_SECRET env reads, Discord /
  Telegram / OAST IOC channels, bare-IP HTTP hosts, child_process
  dyn-arg whose argument reads process.env
- `patterns.rs` — same exfil shapes + hardcoded webhook URLs in the
  package's declared entry-point JS (main / module / bin /
  exports leaves); files > 512 KB skipped as likely-bundled
- `typosquat.rs` — Levenshtein ≤ 2 with length-diff ≤ 1 against a
  bundled list of ~300 popular npm names (`data/top_packages.txt`);
  both names must be ≥ 5 chars (avoids FPs on short legitimate
  names like `bl`, `pump`); catches `expres`, `lodass`, `epxress`,
  `reactnative`; lets through compound names like `expressjs`
- `obfuscation.rs` — quoted-string ≥ 1,000 chars of base64-alphabet
  content in any entry file (catches embedded payloads even when
  the surrounding code looks innocuous)
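
As an illustration of the last rule, the check for a long quoted base64 literal can be approximated with a single pass over the file. This is a simplified sketch rather than the logic in `obfuscation.rs`: `has_base64_blob` and `is_base64_char` are hypothetical names, escape sequences and comments are ignored, and the literal must consist entirely of base64-alphabet characters.

```rust
/// True for characters of the standard base64 alphabet plus padding.
fn is_base64_char(c: u8) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, b'+' | b'/' | b'=')
}

/// Reports whether `source` contains a quoted literal (single, double,
/// or backtick quotes) made only of base64 characters and at least
/// `min_len` characters long. The PR uses a threshold of 1,000.
fn has_base64_blob(source: &str, min_len: usize) -> bool {
    let bytes = source.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        let quote = bytes[i];
        if matches!(quote, b'"' | b'\'' | b'`') {
            let start = i + 1;
            let mut end = start;
            // Extend over the run of base64 characters after the quote.
            while end < bytes.len() && is_base64_char(bytes[end]) {
                end += 1;
            }
            if end < bytes.len() && bytes[end] == quote {
                if end - start >= min_len {
                    return true;
                }
                i = end + 1; // skip past the short literal
            } else {
                i = end; // not a pure base64 literal; keep scanning
            }
        } else {
            i += 1;
        }
    }
    false
}
```

Requiring the matching closing quote keeps ordinary long identifiers and minified code from counting as a blob. Only a contiguous quoted payload trips the check.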

## CLI surface

```
perry install [pkg...]                  # install from package.json or named pkgs
perry install -D|--save-dev <pkg>       # add to devDependencies
perry install -g|--global <pkg>         # install globally
perry install --production              # skip devDependencies
perry install --installer=bun|npm       # force backend
perry install --allow-risky <pkg>       # bypass scan for one package
perry install --allow-risky-all         # CI / emergency bypass
perry install --run-scripts <pkg>       # allow scripts beyond bundled allowlist
perry install --run-scripts-all         # npm-equivalent (unsafe)
perry install --skip-scan               # just wrap installer
perry install --json                    # machine-readable output
```

## Tests

66 unit tests + an integration matrix against real fixtures:
- hono, axios-get, redis-pubsub, ws-echo — all scan clean
- drizzle-sqlite (39 transitive deps) — scan clean (postinstall
  failure on better-sqlite3 against node 25 is env-level, same as
  `npm install` would give in the same environment)
- @aws-sdk/client-s3 (56 transitive deps) — scan clean
- bun workspace monorepo — scans through `.bun/` isolated store,
  workspace package imports work
- Block-then-override path tested with synthetic `lodass` typosquat
  + malicious postinstall: exits 1 with two P0 findings cited,
  `--allow-risky` overrides to exit 0 with `overridden: true` in
  the report
- esbuild's real postinstall runs via the bundled allowlist and the
  9.4 MB platform binary lands in `node_modules/esbuild/bin/esbuild`
- Resolved versions match `bun install` (without --ignore-scripts)
  byte-for-byte

## Out of scope (filed as follow-ups)

- Sandboxed lifecycle-script execution (macOS sandbox-exec / Linux
  bubblewrap+seccomp / Windows AppContainer with redacted env +
  filesystem confinement + network allowlist). v1 runs allowlisted
  scripts un-sandboxed because they're trusted packages; v2 will
  sandbox so even a compromised allowlisted version has bounded
  blast radius.
- Scoped-package typosquat detection.
- P1 freshness / maintainer-drift / SLSA-provenance checks (CLI
  flag `--check-freshness` reserved, not implemented).
- Private registries / .npmrc auth / yarn backends. pnpm backend
  is partially supported (the walker handles its store layout) but
  the runner only knows bun + npm.

No new direct deps. Per CLAUDE.md external-contributor pattern, this
PR leaves `[workspace.package].version` in `Cargo.toml`, the
`**Current Version:**` line in `CLAUDE.md`, and `CHANGELOG.md`
untouched — the maintainer bumps + writes the changelog at merge.