feat(docker): add python3 + data-analysis libs to bash image #65

Closed
mani-muon wants to merge 1 commit into aron-muon:main from mani-muon:mani/bash-image-add-python-stack

@mani-muon
Contributor

Summary

Adds python3 plus the common data-analysis libraries (numpy/pandas/matplotlib/openpyxl/Pillow) to the bash image so bash_tool calls of the form python3 -c "..." resolve successfully.

Fixes # (no existing issue — happy to file one if preferred)

Type of change

  • New feature (non-breaking change which adds functionality)

Why

After deploying the bash image, LLM-driven bash_tool calls were observed invoking python3 (often with import pandas, numpy, matplotlib) for arithmetic, parsing, and plotting. Without a Python interpreter in the bash image these calls return command not found, and with only stdlib Python they would return ModuleNotFoundError on the first import.
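The two failure modes described above can be told apart programmatically. The following is a minimal sketch, not part of this PR: `classify_python_call` is a hypothetical helper that runs a snippet the way a `python3 -c "..."` call would and labels the outcome. The interpreter defaults to `sys.executable` so the sketch runs anywhere; inside the bash image one would presumably pass `"python3"` instead.

```python
import subprocess
import sys


def classify_python_call(code, interpreter=sys.executable):
    """Run `<interpreter> -c <code>` and label the outcome.

    Hypothetical helper for illustration only; not part of this PR.
    """
    try:
        result = subprocess.run(
            [interpreter, "-c", code],
            capture_output=True,
            text=True,
            timeout=60,
        )
    except FileNotFoundError:
        # No such interpreter on PATH: a shell would report "command not found".
        return "command not found"
    if result.returncode == 0:
        return "ok"
    if "ModuleNotFoundError" in result.stderr:
        # The interpreter exists, but the imported library is not installed.
        return "ModuleNotFoundError"
    return "other error"


if __name__ == "__main__":
    print(classify_python_call("import pandas, numpy, matplotlib"))
```

On an image with only the standard library, the call in the `__main__` guard would be expected to report `ModuleNotFoundError`; with no interpreter at all, passing `interpreter="python3"` would report `command not found`.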

Package selection

Apt-installed for fast rebuilds and stable deps:

  • python3 — interpreter
  • python3-numpy, python3-pandas, python3-matplotlib — the data-analysis trio observed in LLM-generated scripts
  • python3-openpyxl — Excel read/write
  • python3-pil — image manipulation

Skipped scipy and scikit-learn (each ~150-300 MB, narrower use cases). Easy follow-up to add if needed.
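One detail worth keeping in mind when validating the list above: apt package names and Python import names do not always match (`python3-pil` is imported as `PIL`). The sketch below assumes Debian-style packaging and reports which of the listed packages have no importable module. `APT_TO_IMPORT` and `missing_imports` are illustrative names, not something defined in this repository.

```python
import importlib.util

# Assumed mapping: Debian-style apt package -> top-level module it provides.
APT_TO_IMPORT = {
    "python3-numpy": "numpy",
    "python3-pandas": "pandas",
    "python3-matplotlib": "matplotlib",
    "python3-openpyxl": "openpyxl",
    "python3-pil": "PIL",
}


def missing_imports(apt_to_import=APT_TO_IMPORT):
    """Return the apt packages whose module cannot be found, sorted by name."""
    return sorted(
        pkg
        for pkg, module in apt_to_import.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    # Prints the list of missing packages, or a short confirmation if none.
    print(missing_imports() or "all modules importable")
```

Using `find_spec` rather than a real `import` keeps the check cheap: it locates each module without executing it, which matters for heavier libraries such as pandas and matplotlib.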

The dedicated kubecoderun-python image with the full pip-managed scientific stack still serves lang: "py" callers that need bleeding-edge versions.

Image size impact

Roughly +400 MB.

How Has This Been Tested?

  • Local Dockerfile build: not run. It would validate the apt package names, but the test rig depends on dhi.io registry auth, which I don't have. The CI build will catch any typo immediately.
  • End-to-end smoke (scripts/test-images.sh -l bash after the build) — depends on CI

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

LLM-generated bash scripts overwhelmingly invoke python3 (often
with numpy/pandas/matplotlib) for any arithmetic, parsing, or
plotting work. Without an interpreter in the bash image these
calls return command not found; with an interpreter but no
libraries they raise ModuleNotFoundError.

Adds python3, python3-numpy, python3-pandas, python3-matplotlib,
python3-openpyxl, python3-pil via apt to cover the common
LLM-generated patterns. Apt-installed versions keep the rebuild
fast and the dependency surface stable; scripts that need
bleeding-edge versions can still target lang: py.

Image size increases roughly 400 MB but matches the expected
shape of a shell + scripting sandbox.
@mani-muon mani-muon marked this pull request as ready for review May 19, 2026 20:05
@mani-muon mani-muon requested a review from aron-muon as a code owner May 19, 2026 20:05
Owner

@aron-muon aron-muon left a comment

We have dedicated Python images. If an LLM is using the wrong tool, the prompting needs to be updated instead.

@aron-muon
Owner

Introduced a fix in PR #66.

@aron-muon aron-muon closed this May 20, 2026