A production-ready Python API for registering edge/IoT devices and storing their firmware/config blobs. Demonstrates Docker, Terraform IaC, Kubernetes/Helm deployment, GitHub Actions CI/CD, and a built-in observability stack.
Disclaimer: This repository was built with the assistance of Claude Code (Anthropic's AI coding tool). All generated code, configuration, and documentation has been reviewed and manually verified end-to-end by the author.
┌──────────────┐     POST /devices     ┌──────────────┐
│   Client     │ ─────────────────────▶│   FastAPI    │
│              │ ◀─────────────────────│  (uvicorn)   │
│              │     GET /devices      └──────┬───────┘
└──────────────┘                              │
                                    ┌─────────┴──────────┐
                                    │                    │
                               ┌────▼─────┐        ┌─────▼─────┐
                               │ DynamoDB │        │    S3     │
                               │(devices) │        │(firmware) │
                               └──────────┘        └───────────┘
Stack
| Layer | Technology |
|---|---|
| API | Python 3.12 · FastAPI · uvicorn |
| Persistence | AWS DynamoDB (PAY_PER_REQUEST) |
| Object store | AWS S3 (SSE-KMS CMK, versioned) |
| Encryption | AWS KMS (CMK, automatic key rotation) |
| Observability | Prometheus /metrics · JSON logs · Grafana |
| IaC | Terraform ≥ 1.5 (hashicorp/aws) |
| Container | Docker multi-stage, non-root user |
| Orchestration | Kubernetes · Helm 3 |
| Auth (K8s) | IRSA (IAM Roles for Service Accounts) |
| CI/CD | GitHub Actions |
| Local AWS | LocalStack CE |
Prerequisites
| Tool | Version |
|---|---|
| Python | ≥ 3.12 |
| Poetry | ≥ 1.8 |
| Docker + Compose | ≥ 24 |
| Terraform | ≥ 1.5 |
| Helm | ≥ 3.14 |
| kubectl | any |
| minikube | any |
| AWS CLI | any |
On Ubuntu (native or WSL2) all prerequisites can be installed automatically:
bash setup.sh
Skip this step if all tools are already installed.
make install
make test
make up
make infra-init # first time only
make infra-apply
The Terraform state is stored locally. For production, configure an S3 backend.
make smoke
This runs five sequential checks (health, register device, list devices, DynamoDB scan,
S3 objects) and prints a pass/fail summary table. The script is at
scripts/smoke.sh and can also be called directly:
scripts/smoke.sh <app-url> [localstack-url].
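The run-then-summarise pattern described above can be sketched in Python. This is an illustration only, not the contents of scripts/smoke.sh: the names `run_checks` and `format_summary` are hypothetical, and the sketch assumes that a failing check is recorded and the remaining checks still run.

```python
from typing import Callable, Iterable

# A check is a name plus a callable that raises on failure.
Check = tuple[str, Callable[[], None]]
Result = tuple[str, bool, str]


def run_checks(checks: Iterable[Check]) -> list[Result]:
    """Run every check in order; a check passes if it returns without raising."""
    results: list[Result] = []
    for name, check in checks:
        try:
            check()
            results.append((name, True, ""))
        except Exception as exc:  # record the failure and continue (assumption)
            results.append((name, False, str(exc)))
    return results


def format_summary(results: list[Result]) -> str:
    """Render a fixed-width pass/fail table, one row per check."""
    width = max([len("check")] + [len(name) for name, _, _ in results])
    lines = [f"{'check'.ljust(width)}  result"]
    for name, ok, detail in results:
        status = "PASS" if ok else f"FAIL ({detail})"
        lines.append(f"{name.ljust(width)}  {status}")
    return "\n".join(lines)
```

In a real run each callable would issue the HTTP request or AWS query for its step; here they are left as injectable functions so the runner can be exercised without a running stack.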
make down
make infra-destroy
Requires LocalStack already running (make up && make infra-apply).
make minikube-up # start minikube + build image + deploy chart
make minikube-smoke # smoke-test the app running inside the cluster
Individual targets:
make minikube-start # start minikube only
make minikube-build # build the image inside minikube's Docker daemon
make minikube-deploy # install/upgrade the Helm chart
make minikube-delete # remove the Helm release
make minikube-stop # stop minikube
The local chart override (helm/fleet-api/values-local.yaml)
sets imagePullPolicy: Never, a NodePort service, and a single replica, and routes AWS calls to
host.minikube.internal:4566 (LocalStack on the host).
FastAPI app with automatic OpenAPI docs at /docs:
| Method | Path | Description |
|---|---|---|
| GET | /devices | Scan DynamoDB, return all registered devices |
| POST | /devices | Upload firmware to S3, write device to DynamoDB |
| GET | /healthz | Liveness/readiness probe target |
| GET | /metrics | Prometheus metrics |
| GET | /docs | Swagger UI (OpenAPI) |
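The POST /devices row implies an ordering: the firmware blob is stored first, then the device record that points at it. The sketch below illustrates that ordering with in-memory stand-ins rather than boto3 or FastAPI. Every name in it (`register_device`, `InMemoryBlobStore`, `InMemoryDeviceTable`, the `firmware_key` attribute, and the object key layout) is hypothetical; only the `device_id` key comes from this README.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBlobStore:
    """Stand-in for the S3 firmware bucket."""
    objects: dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


@dataclass
class InMemoryDeviceTable:
    """Stand-in for the DynamoDB devices table, keyed on device_id."""
    items: dict[str, dict] = field(default_factory=dict)

    def put_item(self, item: dict) -> None:
        self.items[item["device_id"]] = item

    def scan(self) -> list[dict]:
        return list(self.items.values())


def register_device(device_id: str, firmware: bytes,
                    blobs: InMemoryBlobStore, table: InMemoryDeviceTable) -> dict:
    """Store the blob first, then the record that references it."""
    key = f"{device_id}/firmware.bin"  # hypothetical key layout
    blobs.put(key, firmware)
    item = {"device_id": device_id, "firmware_key": key}
    table.put_item(item)
    return item
```

Writing the blob before the record means a failed upload never leaves a device entry pointing at a missing object; the trade-off is that a failed table write can leave an orphaned blob behind.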
Configuration is read from environment variables (see app/.env-example)
via pydantic-settings.
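As a rough picture of environment-driven configuration, here is a standard-library sketch. It deliberately swaps pydantic-settings for a plain dataclass so it runs without extra dependencies, and the variable names (`DYNAMODB_TABLE`, `S3_BUCKET`, `AWS_ENDPOINT_URL`) are assumptions; the real names are listed in app/.env-example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    dynamodb_table: str
    s3_bucket: str
    # Only set when pointing at LocalStack; None means the real AWS endpoints.
    aws_endpoint_url: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables (hypothetical names)."""
    required = ("DYNAMODB_TABLE", "S3_BUCKET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    return Settings(
        dynamodb_table=env["DYNAMODB_TABLE"],
        s3_bucket=env["S3_BUCKET"],
        aws_endpoint_url=env.get("AWS_ENDPOINT_URL") or None,
    )
```

Passing the mapping as a parameter keeps the loader testable: a test can hand it a plain dict instead of mutating the process environment.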
Multi-stage build: the builder stage installs Poetry and exports a plain
requirements.txt; the runtime stage copies only the requirements and application source —
Poetry is not present in the final image. The container runs as a non-root user (uid 1000)
enforced via USER in the Dockerfile, with gunicorn (uvicorn worker class) serving the
ASGI app on port 8000.
Resources created:
- DynamoDB table devices-{environment}: device_id as hash key, PAY_PER_REQUEST billing, PITR enabled
- S3 bucket fleet-firmware-{environment}: versioning, SSE-KMS with CMK, public access blocked, bucket key enabled
- KMS key fleet-api-s3-cmk-{environment}: CMK for S3 encryption, automatic annual key rotation
- IAM role fleet-api-role-{environment}: least-privilege policy (DynamoDB Scan/PutItem/GetItem, S3 PutObject/GetObject, KMS GenerateDataKey/Decrypt)
The IAM role uses an IRSA trust policy when eks_oidc_provider_arn and
eks_oidc_provider_url variables are set, allowing Kubernetes pods to authenticate via OIDC
without static credentials. When those variables are empty, it falls back to an EC2 trust
policy for local testing.
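The switch between the two trust policies can be illustrated with a small Python function that builds the policy document. This is a sketch of the general IRSA pattern, not a copy of the Terraform in this repository: the function name and defaults are hypothetical, the `aud` condition is an assumption based on common IRSA practice, and the sketch treats the IRSA branch as active only when both values are non-empty.

```python
def build_trust_policy(oidc_provider_arn: str = "",
                       oidc_provider_url: str = "",
                       namespace: str = "default",
                       service_account: str = "fleet-api") -> dict:
    """Return an IAM assume-role policy document as a dict."""
    if oidc_provider_arn and oidc_provider_url:
        # Condition keys use the issuer host without the scheme.
        host = oidc_provider_url.removeprefix("https://")
        statement = {
            "Effect": "Allow",
            "Principal": {"Federated": oidc_provider_arn},
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Restrict the role to one service account in one namespace.
                    f"{host}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                    f"{host}:aud": "sts.amazonaws.com",  # assumed audience check
                }
            },
        }
    else:
        # Fallback: let EC2 assume the role.
        statement = {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    return {"Version": "2012-10-17", "Statement": [statement]}
```

The `sub` condition is what ties the role to a specific Kubernetes ServiceAccount; without it, any pod in the cluster that can obtain a token from the same OIDC provider could assume the role.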
To target LocalStack instead of real AWS:
terraform apply -var="localstack_endpoint=http://localhost:4566"
Key reliability features:
| Feature | Implementation |
|---|---|
| Liveness probe | GET /healthz — restarts stuck pods |
| Readiness probe | GET /healthz — removes unhealthy pods from Service |
| Autoscaling | HPA: 2–10 replicas, CPU target 70% |
| Non-root container | securityContext.runAsNonRoot: true |
| Read-only filesystem | readOnlyRootFilesystem: true |
| IRSA | ServiceAccount annotated with IAM role ARN |
| Config separation | Non-sensitive config in ConfigMap; secrets via IRSA |
| Metrics scraping | Optional Prometheus-Operator ServiceMonitor |
| Dashboards | Optional Grafana dashboard ConfigMap (sidecar import) |
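To make the autoscaling row concrete, the following sketch applies the core scaling rule documented for the Kubernetes HorizontalPodAutoscaler, desired = ceil(current replicas × current metric / target metric), clamped to the 2 to 10 replica range from the table. It is a simplification: the real controller also applies a tolerance band and stabilization windows, which are omitted here, and the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas: int,
                     current_cpu_percent: float,
                     target_cpu_percent: float = 70.0,
                     min_replicas: int = 2,
                     max_replicas: int = 10) -> int:
    """Simplified HPA calculation: scale proportionally, then clamp to bounds."""
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, raw))
```

For example, four replicas averaging 105% of their CPU request against a 70% target would scale to six, while a mostly idle deployment never drops below the two-replica floor.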
- Metrics: prometheus-fastapi-instrumentator exposes request rate, latency, and in-progress counts at GET /metrics.
- Structured logs: python-json-logger emits single-line JSON logs (timestamp, level, logger, message) on stdout, ready for Loki/CloudWatch/ELK.
- Prometheus: enable scraping with --set serviceMonitor.enabled=true (requires the Prometheus Operator CRDs).
- Grafana: enable --set grafanaDashboard.enabled=true to ship the dashboard at helm/fleet-api/dashboards/fleet-api.json as a sidecar-discovered ConfigMap (request rate, error rate, p95 latency, in-progress requests).
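The structured-log format in the list above can be approximated with the standard library alone. The sketch below is a stand-in for python-json-logger, not its API: `JsonFormatter` and `build_logger` are hypothetical names, and the four field names are taken from the list above.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object on a single line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # json.dumps escapes embedded newlines, so the output stays on one line.
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    """Attach a JSON handler writing to the given stream (stdout in production)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines from the root logger
    return logger
```

One object per line is what lets collectors such as Loki or CloudWatch parse each entry without multi-line stitching rules.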
Six jobs in .github/workflows/ci.yml:
| Job | Trigger | Description |
|---|---|---|
| lint-and-test | push / PR | ruff lint + pytest with moto (no real AWS) |
| docker-build | push to main | Build + push to GHCR; tagged with SHA + latest |
| security-scan | after docker-build | Trivy scan; fails pipeline on CRITICAL CVEs |
| helm-lint | push / PR | helm lint + helm template (no cluster needed) |
| trivy-helm | push / PR | Trivy misconfig scan on Helm chart (HIGH, CRITICAL) |
| trivy-terraform | push / PR | Trivy misconfig scan on Terraform (HIGH, CRITICAL) |
No AWS credentials are required for CI — moto intercepts all boto3 calls in-process.
This section walks through deploying the application to a real EKS cluster end-to-end. It assumes the cluster already exists; cluster creation is out of scope (use a Terraform EKS module).
Before provisioning any infrastructure, set up an S3 backend so state is shared and consistent across runs.
Create a dedicated state bucket (versioning + SSE enabled), then add a backend block to
terraform/providers.tf or a separate backend.tf:
terraform {
backend "s3" {
bucket = "<state-bucket-name>"
key = "fleet-api/terraform.tfstate"
region = "<region>"
use_lockfile = true # S3-native locking, Terraform ≥ 1.10
}
}
Reinitialise to migrate local state to the bucket:
terraform init -migrate-state
aws eks update-kubeconfig --name <cluster-name> --region <region>
Associate the cluster OIDC provider (idempotent):
eksctl utils associate-iam-oidc-provider --cluster <cluster-name> --approve
Retrieve the values needed by Terraform:
OIDC_URL=$(aws eks describe-cluster --name <cluster-name> \
--query "cluster.identity.oidc.issuer" --output text | sed 's|https://||')
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
OIDC_ARN="arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_URL}"
Apply Terraform against real AWS (no localstack_endpoint):
terraform apply \
-var="environment=prod" \
-var="eks_oidc_provider_arn=${OIDC_ARN}" \
-var="eks_oidc_provider_url=${OIDC_URL}" \
-var="k8s_namespace=default" \
-var="k8s_service_account_name=fleet-api"
Note the resulting IAM role ARN for the Helm step:
terraform output iam_role_arn
Install an Ingress controller and cert-manager if not already present in the cluster:
# AWS Load Balancer Controller (or substitute nginx-ingress)
helm repo add eks https://aws.github.io/eks-charts
helm upgrade --install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system --set clusterName=<cluster-name>
# cert-manager
helm repo add jetstack https://charts.jetstack.io
helm upgrade --install cert-manager jetstack/cert-manager \
-n cert-manager --create-namespace --set installCRDs=true
Create a ClusterIssuer for Let's Encrypt, then enable the Ingress when deploying the chart
(step 7).
kubectl create secret docker-registry ghcr-creds \
--docker-server=ghcr.io \
--docker-username=<github-username> \
--docker-password=<github-pat-with-read-packages>
helm upgrade --install fleet-api helm/fleet-api/ \
--set image.repository=ghcr.io/<owner>/fleet-api \
--set image.tag=<sha> \
--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=<IAM_ROLE_ARN> \
--set imagePullSecrets[0].name=ghcr-creds \
--set config.dynamodbTable=devices-prod \
--set config.s3Bucket=fleet-firmware-prod \
--set serviceMonitor.enabled=true \
--set grafanaDashboard.enabled=true \
--set ingress.enabled=true \
--set ingress.hosts[0].host=<your-domain>
config.awsEndpointUrl is left unset: pods authenticate via IRSA and reach real AWS endpoints
directly.
scripts/smoke.sh https://<your-domain>