AWS Lambda

AWS Lambda function for processing ICESat-2 ATL06 data one Morton cell at a time.

Overview

The Lambda function processes a single Morton cell (order 6) by:

1. Reading HDF5 files directly from S3 using h5coro (no downloads)
2. Filtering spatially using Morton indexing
3. Calculating summary statistics for child cells (order 12)
4. Writing xdggs-enabled Zarr to S3
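
For a sense of scale, the number of order-12 child cells under one order-6 parent can be worked out directly. This is a minimal sketch that assumes the usual HEALPix-style quadtree, in which each increase in order splits a cell into four; the helper name `children_per_parent` is illustrative and not part of magg.

```python
def children_per_parent(parent_order: int, child_order: int) -> int:
    """Number of child cells under one parent, assuming 4 children per level."""
    if child_order < parent_order:
        raise ValueError("child_order must be >= parent_order")
    return 4 ** (child_order - parent_order)


# Order 6 -> order 12 spans six levels of subdivision.
print(children_per_parent(6, 12))  # 4096
```

Under that assumption, each invocation produces statistics for up to 4,096 child cells.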

Architecture

┌─────────────────────────────────────────────────────────────┐
│  Lambda Function (process-morton-cell)                      │
│  ──────────────────────────────────────────────────────────  │
│  Runtime: Python 3.12                                       │
│  Memory: 2048 MB (2 GB)                                     │
│  Timeout: 720s (12 minutes)                                 │
│  ──────────────────────────────────────────────────────────  │
│  Code (~5 MB):                                              │
│    - deployment/aws/lambda_handler.py (AWS wrapper)         │
│    - src/magg/ package (processing, auth, catalog)          │
│  ──────────────────────────────────────────────────────────  │
│  Layer (~70 MB compressed, ~240 MB uncompressed):           │
│    - numpy, pandas, h5coro, mortie, healpy                  │
│    - fastparquet, cramjam, shapely, astropy, earthaccess    │
│    - pydantic-zarr, zarr, obstore, pyarrow                  │
└─────────────────────────────────────────────────────────────┘

Files

| File | Purpose |
| --- | --- |
| `deployment/aws/lambda_handler.py` | AWS Lambda wrapper function |
| `src/magg/processing.py` | Cloud-agnostic core processing logic |
| `src/magg/auth.py` | NASA Earthdata authentication helper |
| `src/magg/catalog.py` | CMR granule catalog builder |
| `deployment/aws/invoke_lambda.py` | Orchestration script |
| `deployment/aws/build_arm64_layer.sh` | ARM64 Lambda layer build script |

Event Payload

{
  "parent_morton": 123456,
  "parent_order": 6,
  "child_order": 12,
  "granule_urls": [
    "s3://nsidc-cumulus-prod-protected/ATLAS/ATL06/007/2023/12/18/...",
    "s3://nsidc-cumulus-prod-protected/ATLAS/ATL06/007/2023/12/19/..."
  ],
  "store_path": "s3://your-output-bucket/atl06/production.zarr",
  "s3_credentials": {
    "accessKeyId": "ASIA...",
    "secretAccessKey": "...",
    "sessionToken": "..."
  }
}

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `parent_morton` | int | Yes | Morton index of parent cell (order 6) |
| `parent_order` | int | Yes | Order of parent cell (typically 6) |
| `child_order` | int | Yes | Order of child cells for statistics (typically 12) |
| `granule_urls` | list | Yes | Pre-computed list of S3 URLs from catalog |
| `store_path` | str | Yes | Output Zarr store path (e.g. `s3://bucket/prefix.zarr`) |
| `s3_credentials` | dict | Yes | NSIDC S3 credentials for reading source data |
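
Because every parameter is required, it can help to check an event before sending it. The following is a minimal sketch of such a check, not the handler's actual validation logic; `validate_event`, `REQUIRED_FIELDS`, and `CREDENTIAL_KEYS` are hypothetical names, and the credential keys are taken from the example payload above.

```python
from typing import Any

# Required keys and expected Python types, taken from the parameter table.
REQUIRED_FIELDS: dict[str, type] = {
    "parent_morton": int,
    "parent_order": int,
    "child_order": int,
    "granule_urls": list,
    "store_path": str,
    "s3_credentials": dict,
}

# Keys shown in the example payload's s3_credentials block.
CREDENTIAL_KEYS = ("accessKeyId", "secretAccessKey", "sessionToken")


def validate_event(event: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the event looks well-formed."""
    problems: list[str] = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in event:
            problems.append(f"missing required field: {key}")
            continue
        value = event[key]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{key} should be {expected.__name__}")
    creds = event.get("s3_credentials")
    if isinstance(creds, dict):
        for key in CREDENTIAL_KEYS:
            if key not in creds:
                problems.append(f"s3_credentials is missing {key}")
    return problems
```

An orchestrator could call `validate_event(event)` before each invocation and skip or log any event that returns a non-empty list.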

S3 Credentials

The orchestrator obtains credentials once, before invoking any Lambda functions:

from magg.auth import get_nsidc_s3_credentials

# Get credentials (valid for ~1 hour)
s3_creds = get_nsidc_s3_credentials()

# Pass to each Lambda invocation
event = {
    "parent_morton": -6134114,
    "parent_order": 6,
    "child_order": 12,
    "granule_urls": [...],
    "store_path": "s3://output-bucket/atl06/production.zarr",
    "s3_credentials": s3_creds,
}

Fetching credentials once avoids the rate limiting that 1,872 simultaneous NASA Earthdata logins would trigger, and it removes the dependency on AWS Secrets Manager.
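
The fan-out pattern described above might look like the sketch below. It is not the code in `invoke_lambda.py`: the helper names `build_events` and `invoke_all`, the catalog shape (a mapping from parent Morton index to a list of granule URLs), and the choice of asynchronous `Event` invocation are all assumptions made for illustration. The client is passed in so the logic can be exercised without AWS access; in practice it would be `boto3.client("lambda")`.

```python
import json
from typing import Any


def build_events(
    catalog: dict[Any, list[str]],
    s3_creds: dict[str, str],
    store_path: str,
    parent_order: int = 6,
    child_order: int = 12,
) -> list[dict[str, Any]]:
    """Build one event per parent cell, all sharing the same credentials."""
    return [
        {
            "parent_morton": int(parent_morton),
            "parent_order": parent_order,
            "child_order": child_order,
            "granule_urls": urls,
            "store_path": store_path,
            "s3_credentials": s3_creds,
        }
        for parent_morton, urls in catalog.items()
    ]


def invoke_all(lambda_client, events, function_name="process-morton-cell") -> int:
    """Invoke the function once per event and return the number of invocations."""
    count = 0
    for event in events:
        lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="Event",  # assumption: fire-and-forget invocation
            Payload=json.dumps(event).encode("utf-8"),
        )
        count += 1
    return count
```

Since the credentials are valid for roughly an hour, a long-running orchestrator would need to fetch fresh ones before they expire.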

Deployment

Step 1: Create the function package

cd /path/to/magg

# Create function.zip with handler and magg package
zip -j deployment/aws/function.zip deployment/aws/lambda_handler.py && \
  cd src && zip -ur ../deployment/aws/function.zip magg/ -i "*.py" && cd ..
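
To make the resulting archive layout explicit: `zip -j` drops directory paths, so `lambda_handler.py` is stored at the root of `function.zip`, while the second command, run from `src/`, stores the package files under `magg/`. The sketch below builds the same layout with Python's standard `zipfile` module; `build_function_zip` is an illustrative name, not a script in the repository.

```python
import zipfile
from pathlib import Path


def build_function_zip(repo_root: Path, out_path: Path) -> list[str]:
    """Write the handler at the zip root and the magg package under magg/."""
    handler = repo_root / "deployment" / "aws" / "lambda_handler.py"
    src = repo_root / "src"
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Equivalent of `zip -j`: keep only the file name, not its directories.
        zf.write(handler, arcname=handler.name)
        # Equivalent of `zip -r magg/ -i "*.py"` run from inside src/.
        for py_file in sorted((src / "magg").rglob("*.py")):
            zf.write(py_file, arcname=py_file.relative_to(src).as_posix())
        return zf.namelist()
```

Calling `build_function_zip(Path("."), Path("deployment/aws/function.zip"))` from the repository root returns the list of archive members, which is a quick way to confirm what Lambda will see.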

Step 2: Build and deploy the Lambda layer

See ARM64 Layer for building and deploying the Lambda layer.

Step 3: Create the Lambda function

aws lambda create-function \
  --function-name process-morton-cell \
  --runtime python3.12 \
  --architectures arm64 \
  --role arn:aws:iam::ACCOUNT_ID:role/lambda-execution-role \
  --handler lambda_handler.lambda_handler \
  --zip-file fileb://deployment/aws/function.zip \
  --timeout 720 \
  --memory-size 2048 \
  --layers arn:aws:lambda:REGION:ACCOUNT_ID:layer:magg-layer-arm64:VERSION

Updating function code

# Re-create the zip
zip -j deployment/aws/function.zip deployment/aws/lambda_handler.py && \
  cd src && zip -ur ../deployment/aws/function.zip magg/ -i "*.py" && cd ..

# Update the Lambda function
aws lambda update-function-code \
  --function-name process-morton-cell \
  --zip-file fileb://deployment/aws/function.zip

Testing

# Build a granule catalog
uv run python -m magg.catalog --cycle 22 --parent-order 6

# Test locally first (no Lambda required)
uv run python -m magg --config atl06.yaml --catalog catalog.json \
  --store ./test.zarr --max-cells 1

# Dry run with the Lambda orchestrator
uv run python deployment/aws/invoke_lambda.py \
  --config atl06.yaml --catalog catalog.json --dry-run

Performance

| Metric | Value |
| --- | --- |
| Average execution time | 2--3 minutes per cell |
| Maximum execution time | 10 minutes |
| Lambda timeout | 12 minutes (720s) |
| Configured memory | 2048 MB |
| Typical memory usage | 1--1.5 GB |
| Cold start | 3--5 seconds |

Cost Estimate

Per invocation (180s average, 2 GB memory): ~$0.006
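
The per-invocation figure follows from Lambda's duration-times-memory billing. The sketch below reproduces the arithmetic; the rate of 0.0000166667 USD per GB-second is an assumption (a commonly listed on-demand rate), so check the current AWS pricing for your region and architecture. The per-request charge is ignored, and `lambda_compute_cost` is an illustrative name.

```python
def lambda_compute_cost(duration_s: float, memory_mb: int, price_per_gb_second: float) -> float:
    """Compute charge for one invocation: GB-seconds times the per-GB-second rate."""
    gb_seconds = duration_s * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second


# ASSUMPTION: 0.0000166667 USD per GB-second; not taken from this project.
cost = lambda_compute_cost(duration_s=180, memory_mb=2048, price_per_gb_second=0.0000166667)
print(f"{cost:.4f}")  # 0.0060
```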

Full run (~1,300 cells at order 6): ~$2 including S3 and CloudWatch costs.

Troubleshooting

!!! warning "Missing s3_credentials"
    Ensure your orchestrator script calls [get_nsidc_s3_credentials][magg.auth.get_nsidc_s3_credentials] and passes the credentials to each Lambda invocation.

!!! info "No granules found"
    This is normal for cells outside the data coverage area. The function returns gracefully with `error: "No granules found"`.

!!! warning "S3 write permission denied"
    Check that the Lambda execution role has `s3:PutObject` permission for the output bucket.

!!! warning "Too many open files"
    Decrease max workers (e.g., `--max-workers 50`) or increase ulimit (`ulimit -n 10000`).
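
For the "Too many open files" case, the open-file limit can also be raised from inside a Python process on Unix-like systems using the standard `resource` module. This is a minimal sketch, not something the orchestrator is known to do; `raise_open_file_limit` is a hypothetical helper, and it can only raise the soft limit up to the existing hard limit.

```python
import resource


def raise_open_file_limit(target: int = 10000) -> int:
    """Raise the soft open-file limit toward target and return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    # The soft limit may not exceed the hard limit.
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Calling `raise_open_file_limit()` at the start of the orchestrator script has a similar effect to running `ulimit -n 10000` in the shell beforehand, within the bounds the hard limit allows.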
