magg - Multi-resolution Aggregation

Aggregate point observations to multi-resolution grids using HEALPix spatial indexing and serverless compute.

Overview

magg aggregates sparse point data (e.g., ICESat-2 ATL06 elevation measurements) to gridded products using HEALPix/morton spatial indexing. Processing runs in parallel on AWS Lambda — each worker handles one spatial cell independently, writing to a shared Zarr v3 store following the DGGS convention.
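
As a rough illustration of the aggregation idea, the sketch below groups point observations by a precomputed cell ID and reduces each group to a count and a mean. The function name `aggregate_by_cell`, the `(cell_id, value)` tuple layout, and the choice of statistics are assumptions made for this example; they are not taken from magg's pipeline or schema.

```python
from collections import defaultdict


def aggregate_by_cell(observations):
    """Reduce (cell_id, value) pairs to a per-cell count and mean.

    Hypothetical helper for illustration; magg's real pipeline may differ.
    """
    groups = defaultdict(list)
    for cell_id, value in observations:
        groups[cell_id].append(value)
    return {
        cell_id: {"count": len(values), "mean": sum(values) / len(values)}
        for cell_id, values in groups.items()
    }


# Made-up elevation samples (metres) tagged with made-up cell IDs.
samples = [(101, 10.0), (101, 14.0), (205, 3.0)]
result = aggregate_by_cell(samples)
```

Because each cell's result depends only on the observations that fall inside it, cells can be processed independently, which is what makes the one-worker-per-cell layout possible.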

Features

Pre-computed granule catalogs — query CMR once, process many times
Morton-based spatial indexing — HEALPix nested scheme for hierarchical grids
Massive parallelism — tested with up to 1,700 concurrent Lambda workers
Direct S3 access — h5coro reads HDF5 via byte-range requests, no downloads
Cost-effective — ~$0.006/cell (~$2 per full Antarctica run on ARM64)
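
The hierarchical property of the HEALPix nested scheme can be sketched with plain integer arithmetic: each refinement level appends two bits to a pixel index, so parents and children are found by bit shifts. The helper names below are hypothetical, the indices are plain nested pixel numbers (magg's morton encoding may differ), and the child order of 12 is only an illustrative value.

```python
def parent_pixel(pixel, order, parent_order):
    """Return the nested-scheme ancestor of `pixel` at `parent_order`."""
    if parent_order > order:
        raise ValueError("parent_order must not exceed order")
    # Each level of the hierarchy contributes two bits to the nested index.
    return pixel >> (2 * (order - parent_order))


def child_pixels(pixel, order, child_order):
    """Return the range of nested-scheme descendants of `pixel` at `child_order`."""
    if child_order < order:
        raise ValueError("child_order must not be below order")
    shift = 2 * (child_order - order)
    return range(pixel << shift, (pixel + 1) << shift)


def pixel_count(order):
    """Total number of HEALPix pixels on the sphere at a given order."""
    return 12 * 4 ** order


# One order-6 parent cell and its descendants at an illustrative order 12.
children = child_pixels(5, order=6, child_order=12)
```

Under these assumptions, a worker assigned a single order-6 parent cell would own a contiguous block of child indices, so no two workers write to the same child cell.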

End-to-End Workflow

Step 1: Build a Granule Catalog

Query NASA's CMR to build a mapping of spatial cells to granule S3 URLs.

# ICESat-2 convenience — cycle number computes dates automatically:
uv run python -m magg.catalog --cycle 22 --parent-order 6

# General — explicit date range and spatial polygon:
uv run python -m magg.catalog \
    --start-date 2024-01-06 --end-date 2024-04-07 \
    --short-name ATL06 \
    --polygon my_region.geojson \
    --parent-order 6

When --polygon is provided, the bounding box for the CMR query is computed automatically from the polygon's extent, and morton_coverage uses the polygon for cell discovery. When no polygon is given, Antarctic drainage basins are used as the default.
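
Deriving a bounding box from a polygon's extent could look like the sketch below. It is a minimal sketch, not magg's implementation: the function name `polygon_bbox` is hypothetical, and it assumes the input is a bare GeoJSON Polygon geometry rather than a Feature or FeatureCollection.

```python
import json


def polygon_bbox(geojson_text):
    """Return (min_lon, min_lat, max_lon, max_lat) for a GeoJSON Polygon geometry."""
    geometry = json.loads(geojson_text)
    if geometry.get("type") != "Polygon":
        raise ValueError("this sketch handles bare Polygon geometries only")
    # The first ring is the exterior; interior rings (holes) cannot extend the extent.
    exterior = geometry["coordinates"][0]
    lons = [point[0] for point in exterior]
    lats = [point[1] for point in exterior]
    return (min(lons), min(lats), max(lons), max(lats))


# Made-up polygon; GeoJSON positions are ordered (longitude, latitude).
example = json.dumps({
    "type": "Polygon",
    "coordinates": [[
        [-70.0, -75.0], [-60.0, -75.0], [-60.0, -70.0],
        [-70.0, -70.0], [-70.0, -75.0],
    ]],
})
bbox = polygon_bbox(example)
```

A simple min/max like this does not hold for polygons that cross the antimeridian or enclose a pole, so a production version would need extra handling for those cases.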

Output: catalog_ATL06_2024-01-06_2024-04-07_order6.json

See Catalog API for full options.

Step 2: Deploy the Lambda Function

Build and deploy the Lambda function and its dependency layer.

# Build the function package
bash deployment/aws/build_function.sh

# Build the dependency layer (ARM64)
bash deployment/aws/build_arm64_layer.sh

# Deploy
bash deployment/aws/deploy.sh

See Lambda Deployment and ARM64 Build Guide.

Step 3: Run Processing

Processing reads a pipeline config YAML (data source, aggregation, output store) and a granule catalog. Run locally or dispatch to Lambda.

# Local processing (write to local Zarr):
uv run python -m magg --config atl06.yaml --catalog catalog.json --store ./output.zarr

# Local processing (write to S3):
uv run python -m magg --config atl06.yaml --catalog catalog.json --store s3://bucket/output.zarr

# Lambda dispatch (requires deployed Lambda function):
uv run python deployment/aws/invoke_lambda.py \
    --config atl06.yaml --catalog catalog.json

# Test with a few cells:
uv run python -m magg --config atl06.yaml --catalog catalog.json --max-cells 5

# Dry run:
uv run python -m magg --config atl06.yaml --catalog catalog.json --dry-run

The store path and output grid parameters are defined in the YAML config (output.store, output.grid.child_order). The store path can be overridden with --store on the command line.
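
The override behaviour can be pictured as a small merge step. This is a sketch under assumptions: the config is shown as a Python dict that mirrors the two YAML keys named above, the values are made up, and `apply_store_override` is a hypothetical helper rather than part of magg.

```python
import copy


def apply_store_override(config, store=None):
    """Return a copy of `config` with output.store replaced when `store` is given."""
    merged = copy.deepcopy(config)
    if store is not None:
        merged.setdefault("output", {})["store"] = store
    return merged


# Dict mirroring the YAML keys output.store and output.grid.child_order.
config = {
    "output": {
        "store": "s3://bucket/output.zarr",
        "grid": {"child_order": 12},  # illustrative value only
    }
}
local = apply_store_override(config, store="./output.zarr")
```

Copying the config before changing it keeps the loaded YAML values intact, so the same config can be reused for several runs with different stores.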

Step 4: Visualize Results

The output Zarr is a public DGGS dataset. The included notebook rasterizes HEALPix cells to a polar stereographic grid for fast rendering with imshow.

uv run jupyter notebook notebooks/rasterized_zarr.ipynb

Adjust GRID_SPACING in the notebook to control output resolution (default 2 km).
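
To show how a grid spacing controls output resolution, the sketch below bins already-projected samples into a regular grid and averages the values in each bin. It is not the notebook's code: the `rasterize` function, its parameters, and the sample points are assumptions, and the projection to polar stereographic coordinates is taken as already done.

```python
import math


def rasterize(points, grid_spacing, x_min, y_min, n_cols, n_rows):
    """Average projected (x, y, value) samples into a regular grid.

    Bins without samples stay NaN. Coordinates and spacing share one unit.
    """
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y, value in points:
        col = math.floor((x - x_min) / grid_spacing)
        row = math.floor((y - y_min) / grid_spacing)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            sums[row][col] += value
            counts[row][col] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else math.nan
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


GRID_SPACING = 2000.0  # metres, matching the stated 2 km default
points = [(500.0, 500.0, 10.0), (1500.0, 900.0, 20.0), (2500.0, 100.0, 7.0)]
grid = rasterize(points, GRID_SPACING, x_min=0.0, y_min=0.0, n_cols=3, n_rows=1)
```

Halving the spacing quadruples the number of bins over the same extent, so smaller values give finer images at the cost of more memory and more empty bins.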

Project Structure

magg/
├── src/magg/              # Main package (cloud-agnostic)
│   ├── __main__.py        # Local processing runner (python -m magg)
│   ├── config.py          # YAML pipeline configuration
│   ├── processing.py      # Core aggregation pipeline
│   ├── catalog.py         # CMR query + catalog building
│   ├── schema.py          # Output schema + Zarr template
│   ├── store.py           # Store factory (local or S3)
│   ├── auth.py            # NASA Earthdata authentication
│   └── configs/           # Built-in pipeline configs (atl06.yaml)
├── deployment/            # Cloud-specific deployment
│   └── aws/               # Lambda handler, orchestrator, build scripts
├── notebooks/             # Visualization
├── docs/                  # Documentation
└── tests/                 # Test suite

Documentation

Architecture — design philosophy, end-to-end flow diagram, key decisions
Schema — aggregation dispatch, extending with new statistics
API Reference — catalog, processing, schema, auth modules
Lambda Deployment — AWS setup and production use
ARM64 Build Guide — building Lambda layers for ARM64

Development

# Install
uv sync --all-groups

# Run tests
uv run pytest

# Lint
uv run ruff check src/

Requires Python >= 3.12, uv, AWS credentials (for Lambda), and a NASA Earthdata account (for data access).

Performance

Metric	Value
Execution time	2–3 min average per cell
Memory	2 GB configured, 1–1.5 GB typical
Throughput	Tested with up to 1,700 concurrent workers
Cost	~$0.006/cell (~$2 per full Antarctica run on ARM64)
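
The figures in the table can be combined into a back-of-the-envelope estimate. The sketch below uses only the approximate numbers quoted above; the `estimate_run` helper is hypothetical, and the wall-clock figure ignores cold starts, retries, and slow cells, so treat it as a lower bound rather than a measurement.

```python
import math

COST_PER_CELL_USD = 0.006  # approximate per-cell cost from the table
MAX_CONCURRENCY = 1700     # highest tested worker count from the table


def estimate_run(n_cells, minutes_per_cell=3.0):
    """Return (cost in USD, number of worker waves, rough wall-clock minutes)."""
    cost = n_cells * COST_PER_CELL_USD
    waves = math.ceil(n_cells / MAX_CONCURRENCY)
    return cost, waves, waves * minutes_per_cell


# About $2 per run at about $0.006 per cell implies a few hundred cells per run.
implied_cells = round(2.0 / COST_PER_CELL_USD)
cost, waves, wall_minutes = estimate_run(implied_cells)
```

With a few hundred cells and a tested concurrency well above that, every cell fits in a single wave, so the estimated run time is close to the per-cell execution time.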

License

MIT — see LICENSE file.
