
County-Level Climate Analysis Dashboard


Overview

This repository contains an interactive web application for performing detailed climate trend analysis on county-level data for the United States. The initial implementation focuses on the Standardized Precipitation Index (SPI).

The project is architected as a fully self-contained, automated system. It integrates a data pipeline that automatically fetches, processes, and updates the application's data on a monthly schedule using GitHub Actions.

Live Application

The live, interactive dashboard is deployed on Streamlit Cloud and is available at the following URL: https://climatedasboard-m8jnmrjrd6ltxhgnbet8x3.streamlit.app/


Features

  • Interactive County Selection: Users can select any U.S. county by its FIPS code from a searchable dropdown menu to load that county's data.
  • Comprehensive Analysis Suite: The application provides a suite of standard time-series analyses for the selected county:
    • Trend Analysis: Visualization of long-term trends using a 12-month rolling average.
    • Anomaly Detection: Identification of statistical anomalies (defined as >2 standard deviations from the rolling mean).
    • Seasonal Decomposition: Decomposition of the time series into observed, trend, seasonal, and residual components.
    • Autocorrelation Analysis: Generation of ACF and PACF plots to inspect the data's correlation structure.
    • Forecasting: 24-month forecasting using an ARIMA model.
  • Fully Automated: The underlying dataset is automatically updated monthly via a GitHub Actions workflow.
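The anomaly-detection feature above (points more than 2 rolling standard deviations from the rolling mean) can be sketched as follows. This is a minimal illustration on synthetic data, not the application's actual code; the function and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

def flag_anomalies(series: pd.Series, window: int = 12, threshold: float = 2.0) -> pd.Series:
    """Flag points more than `threshold` rolling std devs from the rolling mean."""
    rolling_mean = series.rolling(window, min_periods=window).mean()
    rolling_std = series.rolling(window, min_periods=window).std()
    return (series - rolling_mean).abs() > threshold * rolling_std

# Synthetic monthly SPI-like series with one injected extreme value.
rng = np.random.default_rng(0)
idx = pd.date_range("2000-01-01", periods=120, freq="MS")
spi = pd.Series(rng.normal(0, 0.3, 120), index=idx)
spi.iloc[60] = 5.0  # artificial extreme wet anomaly

anomalies = flag_anomalies(spi)
print(spi.index[anomalies])  # dates flagged as anomalous
```

The same rolling mean used here also drives the 12-month trend visualization; the dashboard simply plots it alongside the raw series.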

System Architecture and Workflow

This project is designed as a single, self-sustaining repository ("monorepo") that handles both the data pipeline and the user-facing application. It utilizes Git LFS to manage large data files and GitHub Actions for full automation.

The workflow is as follows:

  1. Scheduled Trigger: A GitHub Actions workflow is scheduled to run on the first day of every month.
  2. Data Pipeline Execution: The workflow executes a series of scripts within a cloud-based runner:
    • download_script.py: Fetches the latest raw data from the source (CDC).
    • parse_precipitation_index.py: Cleans and processes the raw data into a standardized format.
    • The processed data is then converted into the efficient Parquet format (spi_data.parquet).
  3. Data Versioning and Update: The workflow commits the new Parquet data file back to the repository. Git LFS handles the storage of this large file.
  4. Continuous Deployment: Streamlit Cloud detects the new commit in the repository and automatically redeploys the application, making the fresh data immediately available to users.
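The schedule-and-commit pattern described above is typically expressed in `.github/workflows/update_data.yml`. The sketch below illustrates that pattern; it is not necessarily the repository's exact workflow file.

```yaml
name: Update SPI data
on:
  schedule:
    - cron: "0 0 1 * *"   # 00:00 UTC on the first day of every month
  workflow_dispatch:       # also allow manual runs

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          lfs: true
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: python download_script.py
      - run: python parse_precipitation_index.py
      - name: Commit updated data
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add spi_data.parquet
          git commit -m "Monthly data update" || echo "No changes to commit"
          git push
```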

Repository Structure

climate_dasboard/
├── .github/
│   └── workflows/
│       └── update_data.yml   # GitHub Actions workflow for the monthly data update
├── .gitattributes            # Configures which files are handled by Git LFS
├── app.py                    # The main Streamlit application code
├── download_script.py        # Pipeline script to download raw data
├── parse_precipitation_index.py  # Pipeline script to clean raw data
├── import_configs.json       # Configuration for the pipeline
├── index/                    # Contains the raw input data for the pipeline (managed by LFS)
├── spi_data.parquet          # The final, clean data file used by the app (managed by LFS)
└── requirements.txt          # A list of all Python libraries required
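Git LFS tracking is configured in `.gitattributes`. Entries like the following (illustrative, not necessarily the file's exact contents) would cover the Parquet output and the raw inputs:

```
spi_data.parquet filter=lfs diff=lfs merge=lfs -text
index/** filter=lfs diff=lfs merge=lfs -text
```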

Local Development

To run this application on a local machine, follow these steps.

Prerequisites

  • Git
  • Git LFS (e.g., sudo apt-get install git-lfs on Debian/Ubuntu)
  • Python 3.8+ and pip

Installation

  1. Clone the repository:

    git clone https://github.com/vishalworkdatacommon/climate_dasboard.git
    cd climate_dasboard
  2. Pull LFS data: Download the large data files tracked by Git LFS.

    git lfs pull
  3. Set up a virtual environment (recommended):

    python3 -m venv venv
    source venv/bin/activate
  4. Install dependencies:

    pip install -r requirements.txt

Running the Application

Launch the Streamlit application with the following command:

streamlit run app.py

The application will open in your default web browser.
