usm-aiir/cloud-detection


Cloud Detection for Optical Satellite Imagery

Comparative study of three deep learning segmentation architectures for cloud detection in Landsat-8 imagery.

The models were implemented and evaluated for an independent study at the University of Southern Maine, supervised by Dr. Behrooz Mansouri at the AIIR Lab. All three models are trained and evaluated on the 38-Cloud dataset.


Models

| Model | Architecture | Backbone | Pretrained |
|---|---|---|---|
| Cloud-Net | U-Net variant with multi-scale skip connections | Custom CNN | No |
| DeepLabV3+ | ASPP encoder-decoder | ResNet-50 | ImageNet |
| SegFormer | Hierarchical Vision Transformer | MIT-B2 | No |

Results Summary

| Model | Val IoU | Test IoU | Test F1 | Train Time |
|---|---|---|---|---|
| Cloud-Net | 0.939 | 0.608 | 0.714 | 3.5 hrs |
| DeepLabV3+ | 0.941 | 0.606 | 0.716 | 16.8 hrs |
| SegFormer | 0.943 | see note | see note | 26.2 hrs |

Note: SegFormer achieved the highest validation IoU but its test evaluation is affected by a normalisation pipeline mismatch. The checkpoint is intact and the model can be re-evaluated once the fix described below is applied. See the paper for full diagnosis.
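For readers unfamiliar with the metrics in the table, IoU and F1 for a binary cloud mask can both be derived from the same confusion counts. The following is a minimal NumPy sketch, not the evaluation code from the training scripts: the function name `binary_iou_f1` and the `eps` smoothing term are assumptions made for illustration.

```python
import numpy as np


def binary_iou_f1(pred, target, eps=1e-7):
    """Return (IoU, F1) for two binary masks of the same shape.

    Hypothetical helper for illustration; the repository's own metric
    code may differ (for example in thresholding or smoothing).
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)

    tp = np.logical_and(pred, target).sum()    # cloud predicted, cloud present
    fp = np.logical_and(pred, ~target).sum()   # cloud predicted, clear sky
    fn = np.logical_and(~pred, target).sum()   # cloud present, not predicted

    iou = tp / (tp + fp + fn + eps)
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return float(iou), float(f1)


# Tiny synthetic example: one true positive, one false positive, one false negative.
pred = np.array([[1, 1, 0, 0]])
target = np.array([[1, 0, 1, 0]])
iou, f1 = binary_iou_f1(pred, target)
print(f"IoU={iou:.3f}  F1={f1:.3f}")
```

For a single pair of masks the two metrics are related by F1 = 2·IoU / (1 + IoU). Values averaged over many patches need not satisfy that identity.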


Dataset

Download the 38-Cloud dataset from Kaggle: https://www.kaggle.com/datasets/sorour/38cloud-cloud-segmentation-in-satellite-images

Expected directory structure after downloading:

data/
├── Training/
│   ├── train_red/
│   ├── train_green/
│   ├── train_blue/
│   ├── train_nir/
│   ├── train_gt/
│   └── training_patches_38-Cloud.csv
└── Test/
    ├── test_red/
    ├── test_green/
    ├── test_blue/
    ├── test_nir/
    ├── Entire_scene_gts/
    └── test_patches_38-Cloud.csv
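
Before training, it can help to confirm that the download matches the layout above. The following is a small sketch using only the standard library. The helper name `missing_paths` and the default root `data` are assumptions for illustration, and the training scripts may perform their own checks.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED = [
    "Training/train_red",
    "Training/train_green",
    "Training/train_blue",
    "Training/train_nir",
    "Training/train_gt",
    "Training/training_patches_38-Cloud.csv",
    "Test/test_red",
    "Test/test_green",
    "Test/test_blue",
    "Test/test_nir",
    "Test/Entire_scene_gts",
    "Test/test_patches_38-Cloud.csv",
]


def missing_paths(data_root):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


missing = missing_paths("data")  # assumed location of the dataset
if missing:
    print("Missing entries:", *missing, sep="\n  ")
else:
    print("Dataset layout looks complete.")
```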

Installation

pip install torch torchvision tifffile pillow numpy pandas scikit-learn tqdm segmentation-models-pytorch

segmentation-models-pytorch is required for DeepLabV3+ only.


File Structure

.
├── train_cloudnet.py            # Cloud-Net training + test evaluation
├── train_deeplabv3plus.py       # DeepLabV3+ training + test evaluation
├── train_transformer_fixed.py   # SegFormer training + test evaluation
├── visualize_cloudnet.py        # Visualisation for Cloud-Net outputs
├── visualize_deeplabv3plus.py   # Visualisation for DeepLabV3+ outputs
├── visualize_transformer.py     # Visualisation for SegFormer outputs
├── cloud_net_model.py           # Original Keras Cloud-Net architecture (reference)
├── generators.py                # Original Keras data generators (reference)
├── losses.py                    # Original Keras Jaccard loss (reference)
├── augmentation.py              # Original Keras augmentation functions (reference)
├── main_train.py                # Original Keras training entry point (reference)
└── main_test.py                 # Original Keras test entry point (reference)

The files cloud_net_model.py, generators.py, losses.py, augmentation.py, main_train.py, and main_test.py are the original Keras/TensorFlow reference implementation. The three train_*.py scripts are the full PyTorch reimplementations used for this study.


Training

Each script is self-contained: set your data directory in the Config class at the top of the script, then run it.

Cloud-Net:

python train_cloudnet.py

DeepLabV3+:

python train_deeplabv3plus.py

SegFormer:

python train_transformer_fixed.py

All three scripts will:

  1. Compute per-channel mean and standard deviation from the training set
  2. Train the model with early stopping
  3. Save the best checkpoint (including mean/std) to the model's output directory
  4. Automatically run test-set evaluation at the end
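
Step 1 above can be sketched as a single streaming pass that accumulates per-channel sums and sums of squares. This is an illustrative NumPy version rather than the code in the training scripts: the function name `channel_stats` and the channel-last (H, W, C) patch layout are assumptions.

```python
import numpy as np


def channel_stats(patches):
    """Per-channel mean and std over an iterable of (H, W, C) arrays.

    Accumulates running sums so the whole training set never has to be
    held in memory at once.
    """
    total = None
    total_sq = None
    count = 0
    for patch in patches:
        x = np.asarray(patch, dtype=np.float64)
        flat = x.reshape(-1, x.shape[-1])          # (pixels, channels)
        if total is None:
            total = np.zeros(flat.shape[1])
            total_sq = np.zeros(flat.shape[1])
        total += flat.sum(axis=0)
        total_sq += (flat ** 2).sum(axis=0)
        count += flat.shape[0]
    mean = total / count
    # Clamp at zero to guard against tiny negative values from rounding.
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    return mean, np.sqrt(var)


rng = np.random.default_rng(0)
patches = [rng.random((8, 8, 4)) for _ in range(5)]   # synthetic 4-band patches
mean, std = channel_stats(patches)
print("mean:", mean.round(3), "std:", std.round(3))
```

Saving the resulting statistics alongside the weights, as in step 3, is what allows the same normalisation to be applied again at inference time.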

Checkpoints are saved to:

  • outputs_cloudnet/checkpoints/best_model.pth
  • outputs_deeplabv3plus/checkpoints/best_model.pth
  • outputs_segformer/checkpoints/best_model.pth

Smoke Tests

All three scripts include a smoke test mode that runs the full pipeline on synthetic data in ~30 seconds without requiring the dataset. Run these first to verify your environment is set up correctly:

python train_cloudnet.py --smoketest
python train_deeplabv3plus.py --smoketest
python train_transformer_fixed.py --smoketest
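
As a rough idea of how such a mode can be wired up, the sketch below parses a `--smoketest` flag and substitutes a tiny synthetic sample for real data. Only the flag name comes from this README. The function names, the sample shape, and the use of plain Python lists are hypothetical, and the actual scripts may be organised differently.

```python
import argparse
import random


def parse_args(argv=None):
    """Parse the command line; --smoketest is the flag documented above."""
    parser = argparse.ArgumentParser(description="Training entry point (sketch)")
    parser.add_argument(
        "--smoketest",
        action="store_true",
        help="run on small synthetic data instead of the real dataset",
    )
    return parser.parse_args(argv)


def synthetic_sample(size=8, channels=4, seed=0):
    """Hypothetical stand-in for one (image, mask) training pair."""
    rng = random.Random(seed)
    image = [[[rng.random() for _ in range(size)] for _ in range(size)]
             for _ in range(channels)]
    mask = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    return image, mask


args = parse_args(["--smoketest"])
if args.smoketest:
    image, mask = synthetic_sample()
    print(f"smoke test: {len(image)} bands, {len(mask)}x{len(mask[0])} mask")
```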

Visualisation

After training and evaluation, generate patch visualisations and coverage maps:

python visualize_cloudnet.py
python visualize_deeplabv3plus.py
python visualize_transformer.py

Outputs are saved to outputs_<model>/visualisations/ and include:

  • best_patches.png — top 12 patches by IoU
  • worst_patches.png — bottom 12 patches by IoU
  • random_patches.png — 12 randomly sampled patches
  • cloud_coverage_map.png — predicted cloud fraction per patch across the test set
  • training_history.png — loss, IoU, and F1 curves
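
The best, worst, and random groups can be thought of as three selections over a per-patch IoU table. The sketch below illustrates this with the standard library. The function name `select_patches`, the dictionary input, and the fixed seed are assumptions, and the visualisation scripts may select patches differently.

```python
import random


def select_patches(iou_by_patch, k=12, seed=0):
    """Split patch names into best, worst, and random groups of size k."""
    ranked = sorted(iou_by_patch, key=iou_by_patch.get, reverse=True)
    best = ranked[:k]                      # highest IoU first
    worst = ranked[-k:][::-1]              # lowest IoU first
    rng = random.Random(seed)              # fixed seed for reproducible figures
    sampled = rng.sample(ranked, min(k, len(ranked)))
    return best, worst, sampled


scores = {f"patch_{i}": i / 100 for i in range(50)}   # synthetic IoU values
best, worst, sampled = select_patches(scores)
print("best:", best[:3], "worst:", worst[:3])
```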

Known Issue: SegFormer Test Evaluation

SegFormer's test-set results (IoU 0.025) do not reflect the model's true capability. The model trained correctly to a validation IoU of 0.943 — the highest of all three models — but the test evaluation was affected by a normalisation mismatch between training and inference.
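
To illustrate why a mismatch of this kind can collapse test performance even when the weights are intact, the sketch below normalises the same synthetic image with matching and non-matching statistics. All numbers are made up for illustration and are not the statistics stored in the checkpoint. The function name `normalise` and the channel-first (C, H, W) layout are assumptions.

```python
import numpy as np


def normalise(image, mean, std):
    """Standardise a (C, H, W) image with per-channel statistics."""
    mean = np.asarray(mean, dtype=np.float64).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float64).reshape(-1, 1, 1)
    return (np.asarray(image, dtype=np.float64) - mean) / std


rng = np.random.default_rng(0)

# Illustrative per-band statistics, standing in for values computed at training time.
train_mean = np.array([0.30, 0.28, 0.27, 0.35])
train_std = np.array([0.10, 0.10, 0.10, 0.12])

# Synthetic 4-band image drawn from the "training" distribution.
image = rng.normal(
    train_mean.reshape(-1, 1, 1), train_std.reshape(-1, 1, 1), size=(4, 64, 64)
)

matched = normalise(image, train_mean, train_std)
# Statistics that do not correspond to the training data.
mismatched = normalise(image, [0.5] * 4, [0.5] * 4)

print("matched   :", matched.mean().round(3), matched.std().round(3))
print("mismatched:", mismatched.mean().round(3), mismatched.std().round(3))
```

With matching statistics the input is roughly zero-mean and unit-variance, which is what the network saw during training. With mismatched statistics the input is shifted and compressed, so the model is evaluated on a distribution it was never trained on.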

To fix and re-evaluate without retraining:

First, confirm the issue by checking whether the checkpoint contains the normalisation statistics:

import torch
ckpt = torch.load("outputs_segformer/checkpoints/best_model.pth", map_location="cpu")
print(list(ckpt.keys()))  # should include 'mean' and 'std'

If mean and std are present, create a file called run_test_eval.py:

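import sys
sys.path.insert(0, '.')  # make train_transformer_fixed importable from the repo root

import torch
from train_transformer_fixed import Config, SegFormer, evaluate_test

cfg  = Config()
# Note: PyTorch 2.6+ defaults torch.load to weights_only=True. If loading fails
# because the checkpoint stores non-tensor objects (e.g. NumPy arrays for
# mean/std), pass weights_only=False for this trusted local file.
ckpt = torch.load("outputs_segformer/checkpoints/best_model.pth", map_location=cfg.DEVICE)

# Reuse the normalisation statistics saved at training time
mean = ckpt["mean"]
std  = ckpt["std"]

model = SegFormer(cfg).to(cfg.DEVICE)
model.load_state_dict(ckpt["state_dict"])
model.eval()  # inference mode; harmless if evaluate_test already sets it

# Re-run test evaluation with the training-time statistics
test_metrics, test_df = evaluate_test(model, cfg, mean, std)
print(test_df)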

Then run:

python run_test_eval.py

If mean and std are not in the checkpoint, the model needs to be retrained. Running python train_transformer_fixed.py from scratch will produce a correct checkpoint and automatically run test evaluation at the end.


Configuration

Key hyperparameters are set in the Config class at the top of each training script. Notable defaults:

| Setting | Cloud-Net | DeepLabV3+ | SegFormer |
|---|---|---|---|
| Image size | 192 × 192 | 128 × 128 | 192 × 192 |
| Batch size | 6 | 4 | 8 |
| Learning rate | 1e-4 | Enc: 1e-4 / Dec: 1e-3 | 1e-4 |
| Max epochs | 50 | 50 | 50 |
| Early stop patience | 10 | 8 | 8 |
| Val split | 20% | 10% | 10% |
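
The "Early stop patience" setting can be read as the number of consecutive epochs without improvement that are tolerated before training halts. The sketch below shows one common way to implement this. The class name `EarlyStopping`, the `min_delta` parameter, and the choice to monitor validation IoU (higher is better) are assumptions, and the training scripts may monitor a different quantity.

```python
class EarlyStopping:
    """Stop when the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta        # hypothetical minimum improvement
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, score):
        """Record one epoch's validation score; return True to stop."""
        if score > self.best + self.min_delta:
            self.best = score
            self.bad_epochs = 0           # improvement: reset the counter
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=3)
val_iou = [0.80, 0.85, 0.84, 0.85, 0.83, 0.90]   # synthetic values
stopped_at = None
for epoch, score in enumerate(val_iou, start=1):
    if stopper.step(score):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at, "with best score", stopper.best)
```

In this synthetic run the later score of 0.90 is never reached, which shows the trade-off: a smaller patience saves training time but can stop before a late improvement.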

References

  • Mohajerani, S., & Saeedi, P. (2019). Cloud-Net: An end-to-end cloud detection algorithm for Landsat 8 imagery. IGARSS 2019.
  • Chen, L.-C., et al. (2018). Encoder-decoder with atrous separable convolution for semantic image segmentation. ECCV 2018.
  • Xie, E., et al. (2021). SegFormer: Simple and efficient design for semantic segmentation with transformers. NeurIPS 2021.
  • Zhu, Z., & Woodcock, C. E. (2012). Object-based cloud and cloud shadow detection in Landsat imagery. Remote Sensing of Environment, 118, 83–94.

Author

Kristina Zbinden — University of Southern Maine, Independent Study, 2025

Research Conducted Under Supervision of Lab Director: Dr. Behrooz Mansouri — University of Southern Maine, AIIR Lab
