Note
What's new in the GeoPlant ecosystem
- New downloader tool. Python and CLI access for the newly structured dataset, so you can download only the components you need. See dataset/README.md.
- Data refreshed and fixed. The update adds 30 m OpenStreetMap-derived Human Footprint rasters, corrected and re-extracted SoilGrids values, and upgraded Sentinel-2 TIFF patches with RGB+NIR bands.
- New evaluation protocols. GeoPlant now includes IID, OOD, and GLC25 presence-absence test sets, with leaderboards designed to measure spatial generalization and rare-species performance.
GeoPlant is a large-scale, multimodal dataset for spatial plant species prediction across Europe.
It integrates expert-verified species observations with rich environmental predictors and enables research, benchmarking, and applications in biodiversity, earth observation, and deep learning.
Figure 1. GeoPlant combines 5M Presence-Only and 90k Presence-Absence records with Sentinel-2 imagery, Landsat time series, CHELSA climate, and environmental rasters for 10k+ European plant species.
- Dataset Overview: Learn about the provided presence-absence and presence-only species data.
- Environmental Predictors: Explore different variables, e.g., satellite imagery, time series, climate, soil, land cover, and human footprint.
- Baselines & Benchmarking: See benchmark tasks, metrics, and baseline models.
- Resources & Download: Links to Kaggle, Seafile, Hugging Face, and the NeurIPS 2024 paper.
See the downloader guide in dataset/README.md.
| Resource | Description | Link |
|---|---|---|
| 📄 Dataset Paper | NeurIPS 2024 proceedings paper (Datasets & Benchmarks track) | Proceedings |
| 📄 Extended Version | arXiv preprint with supplementary details | arXiv:2408.13928 |
| 🚀 Starter Notebooks | Baseline models, pipelines, and scripts | GeoPlant Code on Kaggle |
| 📦 Full Dataset | Full data including PO and environmental rasters | GeoPlant Seafile |
| 🤗 Pretrained Models | Hugging Face collection of baselines | Hugging Face |
| Branch | What is inside |
|---|---|
| main | Stable version of the project. |
| dev | Refactoring and better accessibility. |
| docs | Sources for the website documentation. |
If you use GeoPlant, please cite the NeurIPS proceedings:
NeurIPS 2024 (Datasets & Benchmarks Track)
@inproceedings{picek2024geoplant_neurips,
title = {GeoPlant: Spatial Plant Species Prediction Dataset},
author = {Picek, Lukas and Botella, Christophe and Servajean, Maximilien and Leblanc, C{\'e}sar and Palard, R{\'e}mi and Larcher, Th{\'e}o and Deneu, Benjamin and Marcos, Diego and Bonnet, Pierre and Joly, Alexis},
booktitle = {NeurIPS 2024 Datasets and Benchmarks Track},
year = {2024}
}

- Issues & feature requests: GitHub Issues
- Kaggle discussion: GeoPlant on Kaggle

