shengchaochen82/FeDaL

FeDaL

Federated Dataset Learning for General Time Series Foundation Models

ICLR 2026 arXiv OpenReview

Dataset-level heterogeneity introduces significant domain biases that fundamentally degrade the generalization of general Time Series Foundation Models (TSFMs), yet this challenge remains underexplored. This paper rethinks the from-scratch training of TSFMs using the paradigm of federated learning. We propose a novel Federated Dataset Learning (FeDaL) approach to tackle heterogeneous time series by learning dataset-agnostic temporal representations. Specifically, the distributed architecture of federated learning is a natural way to decompose heterogeneous time series datasets into shared generalized knowledge and preserved personalized knowledge. Moreover, building on the TSFM architecture, FeDaL explicitly mitigates both local and global biases through two complementary mechanisms: Domain Bias Elimination (DBE) and Global Bias Elimination (GBE). FeDaL's cross-dataset generalization has been extensively evaluated on real-world datasets spanning eight tasks (including various regression and classification tasks) against 54 baselines. We further analyze federated scaling behavior, showing how data volume, client count, and join rate affect model performance under decentralization.

FeDaL architecture: DBE on clients, GBE on the server

News

  • 🛠️  2026-05  ·  We have refactored and polished the entire codebase, and released a set of tutorials.
  • 📦  2026-02  ·  We have released the full FeDaL training pipeline.
  • 🎉  2026-01  ·  Our paper FeDaL: Federated Dataset Learning for General Time Series Foundation Models has been accepted to ICLR 2026 — we sincerely thank the reviewers and the broader time-series community for their thoughtful feedback throughout the process.

Installation

git clone https://github.com/shengchaochen82/FeDaL.git
cd FeDaL
conda create -n fedal python=3.10 -y && conda activate fedal
pip install -r requirements.txt
pip install -e .

Requirements  ·  PyTorch 2.0+  ·  CUDA 11.8+  ·  Python 3.10+

Dataset Preparation

FeDaL pretraining is evaluated on three public corpora. All loaders consume the same on-disk format, so adding a new corpus only requires writing a converter and registering the resulting file.

Layout
datasets/crossdomain/
├── UTSD.npz          # one .npz per corpus
├── CTSD.npz
└── LOTSA/            # LOTSA is sharded by sub-dataset (one .npz per client)
    ├── beijing_pm25.npz
    ├── electricity.npz
    └── ...

Each .npz is a flat dictionary whose keys are series identifiers and whose values are float32 arrays of shape [T, n_features]. The framework drops series shorter than seq_len, splits the remaining series by channel, builds sliding windows, and normalises each channel with StandardScaler.
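The preprocessing steps described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the framework's actual loader: the function name `build_windows`, the `stride` parameter, and the per-channel z-score used in place of scikit-learn's `StandardScaler` are all assumptions made for the example. Normalisation is applied before windowing here, which is one possible ordering.

```python
import numpy as np

def build_windows(series, seq_len, stride=1):
    """Hypothetical helper: map {key: float32[T, C]} to windows of shape [N, seq_len]."""
    windows = []
    for key, arr in series.items():
        arr = np.asarray(arr, dtype=np.float32)
        if arr.ndim != 2 or arr.shape[0] < seq_len:
            continue  # drop malformed entries and series shorter than seq_len
        for c in range(arr.shape[1]):  # channel-split: one univariate series per channel
            chan = arr[:, c]
            std = chan.std()
            # Per-channel z-score (assumed stand-in for StandardScaler).
            chan = (chan - chan.mean()) / (std if std > 0 else 1.0)
            # Sliding windows along the time axis, subsampled by `stride`.
            view = np.lib.stride_tricks.sliding_window_view(chan, seq_len)[::stride]
            windows.append(view)
    if not windows:
        return np.empty((0, seq_len), dtype=np.float32)
    return np.concatenate(windows, axis=0).astype(np.float32)

# Toy input: one usable two-channel series and one series that is too short.
toy = {
    "a": np.arange(20, dtype=np.float32).reshape(10, 2),
    "b": np.ones((3, 1), dtype=np.float32),
}
toy_windows = build_windows(toy, seq_len=4)
```

With `seq_len=4`, series "a" contributes 7 windows per channel and series "b" is dropped, so `toy_windows` has 14 rows.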

UTSD  ·  ~1 B time points · 7 domains  ·  huggingface.co/datasets/thuml/UTSD
# scripts/data/convert_utsd.py
import numpy as np
from datasets import load_dataset

ds = load_dataset("thuml/UTSD", split="train")          # ~1 GB
series = {}
for i, row in enumerate(ds):
    arr = np.asarray(row["target"], dtype=np.float32).reshape(-1, 1)
    series[f"utsd_{i:07d}"] = arr
np.savez_compressed("datasets/crossdomain/UTSD.npz", **series)

UTSD is used with domain-mixed (DM) partitioning. Two heterogeneity levels (UTSD_unbalance_H1, UTSD_unbalance_H2) are pre-registered in FedRepresentationDatasetConfig; control the Dirichlet imbalance via the imbalance_factor field.
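To illustrate what a Dirichlet-controlled, domain-mixed split can look like, the sketch below assigns series keys to clients with shares drawn from a Dirichlet distribution. It rests on assumptions: `dirichlet_partition` and its arguments are hypothetical, and mapping `imbalance_factor` directly to the Dirichlet concentration may not match how FedRepresentationDatasetConfig interprets the field.

```python
import numpy as np

def dirichlet_partition(keys, n_clients, imbalance_factor, seed=0):
    """Hypothetical: split series keys across clients with Dirichlet-distributed shares.

    A smaller imbalance_factor yields more skewed client sizes (assumed interpretation).
    """
    rng = np.random.default_rng(seed)
    keys = list(keys)
    rng.shuffle(keys)  # mix domains before assignment
    shares = rng.dirichlet(np.full(n_clients, imbalance_factor))
    # Turn cumulative shares into cut points in the shuffled key list.
    cuts = (np.cumsum(shares)[:-1] * len(keys)).astype(int)
    return [list(part) for part in np.split(np.array(keys, dtype=object), cuts)]

toy_keys = [f"utsd_{i:07d}" for i in range(100)]
toy_parts = dirichlet_partition(toy_keys, n_clients=5, imbalance_factor=0.5, seed=1)
```

Every key lands in exactly one client; some clients may end up empty when the concentration is very small.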

CTSD  ·  ~500 M time points · 6 domains  ·  forecastingdata.org (Monash repository)
# scripts/data/convert_ctsd.py
import numpy as np
from sktime.datasets import load_tsf_to_dataframe  # reader for Monash .tsf files

CTSD_TSF_FILES = [          # cherry-picked from Monash to mirror the paper split
    "covid_deaths.tsf", "tourism_monthly.tsf", "traffic_hourly.tsf",
    "fred_md.tsf",      "kdd_cup_2018.tsf",   "weather_monash.tsf",
    # ...
]

series = {}
for fname in CTSD_TSF_FILES:
    # "default_tsf" keeps one row per series, with the values in "series_value"
    df, _ = load_tsf_to_dataframe(
        f"datasets/raw/monash/{fname}", return_type="default_tsf"
    )
    for i, row in df.iterrows():
        arr = np.asarray(row["series_value"], dtype=np.float32).reshape(-1, 1)
        series[f"{fname[:-4]}_{i:06d}"] = arr
np.savez_compressed("datasets/crossdomain/CTSD.npz", **series)

CTSD is used with domain-independent (DI) partitioning: each client is tied to a single Monash domain. Sequences within a client are non-overlapping, but the same domain may appear across clients.
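A domain-independent split of this kind might be sketched as below. The function `domain_independent_partition` and the `max_per_client` parameter are hypothetical, the domain is recovered from the key prefix written by the converter above, and non-overlap is modelled at the level of whole series, which is an assumption about the intended meaning.

```python
from collections import defaultdict

def domain_independent_partition(keys, max_per_client):
    """Hypothetical: one domain per client; large domains span several clients."""
    by_domain = defaultdict(list)
    for key in keys:
        domain = key.rsplit("_", 1)[0]  # "traffic_hourly_000012" -> "traffic_hourly"
        by_domain[domain].append(key)
    clients = []
    for domain, members in sorted(by_domain.items()):
        # Disjoint chunks: a series is assigned to exactly one client.
        for start in range(0, len(members), max_per_client):
            clients.append({"domain": domain,
                            "keys": members[start:start + max_per_client]})
    return clients

toy_keys = [f"traffic_hourly_{i:06d}" for i in range(5)] + \
           [f"fred_md_{i:06d}" for i in range(2)]
toy_clients = domain_independent_partition(toy_keys, max_per_client=2)
```

In the toy example, `fred_md` fits into one client while `traffic_hourly` is spread over three, so the same domain appears on several clients without any series being shared.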

LOTSA  ·  ~231 B time points · 174 sub-datasets  ·  huggingface.co/datasets/Salesforce/lotsa_data
# scripts/data/convert_lotsa.py
import os, numpy as np
from datasets import load_dataset

OUT = "datasets/crossdomain/LOTSA"
os.makedirs(OUT, exist_ok=True)

ds = load_dataset("Salesforce/lotsa_data", split="train", streaming=True)
buckets = {}
for row in ds:
    domain = row["source_dataset"]
    arr = np.asarray(row["target"], dtype=np.float32)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)
    buckets.setdefault(domain, []).append(arr)

for domain, series_list in buckets.items():
    payload = {f"{domain}_{i:06d}": s for i, s in enumerate(series_list)}
    np.savez_compressed(os.path.join(OUT, f"{domain}.npz"), **payload)

LOTSA naturally aligns with the federated setting: 174 sub-datasets → 174 clients. Datasets that overlap with downstream benchmarks (e.g. Traffic, ECL) are excluded at conversion time.
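The exclusion step is not shown in the conversion snippet above. One way it could be expressed is sketched below; the helper `keep_for_pretraining` is hypothetical, and the names in `EXCLUDED` are placeholders, since the actual exclusion list is not given here.

```python
def keep_for_pretraining(domain, excluded):
    """Hypothetical filter: True if the sub-dataset is not on the exclusion list."""
    return domain.strip().lower() not in excluded

# Placeholder names only; substitute the sub-dataset identifiers you need to hold out.
EXCLUDED = {"traffic", "ecl"}

toy_domains = ["beijing_pm25", "Traffic", "ECL", "electricity"]
toy_kept = [d for d in toy_domains if keep_for_pretraining(d, EXCLUDED)]
```

In a conversion loop like the one above, the check would sit right after `domain` is read, skipping rows whose sub-dataset is excluded before they are appended to a bucket.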

Adding your own corpus  ·  emit an .npz of {series_key: float32[T, C]} and register it in data_provider/fed_representation_data.py::FedRepresentationDatasetConfig.SINGLE_DATASET_CONFIGS. No other code changes are needed.
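Before registering a new corpus, it can help to check that the file matches the expected format. The sketch below is one possible check, not part of the repository: `validate_corpus` is a hypothetical helper that only verifies dtype and dimensionality.

```python
import os
import tempfile
import numpy as np

def validate_corpus(path):
    """Hypothetical check: every entry should be a float32 array of shape [T, C]."""
    problems = []
    with np.load(path) as data:
        for key in data.files:
            arr = data[key]
            if arr.dtype != np.float32:
                problems.append(f"{key}: dtype {arr.dtype}, expected float32")
            elif arr.ndim != 2:
                problems.append(f"{key}: shape {arr.shape}, expected [T, C]")
    return problems

# Toy file with one valid entry and one 1-D entry that should be flagged.
toy_path = os.path.join(tempfile.mkdtemp(), "toy.npz")
np.savez_compressed(
    toy_path,
    good=np.zeros((8, 2), dtype=np.float32),
    bad=np.zeros(8, dtype=np.float32),
)
problems = validate_corpus(toy_path)
```

An empty list means every entry passed both checks; here `problems` contains a single message for the `bad` entry.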

Quick Start

# Single-GPU pretraining
python main.py --algorithm FeDaL --config configs/EXP_BASIC.yaml

# Two-GPU DataParallel
python main.py --algorithm FeDaL --config configs/EXP_BASIC.yaml \
       --multi_gpus --device_id 0,1

# Multi-process launch with torchrun (set --nproc_per_node to the number of GPUs on the node)
torchrun --standalone --nproc_per_node=16 main.py \
         --algorithm FeDaL --config configs/EXP_BASIC.yaml

Important

FeDaL is highly sensitive to its hyperparameters and dataset configuration. Please make sure your data preparation and YAML config match the paper exactly before launching a run. Minor numerical drift across runs is also expected — it comes from random patch masking, client sampling, and the inherent stochasticity of federated optimisation, not from a bug.
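Client sampling is one of the sources of run-to-run variation named above. As a rough illustration (not the repository's sampler; `sample_clients` and its arguments are hypothetical), the sketch below selects participants for a round from a join rate and shows that fixing the seed makes the selection repeatable. Masking and optimisation noise would still remain.

```python
import random

def sample_clients(n_clients, join_rate, round_idx, seed=0):
    """Hypothetical: pick the clients that take part in one federated round."""
    k = max(1, round(n_clients * join_rate))  # at least one participant
    # Derive an independent random stream for each (seed, round) pair.
    rng = random.Random(seed * 1_000_003 + round_idx)
    return sorted(rng.sample(range(n_clients), k))

first = sample_clients(174, 0.1, round_idx=3, seed=42)
second = sample_clients(174, 0.1, round_idx=3, seed=42)
```

With the same seed and round index the two calls return the same client list; changing either value generally changes the selection.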

Note

Development Status: We are continuously improving the codebase. Some interfaces may change as we enhance the framework.

Citation

If you find FeDaL useful for your research, please cite:

@inproceedings{
chen2026fedal,
title={FeDaL: Federated Dataset Learning for General Time Series Foundation Models},
author={Shengchao Chen and Guodong Long and Michael Blumenstein and Jing Jiang},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=HK6t5x5gJq}
}

Acknowledgements

This project builds on several excellent open-source efforts:

  • TSLib  —  a comprehensive PyTorch toolbox covering forecasting, imputation, classification, and anomaly detection on standard time-series benchmarks.
  • PatchTST  —  a channel-independent Transformer for long-term forecasting built on patch tokenisation of univariate series.
  • MOMENT  —  an open family of time-series foundation models pretrained for general analytical tasks across domains.
  • Moirai  —  a universal forecasting Transformer accompanied by the large-scale LOTSA pretraining corpus.
  • Chronos  —  a language-modelling-style framework that tokenises numerical time series for probabilistic zero-shot forecasting.
  • Time-MoE  —  a billion-scale, sparse Mixture-of-Experts foundation model designed for time-series forecasting at scale.
  • FedML  —  an open research and production platform for federated learning across edge, cloud, and cross-silo settings.

License

This repository is released under the MIT License.

About

[ICLR'26] Official implementation for Federated Dataset Learning for General Time Series Foundation Models
