This repository is a PyTorch/GPU implementation of Fast Spatial Memory with Elastic Test-Time Training, together with an unofficial reimplementation of 4D-LRM.
Please note that this repository is not distributed under a single uniform license: the terms applicable to a given file or implementation path depend on that file's provenance. Comply with the license terms applicable to each file, directory, or implementation path, as described in LICENSE.md and in the corresponding license files and file headers.
We recommend using a virtual environment to manage dependencies. Create one with:
```shell
virtualenv --no-download "venv/fsm" --prompt "fsm"  # Or: python3.10 -m venv venv/fsm
source venv/fsm/bin/activate
```

Then, install the required dependencies:
```shell
pip install --upgrade pip
pip install -r envs/requirements.txt
```

Alternatively, use conda to create an environment:
```shell
conda env create -f envs/environment.yml
```

Pretrained weights are available on Hugging Face.
- Release the `res256` 4D-LVSM models
- Release the `res256` 4D-LRM models
- Release the `res128` base models
- Add detailed model cards
For example, to download the `res256` 4D-LVSM checkpoint:

```python
import os
import shutil

from huggingface_hub import hf_hub_download

repo_id = "marstin/fast-spatial-mem"
local_path = "static/weights"
path_in_repo = "lvsm_checkpoints/fsm_4dlvsm_patch8_res256.pth"

# Download (cached under ~/.cache/huggingface/hub)
cached_path = hf_hub_download(
    repo_id=repo_id,
    filename=path_in_repo,
    repo_type="model",
)

# Copy to your desired local folder
os.makedirs(local_path, exist_ok=True)
target_path = os.path.join(local_path, os.path.basename(path_in_repo))
shutil.copy(cached_path, target_path)
```
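If you need several checkpoints, the cache-then-copy step above generalizes to a small helper. This is a hypothetical convenience wrapper, not part of the repository; it only factors out the copy logic from the snippet above:

```python
import os
import shutil


def place_checkpoint(cached_path: str, local_dir: str) -> str:
    """Copy a cached checkpoint file into local_dir (created if needed).

    Hypothetical helper mirroring the download snippet; returns the
    destination path of the copied file.
    """
    os.makedirs(local_dir, exist_ok=True)
    target = os.path.join(local_dir, os.path.basename(cached_path))
    shutil.copy(cached_path, target)
    return target
```

Combined with `hf_hub_download`, this becomes `place_checkpoint(hf_hub_download(repo_id, filename), "static/weights")` for each checkpoint you need.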
The VGG-19 weights below are required by `fsm/model/losses/perceptual_loss.py`:
```shell
mkdir -p static/weights
wget -O static/weights/imagenet-vgg-verydeep-19.mat https://www.vlfeat.org/matconvnet/models/imagenet-vgg-verydeep-19.mat
cd static/weights
actual="$(md5sum imagenet-vgg-verydeep-19.mat)"
[[ "$actual" == "106118b7cf60435e6d8e04f6a6dc3657  imagenet-vgg-verydeep-19.mat" ]] && echo "OK" || { echo "Mismatch: $actual"; exit 1; }
```

The checksum step should print "OK" if the download is intact.

We provide `quickstart_training.ipynb` and `quickstart_inference.ipynb` for a quick start.
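If `md5sum` is unavailable (macOS ships `md5` instead), the checksum verification above can be done with Python's standard library. This is a sketch, not part of the repository; the expected digest is the one from the shell snippet:

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through MD5 in chunks so large weights stay out of RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


EXPECTED = "106118b7cf60435e6d8e04f6a6dc3657"
# After downloading, run:
# print("OK" if md5_of_file("static/weights/imagenet-vgg-verydeep-19.mat") == EXPECTED else "Mismatch")
```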
See `static/datasets/example` for the data structure.
- Provide data processing scripts
- Add detailed data cards
To pretrain FSM-LVSM from scratch, run:

```shell
bash scripts/launch_fsm_lvsm_pretrain.sh
```

To pretrain FSM-LRM from scratch, run:

```shell
bash scripts/launch_fsm_lrm_pretrain.sh
```

To upscale the resolution of FSM-LVSM from an existing checkpoint, run:

```shell
bash scripts/launch_fsm_lvsm_finetune.sh
```

To upscale the resolution of FSM-LRM from an existing checkpoint, run:

```shell
bash scripts/launch_fsm_lrm_finetune.sh
```

To evaluate the FSM-LVSM model on the Stereo4D test set, run:

```shell
bash scripts/launch_fsm_lvsm_eval.sh
```

To evaluate the FSM-LRM model on the Stereo4D test set, run:

```shell
bash scripts/launch_fsm_lrm_eval.sh
```

Ziqiao Ma*, Xueyang Yu*, Haoyu Zhen, Yuncong Yang, Joyce Chai, Chuang Gan
```bibtex
@article{ma2026fast,
  title={Fast Spatial Memory with Elastic Test-Time Training},
  author={Ma, Ziqiao and Yu, Xueyang and Zhen, Haoyu and Yang, Yuncong and Chai, Joyce and Gan, Chuang},
  journal={arXiv preprint arXiv:2604.07350},
  year={2026}
}
```

Ziqiao Ma, Xuweiyi Chen, Shoubin Yu, Sai Bi, Kai Zhang, Chen Ziwen, Sihan Xu, Jianing Yang, Zexiang Xu, Kalyan Sunkavalli, Mohit Bansal, Joyce Chai, Hao Tan
```bibtex
@inproceedings{ma20254dlrm,
  title={4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time},
  author={Ma, Ziqiao and Chen, Xuweiyi and Yu, Shoubin and Bi, Sai and Zhang, Kai and Ziwen, Chen and Xu, Sihan and Yang, Jianing and Xu, Zexiang and Sunkavalli, Kalyan and others},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025}
}
```