Echo-Memory: A Controlled Study of Memory in Action World Models
Paper-aligned epoch-0 fine-tunes for Echo-Memory.
Paper | Project Page | GitHub
Backbone: Wan-AI/Wan2.1-T2V-1.3B
Training: static in-domain pool · 1 epoch · 30,000 steps · 640×352 · 81-frame chunks
Layout: {row_id}/epoch-0.safetensors
| Family | Paper row | HF path | Steps |
|---|---|---|---|
| Raw context | Context K=1 | context_k1/epoch-0.safetensors | 30,000 |
| Raw context | Context K=20 | context_k20/epoch-0.safetensors | 30,000 |
| Spatial | Spatial Memory | spatial_mem/epoch-0.safetensors | 30,000 |
| State-space | Block-wise SSM | block_wise_ssm_two_chunk/epoch-0.safetensors | 30,000 |
| State-space | Legacy Hybrid (VideoSSM) | videossm_hybrid/epoch-0.safetensors | 30,000 |
| Spatial | concat text (ablation) | spatial_concat_text_two_chunk/epoch-0.safetensors | 30,000 |
| Spatial | inject none (ablation) | spatial_inject_none_two_chunk/epoch-0.safetensors | 30,000 |
| Spatial | cross-attn t32 (ablation) | spatial_cross_attn_readout_t32_g4_two_chunk/epoch-0.safetensors | 30,000 |
| State-space | SSM ctx1 / every4 / hint21 | ssm_ablation_ctx1_every4_hint21/epoch-0.safetensors | 30,000 |
| State-space | SSM ctx5 / every1 / hint21 | ssm_ablation_ctx5_every1_hint21/epoch-0.safetensors | 30,000 |
| State-space | SSM ctx5 / every4 / hint81 | ssm_ablation_ctx5_every4_hint81/epoch-0.safetensors | 30,000 |
Context K=5 and FramePack compression rows are not yet released as epoch-0 weights.
pip install -U "huggingface_hub[cli]"
# one row
huggingface-cli download Echo-Team/Echo-Memory context_k1/epoch-0.safetensors --local-dir ./ckpts
# all rows
huggingface-cli download Echo-Team/Echo-Memory --local-dir ./ckpts
Keep the row subdirectory in the local path (e.g. ./ckpts/spatial_mem/epoch-0.safetensors).
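To fetch only a few rows rather than the whole repository, the single-row command can be wrapped in a loop. The following is a minimal sketch, not part of the repository: the helper names `ckpt_relpath` and `download_rows` are hypothetical, and it assumes every row follows the `{row_id}/epoch-0.safetensors` layout described above.

```shell
# Hypothetical helpers (not shipped with Echo-Memory); names are illustrative only.
REPO_ID="Echo-Team/Echo-Memory"
LOCAL_DIR="./ckpts"

# Build the in-repo path for a row id, following {row_id}/epoch-0.safetensors.
ckpt_relpath() {
  printf '%s/epoch-0.safetensors\n' "$1"
}

# Download each requested row, one file per call, so the row
# subdirectory is preserved under $LOCAL_DIR. Stops at the first failure.
download_rows() {
  for row in "$@"; do
    huggingface-cli download "$REPO_ID" "$(ckpt_relpath "$row")" \
      --local-dir "$LOCAL_DIR" || return 1
  done
}
```

For example, `download_rows context_k1 spatial_mem` would be expected to leave the two files at `./ckpts/context_k1/epoch-0.safetensors` and `./ckpts/spatial_mem/epoch-0.safetensors`.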
Clone Echo-Memory, install the environment, then:
export WAN_BASE_MODEL=/path/to/Wan2.1-T2V-1.3B
export DATASET_BASE_PATH=data/Context-as-Memory-Dataset
export PYTHONPATH=$PWD:${PYTHONPATH:-}
export CKPT=./ckpts/spatial_mem/epoch-0.safetensors
# in-domain replay + revisit
bash eval/v2/run_static_consistency_loop_and_revisit.sh
bash eval/v2/run_basic_replay_gt.sh
# open-domain revisit (first frames in repo)
PHASE=stage1 OOD_DIR=assets/opendomain_revisit \
bash eval/v2/revisit_suite/run_one_click_revisit_eval.sh
Memory runtime flags are inferred from the checkpoint path via env/memory_baseline_runtime.py, so keep the HF folder names listed above in the local path.
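Because the folder name carries the configuration, it can help to check `CKPT` before launching an evaluation. The sketch below is an assumption-laden convenience, not the logic in `env/memory_baseline_runtime.py`: the helpers `row_id_from_ckpt` and `check_ckpt_path` are hypothetical and only verify that the path ends in one of the released row folders followed by `epoch-0.safetensors`.

```shell
# Hypothetical helper: extract the row folder name from a checkpoint path.
row_id_from_ckpt() {
  basename "$(dirname "$1")"
}

# Succeed only if the path looks like <released row>/epoch-0.safetensors.
# The row list mirrors the HF paths in the table; it is not read from the repo.
check_ckpt_path() {
  [ "$(basename "$1")" = "epoch-0.safetensors" ] || return 1
  case "$(row_id_from_ckpt "$1")" in
    context_k1|context_k20|spatial_mem|block_wise_ssm_two_chunk|videossm_hybrid|\
    spatial_concat_text_two_chunk|spatial_inject_none_two_chunk|\
    spatial_cross_attn_readout_t32_g4_two_chunk|\
    ssm_ablation_ctx1_every4_hint21|ssm_ablation_ctx5_every1_hint21|\
    ssm_ablation_ctx5_every4_hint81) return 0 ;;
    *) return 1 ;;
  esac
}
```

A guard such as `check_ckpt_path "$CKPT" || echo "unexpected checkpoint folder: $CKPT" >&2` can then be run before the eval scripts.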
If you use this repository or the Echo-Memory paper, please cite:
@article{king2026echomemory,
title={Echo-Memory: A Controlled Study of Memory in Action World Models},
author={King, Wayne and Xue, Zeyue and Bian, Yuxuan and Huang, Jie and Li, Haoran and Li, Yaowei and Su, Yaofeng and Li, Yuming and Wang, Haoyu and Zhang, Shiyi and Zhang, Songchun and Niu, Yuwei and Xu, Sihan and Zhuang, Junhao and Huang, Haoyang and Duan, Nan},
journal={arXiv preprint arXiv:2606.09803},
year={2026},
month={jun},
eprint={2606.09803},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2606.09803}
}