The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding
This repository contains the weights for Unified Autoencoding (UAE), introduced in the paper "The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding".
- GitHub Repository: https://github.com/WeichenFan/UAE
- Paper: arXiv:2512.19693
Introduction
Unified Autoencoding (UAE) is a model architecture that combines semantic structure and pixel detail through a frequency-band modulator, so that both coexist in one representation. It is based on the "Prism Hypothesis," which holds that different data modalities (semantic vs. pixel) can be viewed as projections of the natural world onto a shared feature spectrum. UAE unifies semantic abstraction and pixel-level fidelity within a single latent space, achieving state-of-the-art performance in both reconstruction and representation learning.
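To make the idea of a "shared feature spectrum" more concrete, the sketch below splits a 2D array into radial frequency bands with NumPy's FFT and shows that the bands sum back to the input. This is only an illustration of frequency-band decomposition in general; it is not UAE's frequency-band modulator, and the function name `split_frequency_bands`, the `num_bands` parameter, and the equal-width radial bands are assumptions made for this example.

```python
import numpy as np


def split_frequency_bands(image, num_bands=3):
    """Split a 2D array into radial frequency bands that sum back to the input.

    Hypothetical helper for illustration only; not taken from the UAE codebase.
    """
    h, w = image.shape
    # Per-axis frequencies in cycles per sample, in FFT ordering.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx**2 + fy**2)

    # Equal-width radial band edges from DC up to the highest frequency present.
    edges = np.linspace(0.0, radius.max(), num_bands + 1)
    spectrum = np.fft.fft2(image)

    bands = []
    for i in range(num_bands):
        lo, hi = edges[i], edges[i + 1]
        if i == num_bands - 1:
            # Close the last interval so the maximum radius is included.
            mask = (radius >= lo) & (radius <= hi)
        else:
            mask = (radius >= lo) & (radius < hi)
        # The masks are symmetric in frequency, so each band of a real input
        # is real up to floating-point error.
        bands.append(np.real(np.fft.ifft2(spectrum * mask)))
    return bands


rng = np.random.default_rng(0)
image = rng.random((16, 16))
bands = split_frequency_bands(image, num_bands=3)

# The bands partition the spectrum, so they reconstruct the input.
reconstruction = sum(bands)
print(np.allclose(reconstruction, image))
```

In this toy picture, the lowest band carries the coarse content (including the mean of the input) and the higher bands carry progressively finer detail. How UAE actually assigns semantic and pixel information to bands is described in the paper and the GitHub repository.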
Citation
@misc{fan2025uae,
title={The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding},
author={Weichen Fan and Haiwen Diao and Quan Wang and Dahua Lin and Ziwei Liu},
year={2025},
eprint={2512.19693},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.19693},
}