[CVPR 2026] PosterOmni
Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback
Overview
PosterOmni is a unified image-to-poster framework that bridges two regimes of poster creation, local editing and global creation, within a single training recipe:
- Poster Local Editing: Rescaling, Filling, Extending, Identity-driven
- Poster Global Creation: Layout-driven, Style-driven
- Unified Training: Task distillation + unified reward feedback.
This Hugging Face repository currently provides PosterOmni-v1 transformer weights (component-only). Other components (VAE / text encoder / tokenizer / scheduler / processor) should be loaded from a compatible base pipeline.
News
- [2026.02] Paper available on arXiv.
- [2026.02] PosterOmni-v1 transformer weights released on Hugging Face.
Quick Start
1) Installation
git clone https://github.com/Ephemeral182/PosterOmni.git
cd PosterOmni
conda create -n posteromni python=3.11 -y
conda activate posteromni
pip install -r requirements.txt
2) Load with QwenImageEditPlusPipeline (Transformer from this repo)
This repo provides PosterOmni-v1 transformer weights (Diffusers component-only).
Please load a compatible base pipeline (e.g., Qwen/Qwen-Image-Edit-Plus) and replace its transformer with our weights.
Component-only: this repo does NOT include `model_index.json` or the other pipeline components, so `...Pipeline.from_pretrained("MeiGen-AI/PosterOmni_v1")` will NOT work.
import torch
from PIL import Image
from diffusers import QwenImageEditPlusPipeline
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" else torch.float32
# 1) Load full base pipeline
base_model = "Qwen/Qwen-Image-Edit-Plus" # change to your compatible base
pipe = QwenImageEditPlusPipeline.from_pretrained(base_model, torch_dtype=dtype).to(device)
pipe.tokenizer_max_length = 1024 # optional
# 2) Plug PosterOmni transformer from this repo
posteromni_id = "MeiGen-AI/PosterOmni_v1"
pipe.transformer = pipe.transformer.__class__.from_pretrained(posteromni_id, torch_dtype=dtype).to(device)
# 3) Run inference
img = Image.open("your_input.jpg").convert("RGB")
# recommended: make width/height multiples of 16
w, h = img.size
w, h = (w // 16) * 16, (h // 16) * 16
prompt = "Rescale image to 1:1" # for rescaling, include "to W:H"
generator = torch.Generator(device=device).manual_seed(42)
out = pipe(
    image=[img],
    prompt=prompt,
    negative_prompt="",
    width=w,
    height=h,
    num_inference_steps=40,
    true_cfg_scale=4.0,
    guidance_scale=1.0,
    generator=generator,
).images[0]
out.save("posteromni_test.png")
print("Saved: posteromni_test.png")
Notes
- Rescaling prompts should include `to W:H`, e.g. `Rescale image to 16:9`.
- For full multi-task CLI examples (rescaling/filling/extending/layout/style/ID-driven), please refer to the GitHub repo.
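
The two conventions above (a `to W:H` phrase in the prompt, and width/height as multiples of 16) can be wrapped in small helpers. The following is a minimal sketch, not part of the PosterOmni codebase: the function names, the `long_side=1024` default, and the choice to reduce the ratio to lowest terms are all illustrative assumptions.

```python
from math import gcd


def rescale_prompt(width_ratio: int, height_ratio: int) -> str:
    """Build a rescaling prompt that contains the required 'to W:H' phrase."""
    if width_ratio <= 0 or height_ratio <= 0:
        raise ValueError("aspect ratio terms must be positive")
    # Assumption: reduce the ratio to lowest terms (e.g. 32:18 -> 16:9).
    g = gcd(width_ratio, height_ratio)
    return f"Rescale image to {width_ratio // g}:{height_ratio // g}"


def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round down to a multiple of `multiple`, but never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


def target_size(width_ratio: int, height_ratio: int,
                long_side: int = 1024, multiple: int = 16) -> tuple[int, int]:
    """Pick (width, height) near the requested ratio, both multiples of `multiple`.

    `long_side` is a hypothetical default, not a value documented by PosterOmni.
    """
    if width_ratio >= height_ratio:
        w = long_side
        h = round(long_side * height_ratio / width_ratio)
    else:
        h = long_side
        w = round(long_side * width_ratio / height_ratio)
    return snap_to_multiple(w, multiple), snap_to_multiple(h, multiple)
```

For example, `rescale_prompt(16, 9)` could be passed as `prompt`, and the pair returned by `target_size(16, 9)` as `width` and `height`, in the pipeline call shown in the Quick Start.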
Method (High-level)
PosterOmni is trained with a four-stage workflow:
- Task-specific SFT: train specialized experts for local editing and global creation tasks.
- Task Distillation: distill expert knowledge into a single multi-task model.
- Unified Reward Training: learn a universal reward for text fidelity, visual consistency, and aesthetics.
- Omni-Edit Reinforcement Learning: further align the model with unified reward feedback.
PosterOmni Dataset
We introduce a unified data suite with PosterOmni-200K (training) and PosterOmni-Bench (evaluation) for image-to-poster generation. PosterOmni-200K contains 200K+ paired samples covering six tasks: four local editing tasks (Rescaling, Filling, Extending, Identity-driven) and two global creation tasks (Layout-driven, Style-driven). It spans six poster themes: Products, Food, Events/Travel, Nature, Education, Entertainment. PosterOmni-Bench provides 540 Chinese and 480 English prompts, evenly distributed across the same six themes for consistent evaluation across tasks.
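
As a quick check on the benchmark arithmetic, the sketch below splits the stated prompt counts across the six themes. It assumes "evenly distributed" means an equal number of prompts per theme within each language; the variable and function names are illustrative only.

```python
# Theme names and prompt totals are taken from the dataset description above.
THEMES = ["Products", "Food", "Events/Travel", "Nature", "Education", "Entertainment"]
BENCH_PROMPTS = {"zh": 540, "en": 480}


def prompts_per_theme(total: int, themes: list[str] = THEMES) -> dict[str, int]:
    """Split `total` prompts evenly across themes; fail if it does not divide."""
    per_theme, remainder = divmod(total, len(themes))
    if remainder:
        raise ValueError(f"{total} prompts do not divide evenly over {len(themes)} themes")
    return {theme: per_theme for theme in themes}


# Per-language breakdown under the even-split assumption.
breakdown = {lang: prompts_per_theme(n) for lang, n in BENCH_PROMPTS.items()}
total_prompts = sum(BENCH_PROMPTS.values())
```

Under this assumption each theme has 90 Chinese and 80 English prompts, for 1,020 prompts in total.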
Performance Benchmarks
Supported Tasks
| Regime | Tasks |
|---|---|
| Poster Local Editing | Rescaling · Filling · Extending · Identity-driven |
| Poster Global Creation | Layout-driven · Style-driven |
Related Project
We also have another text-to-poster work that may interest you:
[ICLR 2026] PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework
Model Files
This repository provides:
- `config.json`
- `diffusion_pytorch_model-*.safetensors`
- `diffusion_pytorch_model.safetensors.index.json`
(i.e., Transformer2DModel weights in Diffusers format.)
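
Because the repository holds a single component rather than a full pipeline, it can help to check a file listing before choosing how to load it. The helper below is a minimal sketch: `is_component_only` is a hypothetical name, and the shard filenames in `example_files` are invented placeholders for the `-*` pattern above. A real listing could be fetched with `huggingface_hub.list_repo_files("MeiGen-AI/PosterOmni_v1")`.

```python
PIPELINE_INDEX = "model_index.json"  # present only in full Diffusers pipeline repos


def is_component_only(repo_files: list[str]) -> bool:
    """Return True if the files look like one Diffusers component, not a pipeline.

    Heuristic (an assumption, not an official Diffusers check): a component
    config and safetensors weights are present, and no model_index.json exists.
    """
    names = set(repo_files)
    has_config = "config.json" in names
    has_weights = any(name.endswith(".safetensors") for name in names)
    return has_config and has_weights and PIPELINE_INDEX not in names


# Hypothetical listing: the shard numbering is a placeholder, not the real file names.
example_files = [
    "config.json",
    "diffusion_pytorch_model-00001-of-00002.safetensors",
    "diffusion_pytorch_model-00002-of-00002.safetensors",
    "diffusion_pytorch_model.safetensors.index.json",
]
```

If the check returns `True`, load a compatible base pipeline first and swap in the transformer, as in the Quick Start, instead of calling `Pipeline.from_pretrained` on this repo directly.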
Contact
Sixiang Chen: schen691@connect.hkust-gz.edu.cn
Jianyu Lai: jlai218@connect.hkust-gz.edu.cn
Jialin Gao: gaojialin04@meituan.com
Hengyu Shi: qq1842084@gmail.com
Zhongying Liu: liuzhongying@meituan.com
Citation
If you find PosterOmni useful for your research, please cite:
@article{chen2026posteromni,
title={PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback},
author={Chen, Sixiang and Lai, Jianyu and Gao, Jialin and Shi, Hengyu and Liu, Zhongying and Ye, Tian and Luo, Junfeng and Wei, Xiaoming and Zhu, Lei},
journal={arXiv preprint arXiv:2602.12127},
year={2026}
}