Mac M3 GRPO Model
Pure PyTorch implementation of GRPO (Generative Reinforcement Learning with Preference Optimization) for Mac M3.
Model Details
- Model Type: GRPO (DreamerV3-inspired)
- Framework: PyTorch
- Vocabulary Size: 102
- Embedding Dimension: 64
- Latent Dimension: 32
- Compatible With: Mac M3, MPS acceleration
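As a rough illustration of the MPS compatibility noted above, the sketch below picks the MPS device when PyTorch reports it as available and falls back to the CPU otherwise. The `pick_device` helper and the standalone `nn.Embedding` layer are assumptions made for this example and are not part of the repository; only the sizes (vocabulary 102, embedding dimension 64) come from the model details.

```python
import torch

def pick_device() -> torch.device:
    """Hypothetical helper: prefer Apple's MPS backend, else use the CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()

# Stand-in embedding layer sized per the model details above
# (vocabulary size 102, embedding dimension 64). The real model's
# layers may be organised differently.
embedding = torch.nn.Embedding(102, 64).to(device)

# A batch containing one sequence of three token ids.
tokens = torch.tensor([[1, 5, 42]], device=device)
embedded = embedding(tokens)
print(embedded.shape)  # torch.Size([1, 3, 64])
```

The same `device` object can be passed to `.to(device)` on any module or tensor, so code written this way runs unchanged on machines without MPS.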
Usage
import torch
from examples.mac_m3_grpo import WorldModel, PolicyNetwork

# Initialize the world model with the dimensions listed under Model Details
world_model = WorldModel(
    vocab_size=102,
    embed_dim=64,
    latent_dim=32,
)

# Load the weights onto the CPU first; move the model to MPS afterwards if desired
world_model.load_state_dict(torch.load("world_model.pt", map_location="cpu"))

# Create the policy on top of the world model
policy = PolicyNetwork(world_model)

# Generate text
# [Your generation code here]
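The generation step is left as a placeholder above, and the call signature of `PolicyNetwork` is not documented here. The sketch below therefore uses a hypothetical stand-in, `ToyPolicy`, that maps token ids to next-token logits, and shows one common way to sample tokens autoregressively from such logits. The `ToyPolicy` class, the `sample` function, and their parameters are all assumptions for illustration; adapt the loop to whatever the real policy returns.

```python
import torch
import torch.nn as nn

VOCAB_SIZE, EMBED_DIM = 102, 64  # sizes taken from the model details

class ToyPolicy(nn.Module):
    """Hypothetical stand-in for the real policy: token ids -> next-token logits."""

    def __init__(self) -> None:
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.head = nn.Linear(EMBED_DIM, VOCAB_SIZE)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Mean-pool the token embeddings, then project to vocabulary logits.
        return self.head(self.embed(tokens).mean(dim=1))

@torch.no_grad()
def sample(policy: nn.Module, prompt: torch.Tensor,
           max_new_tokens: int = 8, temperature: float = 1.0) -> torch.Tensor:
    """Append max_new_tokens sampled token ids to the prompt, one at a time."""
    tokens = prompt.clone()
    for _ in range(max_new_tokens):
        logits = policy(tokens) / temperature
        probs = torch.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)  # shape (batch, 1)
        tokens = torch.cat([tokens, next_token], dim=1)
    return tokens

torch.manual_seed(0)
toy_policy = ToyPolicy().eval()
prompt = torch.tensor([[1, 2, 3]])
generated = sample(toy_policy, prompt)
print(generated.shape)  # torch.Size([1, 11])
```

Lower `temperature` values sharpen the distribution toward the highest-scoring token, while higher values make sampling more diverse. Mapping the resulting ids back to text depends on the tokenizer used during training, which is not described here.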
Training Details
This model was trained using reinforcement learning with preference optimization. The setup pairs a learned world model with a policy network, in the spirit of DreamerV3, adapted for text generation rather than control.