---
tags:
- reinforcement-learning
- world-models
- atari
- space-invaders
- deep-learning
library_name: pytorch
---

# World Models for Space Invaders

This is a World Models agent trained on the `SpaceInvadersNoFrameskip-v4` environment.

## Model Description

World Models is a model-based reinforcement learning approach that learns a compressed representation 
of the environment and trains a controller to maximize reward in the learned model.

The architecture consists of three components:
- **V (Vision)**: Variational Autoencoder that compresses 64x64 RGB frames to 32-dimensional latent vectors
- **M (Memory)**: MDN-RNN that predicts the next latent state given current state and action
- **C (Controller)**: Linear policy trained with CMA-ES evolution strategy

## Training Details

### Hyperparameters
- VAE Latent Dimension: 32
- RNN Hidden Dimension: 256
- Number of Gaussian Mixtures: 5
- Population Size (CMA-ES): 64
- Training Episodes: 100
- VAE Epochs: 10
- RNN Epochs: 20
- Controller Generations: 10

## Evaluation Results

- **Mean Reward**: 195.00 ± 0.00
- **Max Reward**: 195.00
- **Mean Episode Length**: 1000.00

## Usage

```python
import torch
import gymnasium as gym

# Load models
vae = VAE(latent_dim=32)
vae.load_state_dict(torch.load('vae_model.pt'))

rnn = MDNRNN(latent_dim=32, action_dim=6)
rnn.load_state_dict(torch.load('mdnrnn_model.pt'))

controller = Controller(latent_dim=32, hidden_dim=256)
controller.load_state_dict(torch.load('controller_model.pt'))

# Run agent
env = gym.make('SpaceInvadersNoFrameskip-v4')
# ... (see repository for full inference code)
```

## References

- Paper: [World Models (Ha & Schmidhuber, 2018)](https://worldmodels.github.io/)
- Code: Based on the original World Models implementation

## Citation

```bibtex
@article{ha2018worldmodels,
  title={World Models},
  author={Ha, David and Schmidhuber, J{\"u}rgen},
  journal={arXiv preprint arXiv:1803.10122},
  year={2018}
}
```