Libraries

How to use malteos/gpt2-xl-wechsel-german with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="malteos/gpt2-xl-wechsel-german")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("malteos/gpt2-xl-wechsel-german")
model = AutoModelForCausalLM.from_pretrained("malteos/gpt2-xl-wechsel-german")

Local Apps

vLLM

How to use malteos/gpt2-xl-wechsel-german with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "malteos/gpt2-xl-wechsel-german"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "malteos/gpt2-xl-wechsel-german",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
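
The same request can also be issued from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on `localhost:8000`. The helper names `build_completion_request` and `complete` are illustrative and not part of vLLM, and the response parsing assumes the usual OpenAI-style completions layout (`choices[0].text`).

```python
import json
import urllib.request

MODEL_ID = "malteos/gpt2-xl-wechsel-german"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=64, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice.

    Assumes a server is running and answers in the OpenAI completions format.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `complete("Es war einmal")` should return a German continuation. For a server on another port, such as the SGLang setup on port 30000, pass `base_url="http://localhost:30000"`.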

SGLang

How to use malteos/gpt2-xl-wechsel-german with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "malteos/gpt2-xl-wechsel-german" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "malteos/gpt2-xl-wechsel-german",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "malteos/gpt2-xl-wechsel-german" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "malteos/gpt2-xl-wechsel-german",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use malteos/gpt2-xl-wechsel-german with Docker Model Runner:
```
docker model run hf.co/malteos/gpt2-xl-wechsel-german
```

German GPT2-XL (1.5B)

- Trained with BigScience's DeepSpeed-Megatron-LM code base
- Word embeddings initialized with WECHSEL; all other weights taken from the English gpt2-xl
- ~3 days on 16 A100 GPUs (~80 TFLOPs per GPU)
- Training stopped after 100k steps
- 26.2B tokens seen
- Less than a single epoch on oscar_unshuffled_deduplicated_de (excluding the validation set; the original model was trained for 75 epochs on less data)
- Precision: bf16
- ZeRO stage 0
- Tensor/pipeline parallelism (tp/pp) = 1
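
The training figures above can be cross-checked with some back-of-the-envelope arithmetic. The sketch below assumes GPT-2's context length of 1,024 tokens and the common approximation of about 6 FLOPs per parameter per token for a forward and backward pass (about 8 if activations are recomputed). Neither the sequence length nor the use of recomputation is stated in this card, so treat the results as rough estimates.

```python
# Figures taken from the training summary above.
PARAMS = 1.5e9             # model size
TOKENS = 26.2e9            # tokens seen during training
STEPS = 100_000            # training steps
GPUS = 16
FLOPS_PER_GPU = 80e12      # ~80 TFLOPs sustained per GPU

# Assumptions (not stated in the card).
SEQ_LEN = 1024             # GPT-2 context length
FLOPS_PER_PARAM_TOKEN = 6  # forward + backward; ~8 with activation recomputation


def tokens_per_step(tokens=TOKENS, steps=STEPS):
    """Average number of tokens consumed per training step."""
    return tokens / steps


def training_days(flops_per_param_token=FLOPS_PER_PARAM_TOKEN):
    """Estimated wall-clock days from total FLOPs and cluster throughput."""
    total_flops = flops_per_param_token * PARAMS * TOKENS
    seconds = total_flops / (GPUS * FLOPS_PER_GPU)
    return seconds / 86_400


per_step = tokens_per_step()
print(f"tokens per step:    {per_step:,.0f}")
print(f"sequences per step: {per_step / SEQ_LEN:.0f}")
print(f"days at 6 FLOPs:    {training_days(6):.1f}")
print(f"days at 8 FLOPs:    {training_days(8):.1f}")
```

Under these assumptions the estimate lands between roughly two and three days, in line with the reported ~3 days, and the token count corresponds to about 256 sequences of 1,024 tokens per step.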

How to use

You can use this model directly with a pipeline for text generation. Because generation samples tokens at random, we set a seed for reproducibility:

>>> from transformers import pipeline, set_seed
>>> generator = pipeline('text-generation', model='malteos/gpt2-xl-wechsel-german')
>>> set_seed(42)
>>> generator("Hello, I'm a language model,", max_length=30, num_return_sequences=5)

[{'generated_text': "Hello, I'm a language model, a language for thinking, a language for expressing thoughts."},
 {'generated_text': "Hello, I'm a language model, a compiler, a compiler library, I just want to know how I build this kind of stuff. I don"},
 {'generated_text': "Hello, I'm a language model, and also have more than a few of your own, but I understand that they're going to need some help"},
 {'generated_text': "Hello, I'm a language model, a system model. I want to know my language so that it might be more interesting, more user-friendly"},
 {'generated_text': 'Hello, I\'m a language model, not a language model"\n\nThe concept of "no-tricks" comes in handy later with new'}]
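
To see why fixing the seed makes sampled generations repeatable, here is a toy sketch of temperature sampling over a hand-written next-token distribution. It uses Python's `random` module and hypothetical logits rather than the model. `set_seed` in Transformers plays the analogous role by seeding the random number generators that the pipeline draws from.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Draw one token from softmax(logits / temperature)."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


def sample_sequence(logits, length, seed, temperature=1.0):
    """Sample `length` tokens with a generator seeded by `seed`."""
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for _ in range(length)]


# Hypothetical logits for the next token, for illustration only.
toy_logits = {"Hallo": 2.0, "Guten": 1.5, "Tag": 1.0, "Welt": 0.5}

first = sample_sequence(toy_logits, length=8, seed=42)
second = sample_sequence(toy_logits, length=8, seed=42)
print(first == second)  # True: same seed, same draws
```

Lowering `temperature` concentrates the distribution on the highest-scoring token, which is why low-temperature outputs vary less between seeds.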

Here is how to use this model to get the features of a given text in PyTorch:

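from transformers import GPT2Tokenizer, GPT2Model

# Load the tokenizer and the base model (no language-modeling head)
tokenizer = GPT2Tokenizer.from_pretrained('malteos/gpt2-xl-wechsel-german')
model = GPT2Model.from_pretrained('malteos/gpt2-xl-wechsel-german')

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')  # return PyTorch tensors
output = model(**encoded_input)
# output.last_hidden_state has shape (batch, sequence_length, hidden_size)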

Evaluation

| Model (size) | Perplexity (PPL) |
|---|---|
| `gpt2-xl-wechsel-german` (1.5B) | 14.5 |
| `gpt2-wechsel-german-ds-meg` (117M) | 26.4 |
| `gpt2-wechsel-german` (117M) | 26.8 |
| `gpt2` (retrained from scratch) (117M) | 27.63 |
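
Perplexity (PPL) is the exponential of the average per-token negative log-likelihood, so lower is better. The sketch below shows that relationship on made-up loss values. It is not the evaluation script behind the table, and the function name `perplexity` is illustrative.

```python
import math


def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical per-token losses, for illustration only.
example_nlls = [2.9, 2.5, 2.7, 2.6]
print(round(perplexity(example_nlls), 2))

# A mean loss of ln(14.5) ≈ 2.674 nats corresponds to a perplexity of 14.5.
print(round(math.log(14.5), 3))
```

Because of the exponential, small differences in mean loss translate into larger differences in perplexity.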

License

MIT

Paper for malteos/gpt2-xl-wechsel-german

WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models

arXiv:2112.06598 • Published Dec 13, 2021