Instructions to use dphn/dolphin-2.9.1-yi-1.5-34b with libraries, inference providers, notebooks, and local apps.

Libraries

How to use dphn/dolphin-2.9.1-yi-1.5-34b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="dphn/dolphin-2.9.1-yi-1.5-34b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# The pipeline returns a list of results; print it so the reply is visible outside a notebook
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("dphn/dolphin-2.9.1-yi-1.5-34b")
model = AutoModelForCausalLM.from_pretrained("dphn/dolphin-2.9.1-yi-1.5-34b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens by slicing off the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
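
The slice in the final `print` is there because, for a decoder-only model, `generate` returns the prompt tokens followed by the new tokens. A toy sketch of that slicing with plain Python lists follows; the token ids and the helper name `strip_prompt_tokens` are made up for illustration and are not part of Transformers.

```python
def strip_prompt_tokens(sequence, prompt_length):
    """Return only the tokens that come after the prompt."""
    return sequence[prompt_length:]


# Hypothetical token ids: three prompt tokens followed by three generated tokens
prompt_ids = [101, 102, 103]
full_sequence = prompt_ids + [7, 8, 9]

new_tokens = strip_prompt_tokens(full_sequence, len(prompt_ids))
print(new_tokens)
```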

Local Apps

vLLM

How to use dphn/dolphin-2.9.1-yi-1.5-34b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "dphn/dolphin-2.9.1-yi-1.5-34b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dphn/dolphin-2.9.1-yi-1.5-34b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
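
For readers who would rather call the server from Python, here is a minimal sketch of the same request as the curl command, using only the standard library. The helper names `build_chat_request` and `send_chat_request` are hypothetical, and the sketch assumes the server is already running on `localhost:8000` as started above.

```python
import json
import urllib.request

MODEL_ID = "dphn/dolphin-2.9.1-yi-1.5-34b"


def build_chat_request(base_url, user_text, model=MODEL_ID):
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=120.0):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request("http://localhost:8000", "What is the capital of France?")

# With the server running, send it like this:
# print(json.dumps(send_chat_request(request), indent=2))
```

Building and sending are kept separate so the request can be inspected without a live server. Passing a different `base_url` points the same helper at any other OpenAI-compatible endpoint.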

Use Docker

docker model run hf.co/dphn/dolphin-2.9.1-yi-1.5-34b

SGLang

How to use dphn/dolphin-2.9.1-yi-1.5-34b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "dphn/dolphin-2.9.1-yi-1.5-34b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dphn/dolphin-2.9.1-yi-1.5-34b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "dphn/dolphin-2.9.1-yi-1.5-34b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dphn/dolphin-2.9.1-yi-1.5-34b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
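
The chat completion endpoints above reply with JSON in the OpenAI chat-completions shape, where the assistant text sits under `choices[0].message.content`. Here is a small sketch of pulling that text out. The helper name `extract_reply` is hypothetical, and the sample response is hand-written for illustration, not real model output.

```python
import json


def extract_reply(response):
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample in the OpenAI-compatible response shape (illustrative only)
sample_response = json.loads("""
{
  "id": "chatcmpl-example",
  "object": "chat.completion",
  "model": "dphn/dolphin-2.9.1-yi-1.5-34b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Paris."},
      "finish_reason": "stop"
    }
  ]
}
""")

reply = extract_reply(sample_response)
print(reply)
```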

Docker Model Runner

How to use dphn/dolphin-2.9.1-yi-1.5-34b with Docker Model Runner:

```
docker model run hf.co/dphn/dolphin-2.9.1-yi-1.5-34b
```

Wow - best dialog AND internal reasoning yet!

by bdambrosio - opened May 18, 2024

Discussion

bdambrosio

May 18, 2024

AGH - artificial general humanity. Drop agents into a scenario and see what they think and do -
cognitivecomputations/dolphin-2.9.1-yi-1.5-34b is the most realistic yet! (yes, including the closed models)

ehartford

Dolphin org May 18, 2024

can I have a link to this AGH? I wanna try it :)

bdambrosio

May 18, 2024

It's part of a much larger, very messy project called Owl: http://www.github.com/bdambrosio/Owl
I wouldn't recommend trying to install Owl.
But the worldsim is only a couple of files, and uses very little of the rest, only the LLM server interface.
I'll split it out as a separate project.

Or were you thinking it was a cloud app you could use? No, sorry.

ehartford

Dolphin org May 18, 2024

No, I was thinking of running it locally, against my tabbyapi or ollama service.

bdambrosio

May 18, 2024

Perfect.
Committing to https://github.com/bdambrosio/AllTheWorldAPlay.git
It will probably take a day to untangle it from Owl.
It uses a small script running stabilityai/sdxl-turbo locally to generate the images, which are updated every couple of cycles. I'll allow disabling that.
cheers.

Very much a work in progress and a spinoff of my Owl work. The issue is AGH (Humanity), which is much harder than idiot-savant AGI. :)
I'll post here when it's ready.

ehartford

Dolphin org May 18, 2024

sweet! I'm so excited to try it!
I might try to integrate SadTalker to get the avatars to lip sync

bdambrosio

May 19, 2024

Ok, seems to run (installed on another machine to test a clean install).
Doesn't have an installer yet.
Got it to work with tabby, but for some reason text quality was poor, so this uses a simple wrapper around exllamav2 with (almost) the same interface.
Lots to do; I actually built this in about 2 days. Now that this is up, bugs/functionality should improve pretty quickly.
SadTalker would be great! I can already do TTS, although I haven't integrated it yet. Character config for voice selection. :)
anyway:

https://github.com/bdambrosio/AllTheWorldAPlay.git

cheers
