Wow - best dialog AND internal reasoning yet!
can I have a link to this AGH? I wanna try it :)
It's part of a much larger, very messy project called Owl: http://www.github.com/bdambrosio/Owl
I wouldn't recommend trying to install Owl.
But the worldsim is only a couple of files, and uses very little of the rest, only the LLM server interface.
I'll split it out as a separate project.
Or were you thinking it was a cloud app you could use? No, sorry.
No I was thinking to run it locally.
Against my tabbyapi or ollama service.
perfect.
Committing https://github.com/bdambrosio/AllTheWorldAPlay.git
It'll probably take a day to untangle it from Owl.
It uses a small script running stabilityai/sdxl-turbo locally to generate the images, which are updated every couple of cycles. I'll allow disabling that.
cheers.
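For anyone wondering what "updated every couple of cycles" with an off switch could look like, here is a minimal sketch. It is not taken from the repo: `ImageRefreshPolicy`, `every_n_cycles`, `enabled`, and `run_cycles` are hypothetical names, the cadence of 2 is a guess at "a couple", and `generate_image` is just a stand-in for whatever calls the local sdxl-turbo script.

```python
from dataclasses import dataclass


@dataclass
class ImageRefreshPolicy:
    """Decides on which simulation cycles images get regenerated (hypothetical)."""
    enabled: bool = True      # assumed switch for disabling image generation
    every_n_cycles: int = 2   # assumed cadence; the post only says "every couple of cycles"

    def should_refresh(self, cycle: int) -> bool:
        # Never refresh when disabled; otherwise refresh on cycles 0, n, 2n, ...
        return self.enabled and cycle % self.every_n_cycles == 0


def run_cycles(policy: ImageRefreshPolicy, n_cycles: int, generate_image) -> list[int]:
    """Run n_cycles steps and return the cycles on which an image was generated."""
    refreshed = []
    for cycle in range(n_cycles):
        if policy.should_refresh(cycle):
            generate_image(cycle)  # stand-in for the call into the local image script
            refreshed.append(cycle)
    return refreshed
```

With the defaults, five cycles would trigger generation on cycles 0, 2, and 4; with `enabled=False` the stand-in is never called, so the image model would not need to be loaded at all.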
Very much a work in progress, a spinoff of my Owl work. The issue is AGH (Humanity), which is much harder than idiot-savant AGI. :)
I'll post here when it's ready.
sweet! I'm so excited to try it!
I might try to integrate SadTalker to get the avatars to lip sync
Ok, seems to run (installed on another machine to test clean install)
Doesn't have an installer yet
Got it to work with tabby, but for some reason text quality was poor, so this uses a simple wrapper around exllamav2 with the same interface (almost)
Lots to do, I actually built this in about 2 days. Now that this is up, bugs/functionality should improve pretty quickly.
SadTalker would be great! I can already do TTS, although I haven't integrated that yet. Character config for voice selection. :)
anyway:
https://github.com/bdambrosio/AllTheWorldAPlay.git
cheers
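A rough illustration of the "same interface" idea mentioned above, where the simulation talks to one small interface and the backend behind it (tabby, an exllamav2 wrapper, or anything else) can be swapped. This is only a sketch under assumptions: `LLMBackend`, `chat`, `EchoBackend`, and `ask` are hypothetical names and do not reflect the actual interface in the repo.

```python
from typing import Protocol


class LLMBackend(Protocol):
    """Hypothetical minimal interface the simulation would code against."""

    def chat(self, messages: list[dict], max_tokens: int = 256) -> str: ...


class EchoBackend:
    """Stand-in backend for trying things out without a model loaded."""

    def chat(self, messages: list[dict], max_tokens: int = 256) -> str:
        # Returns the last message's text, truncated to max_tokens characters.
        return messages[-1]["content"][:max_tokens]


def ask(backend: LLMBackend, prompt: str) -> str:
    """Send a single user prompt through whichever backend is plugged in."""
    return backend.chat([{"role": "user", "content": prompt}])
```

Any class with a matching `chat` method satisfies the protocol, so switching from one server to another would only mean passing a different object to `ask`.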
