Instructions for using Chat-Error/Mythalion-Kimiko-v2 with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Chat-Error/Mythalion-Kimiko-v2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Chat-Error/Mythalion-Kimiko-v2")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Chat-Error/Mythalion-Kimiko-v2")
model = AutoModelForCausalLM.from_pretrained("Chat-Error/Mythalion-Kimiko-v2")
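
The pipeline snippet above only constructs the pipeline. A minimal sketch of actually generating text with it might look like the following. The helper names (`build_kwargs`, `generate`) and the `max_new_tokens` value are illustrative assumptions, and the temperature simply mirrors the curl examples elsewhere on this page rather than any setting recommended by the model author.

```python
from typing import Any, Dict

MODEL_ID = "Chat-Error/Mythalion-Kimiko-v2"

# Illustrative sampling defaults (assumed, not taken from the model card).
GENERATION_KWARGS: Dict[str, Any] = {
    "max_new_tokens": 64,
    "do_sample": True,
    "temperature": 0.5,
}


def build_kwargs(**overrides: Any) -> Dict[str, Any]:
    """Merge caller overrides on top of the default generation settings."""
    return {**GENERATION_KWARGS, **overrides}


def generate(prompt: str, **overrides: Any) -> str:
    """Generate a continuation of `prompt` with the text-generation pipeline."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID)
    outputs = pipe(prompt, **build_kwargs(**overrides))
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]


# Example call (downloads the full model weights on first use):
# print(generate("Once upon a time,", max_new_tokens=128))
```

Building the pipeline inside `generate` keeps the sketch short. For repeated calls it would make more sense to create the pipeline once and reuse it.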

Local Apps

vLLM

How to use Chat-Error/Mythalion-Kimiko-v2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Chat-Error/Mythalion-Kimiko-v2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Chat-Error/Mythalion-Kimiko-v2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
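
The same request can be sent from Python instead of curl. The sketch below uses only the standard library and assumes the vLLM server from the command above is listening on `localhost:8000`. The function names are hypothetical, and the response parsing assumes the usual OpenAI-style completion shape (`choices[0].text`).

```python
import json
import urllib.request
from typing import Any, Dict

MODEL_ID = "Chat-Error/Mythalion-Kimiko-v2"


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    max_tokens: int = 512,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload: Dict[str, Any] = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs: Any) -> str:
    """Send the request and return the text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    # Assumed OpenAI-style response: the output text lives in choices[i].text.
    return body["choices"][0]["text"]


# Example call (requires the server to be running):
# print(complete("Once upon a time,", max_tokens=64))
```

Passing `base_url="http://localhost:30000"` should point the same helper at a server listening on port 30000.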

Use Docker

docker model run hf.co/Chat-Error/Mythalion-Kimiko-v2

SGLang

How to use Chat-Error/Mythalion-Kimiko-v2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Chat-Error/Mythalion-Kimiko-v2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Chat-Error/Mythalion-Kimiko-v2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Chat-Error/Mythalion-Kimiko-v2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Chat-Error/Mythalion-Kimiko-v2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
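
A freshly started container usually needs some time to load the model before it accepts requests, so a curl call issued immediately after `docker run` may fail. The sketch below polls the server until it responds. It assumes the server exposes the OpenAI-style `/v1/models` route on the same port; the helper names, timeout, and polling interval are illustrative.

```python
import time
import urllib.request


def models_url(base_url: str) -> str:
    """Return the model-listing URL for a server base URL."""
    return base_url.rstrip("/") + "/v1/models"


def wait_until_ready(
    base_url: str = "http://localhost:30000",
    timeout: float = 300.0,
    interval: float = 2.0,
) -> bool:
    """Poll the server until it answers with HTTP 200 or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Assumption: /v1/models is available once the model has loaded.
            with urllib.request.urlopen(models_url(base_url), timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Connection refused, timeout, or HTTP error: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example call:
# if wait_until_ready():
#     print("server is up")
```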

Docker Model Runner
How to use Chat-Error/Mythalion-Kimiko-v2 with Docker Model Runner:
```
docker model run hf.co/Chat-Error/Mythalion-Kimiko-v2
```

Now Available in GGUF, AWQ & GPTQ format!

by BoshiAI - opened Dec 14, 2023

Discussion

BoshiAI, Dec 14, 2023 (edited Dec 14, 2023)

Fellow Faraday Fan Amogus made a full set of GGUF quants for this marvellous model:
https://huggingface.co/WhoTookMyAmogusNickname/Mythalion-Kimiko-v2-GGUF

Then The(Marvellous)Bloke Tom made a full set of quants available in GGUF/AWQ/GPTQ and uploaded them too:
https://huggingface.co/TheBloke/Mythalion-Kimiko-v2-GGUF
https://huggingface.co/TheBloke/Mythalion-Kimiko-v2-GPTQ
https://huggingface.co/TheBloke/Mythalion-Kimiko-v2-AWQ

Thank you both for making this marvellous model more accessible to others!
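
For anyone who wants to fetch one of the GGUF files programmatically, here is a rough sketch using the `huggingface_hub` package. It lists the files in TheBloke's GGUF repository linked above and picks one by quantization tag. The default tag `Q4_K_M` is only an assumed example of a tag that may appear in the file names, so check the repository's file list for the quants actually offered. The helper names are hypothetical.

```python
from typing import List, Optional

GGUF_REPO = "TheBloke/Mythalion-Kimiko-v2-GGUF"


def pick_gguf(filenames: List[str], quant: str = "Q4_K_M") -> Optional[str]:
    """Return the first .gguf file whose name contains `quant`, else None."""
    tag = quant.lower()
    for name in sorted(filenames):
        lowered = name.lower()
        if lowered.endswith(".gguf") and tag in lowered:
            return name
    return None


def download_quant(quant: str = "Q4_K_M", repo_id: str = GGUF_REPO) -> str:
    """Download one quantized file and return its local path."""
    # Imported lazily; requires `pip install huggingface_hub`.
    from huggingface_hub import hf_hub_download, list_repo_files

    filename = pick_gguf(list_repo_files(repo_id), quant)
    if filename is None:
        raise FileNotFoundError(f"no .gguf file matching {quant!r} in {repo_id}")
    return hf_hub_download(repo_id=repo_id, filename=filename)


# Example call (downloads several gigabytes):
# print(download_quant("Q4_K_M"))
```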

BoshiAI changed discussion title from Now Available in GGUF, GPTQ and AWQ formats! to Now Available in GGUF formats! Dec 14, 2023

BoshiAI changed discussion title from Now Available in GGUF formats! to Now Available in GGUF format! Dec 14, 2023

BoshiAI changed discussion title from Now Available in GGUF format! to Now Available in GGUF, AWQ & GPTQ format! Dec 14, 2023
