Instructions to use DS-Archive/no-robots-y34b-lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use DS-Archive/no-robots-y34b-lora with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DS-Archive/no-robots-y34b-lora")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DS-Archive/no-robots-y34b-lora")
model = AutoModelForCausalLM.from_pretrained("DS-Archive/no-robots-y34b-lora")

Local Apps

vLLM

How to use DS-Archive/no-robots-y34b-lora with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DS-Archive/no-robots-y34b-lora"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DS-Archive/no-robots-y34b-lora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
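The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper name `build_completion_request` is hypothetical, and the base URL assumes the vLLM server started above is listening on its default port 8000.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for an OpenAI-compatible completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_completion_request(
    "http://localhost:8000",
    "DS-Archive/no-robots-y34b-lora",
    "Once upon a time,",
)

# To send the request once the server is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["text"])
```

Passing `http://localhost:30000` as the base URL would target a server listening on port 30000 instead.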

Use Docker

docker model run hf.co/DS-Archive/no-robots-y34b-lora

SGLang

How to use DS-Archive/no-robots-y34b-lora with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "DS-Archive/no-robots-y34b-lora" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DS-Archive/no-robots-y34b-lora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "DS-Archive/no-robots-y34b-lora" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DS-Archive/no-robots-y34b-lora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'


no-robots-y34b-lora

This model is a LoRA for Yi-34B-Llama trained on the HuggingFaceH4/no_robots dataset. It uses my converted version of the dataset in ShareGPT format, with a few minor corrections (https://huggingface.co/datasets/Doctor-Shotgun/no-robots-sharegpt).

The Yi-34B-Llama model is a modified 01-ai/Yi-34B with keys renamed to match those used in Llama models, eliminating the need for remote code and ensuring compatibility with existing training and inference repositories. Architecturally, this is similar to a Llama 2 34B model with an expanded vocabulary size of 64,000.
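Because the base model uses Llama-style keys, it should load with the stock Transformers classes and no remote code. The following is a minimal sketch of attaching the adapter with the PEFT library, assuming this repository holds LoRA adapter weights only. `BASE_MODEL_ID` is a placeholder (the text above does not name the base model repository), and `load_with_adapter` is a hypothetical helper name.

```python
BASE_MODEL_ID = "path/to/Yi-34B-Llama"  # placeholder: substitute the actual base model repository
ADAPTER_ID = "DS-Archive/no-robots-y34b-lora"


def load_with_adapter(base_model_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter on top of it."""
    # Imported lazily so that defining this helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="auto")
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model
```

Calling `load_with_adapter()` downloads the full 34B base model, so it needs correspondingly large disk and memory.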

Model description

No Robots is a high-quality dataset of 10,000 instructions and demonstrations created by skilled human annotators. This data can be used for supervised fine-tuning (SFT) to make language models follow instructions better. No Robots was modelled after the instruction dataset described in OpenAI's InstructGPT paper, and consists mostly of single-turn instructions across the following categories:

| Category | Count |
|---|---|
| Generation | 4560 |
| Open QA | 1240 |
| Brainstorm | 1120 |
| Chat | 850 |
| Rewrite | 660 |
| Summarize | 420 |
| Coding | 350 |
| Classify | 350 |
| Closed QA | 260 |
| Extract | 190 |
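As a quick consistency check, the category counts can be summed and converted to shares of the dataset. This is a small sketch that uses only the numbers listed above; the variable names are illustrative.

```python
# Category counts copied from the table above.
category_counts = {
    "Generation": 4560,
    "Open QA": 1240,
    "Brainstorm": 1120,
    "Chat": 850,
    "Rewrite": 660,
    "Summarize": 420,
    "Coding": 350,
    "Classify": 350,
    "Closed QA": 260,
    "Extract": 190,
}

total = sum(category_counts.values())  # 10000, matching the stated dataset size

# Print each category with its count and share of the whole dataset.
for name, count in category_counts.items():
    print(f"{name:<10} {count:>5} {count / total:6.1%}")
```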

This LoRA was trained using a modified multi-turn Alpaca prompt format:

### Instruction:
Below is a message that describes a task. Write a response that appropriately completes the request.

### Input:
{human prompt}

### Response:
{bot response}

Some chat examples have alternate system prompts that differ from the default provided above.
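The template can be assembled programmatically. The sketch below is one possible reading of the format, not the training code: it assumes that multi-turn conversations repeat the `### Input:` / `### Response:` pair after a single `### Instruction:` block, and that blocks are separated by one blank line. The names `build_prompt` and `DEFAULT_SYSTEM` are hypothetical.

```python
DEFAULT_SYSTEM = (
    "Below is a message that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_prompt(turns, system=DEFAULT_SYSTEM):
    """Format (human, bot) turns in the Alpaca-style layout shown above.

    Pass None as the bot reply of the final turn to leave the prompt open
    for the model to complete.
    """
    parts = [f"### Instruction:\n{system}"]
    for human, bot in turns:
        parts.append(f"### Input:\n{human}")
        # Assumption: each turn repeats the Input/Response pair.
        parts.append("### Response:\n" + (bot if bot is not None else ""))
    return "\n\n".join(parts)


prompt = build_prompt([("Name a primary color.", None)])
print(prompt)
```

An alternate system prompt, as used by some chat examples, can be supplied through the `system` argument.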

Intended uses & limitations

The intended use is to add instruction-following capabilities to the base model based on curated human examples. Outputs may exhibit biases present in the base model, and have not been filtered for explicit or harmful content or for hallucinations.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 10
num_epochs: 3
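Two of these values can be related with a little arithmetic. The sketch below shows how the total batch size follows from the per-device batch size and gradient accumulation, and what a linear-warmup, cosine-decay learning rate curve looks like with these settings. The 2 × 4 = 8 product would be consistent with a single device, though the hardware is not stated here. `TOTAL_STEPS` is a hypothetical value, since the number of optimizer steps is not reported, and the schedule is a generic implementation of the curve shape rather than the trainer's own code.

```python
import math

LEARNING_RATE = 8e-05
WARMUP_STEPS = 10
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
TOTAL_STEPS = 1000  # hypothetical: the actual step count is not given


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Samples contributing to each optimizer step."""
    return per_device * accumulation_steps * num_devices


def cosine_lr(step, total_steps=TOTAL_STEPS, peak_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Linear warmup to peak_lr, then cosine decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
for step in (0, 5, 10, 505, 1000):
    print(step, cosine_lr(step))
```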

Framework versions

Transformers 4.34.1
Pytorch 2.0.1+cu118
Datasets 2.14.6
Tokenizers 0.14.1

Citation data

@misc{no_robots,
  author = {Nazneen Rajani and Lewis Tunstall and Edward Beeching and Nathan Lambert and Alexander M. Rush and Thomas Wolf},
  title = {No Robots},
  year = {2023},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/datasets/HuggingFaceH4/no_robots}}
}


Paper for DS-Archive/no-robots-y34b-lora

Training language models to follow instructions with human feedback (arXiv:2203.02155, published Mar 4, 2022)