---
datasets:
- HuggingFaceH4/CodeAlpaca_20K
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- code
- LLaMa2
---

# LLaMaCoder

## Model Description

`LLaMaCoder` is based on the LLaMa2 7B language model, fine-tuned using LoRA adapters on the CodeAlpaca 20K dataset.
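
Because the fine-tuning used LoRA, you can alternatively attach the adapter to the base model yourself with the `peft` library. The following is a minimal sketch, not a confirmed workflow: the base checkpoint `meta-llama/Llama-2-7b-hf` is an assumption, and this repo may already host merged weights, in which case the snippet in the Usage section below is all you need.

```python
# Hypothetical alternative: attach the LoRA adapter with peft.
# Assumes this repo hosts adapter weights; if it holds merged
# weights, load it directly as shown in the Usage section.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # assumed base checkpoint
model = PeftModel.from_pretrained(base, "Sakuna/LLaMaCoderAll")  # adapter source (assumption)
model = model.merge_and_unload()  # optional: fold the adapter into the base weights
```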

## Usage

Generate code with LLaMaCoder loaded in 4-bit precision (NF4 quantization via bitsandbytes) using the following Python snippet:
```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer
import torch

MODEL_NAME = "Sakuna/LLaMaCoderAll"
device = "cuda:0"

# Load the weights in 4-bit NF4 precision via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Place the model on the GPU at load time: calling .to(device) on a
# 4-bit bitsandbytes model is not supported and raises an error.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map=device,
    trust_remote_code=True,
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

prompt = "Write a Java program to calculate the factorial of a given number k"
input_text = f"{prompt}\n### Solution:\n"  # avoid shadowing the built-in input()

inputs = tokenizer(input_text, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_length=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
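
The `### Solution:` suffix presumably mirrors the prompt format used during fine-tuning, so keep it when constructing inputs. Note that `temperature` only takes effect when `do_sample=True`; lower the temperature or disable sampling for more deterministic completions.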