Instructions to use dineth554/legion-coder-8m-10k with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use dineth554/legion-coder-8m-10k with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="dineth554/legion-coder-8m-10k")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("dineth554/legion-coder-8m-10k")
model = AutoModelForCausalLM.from_pretrained("dineth554/legion-coder-8m-10k")
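
The directly loaded model can then be used to generate a completion. The sketch below is illustrative rather than an official recipe: it assumes the checkpoint loads with the stock Auto classes as shown above, and it reuses the sampling values from the `inference.parameters` block of the model card. The helper name `complete` and the `do_sample=True` flag are assumptions added for this example.

```python
# Sampling settings copied from the `inference.parameters` block of the model card.
GENERATION_KWARGS = {
    "do_sample": True,  # assumption: enable sampling so temperature/top_p/top_k take effect
    "temperature": 0.8,
    "top_p": 0.95,
    "top_k": 50,
    "max_new_tokens": 200,
}


def complete(prompt: str, model_id: str = "dineth554/legion-coder-8m-10k") -> str:
    """Return the prompt plus a sampled continuation (hypothetical helper).

    The model is downloaded from the Hub on first use.
    """
    # Imported lazily so the settings above can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and sample a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)

    # Decode the first (and only) sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("def fibonacci(n):")` would return the prompt followed by up to 200 newly sampled tokens; because sampling is enabled, the output differs from run to run.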

Local Apps

vLLM

How to use dineth554/legion-coder-8m-10k with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "dineth554/legion-coder-8m-10k"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dineth554/legion-coder-8m-10k",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
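
The same request can be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style completions body with a `choices` list. Since the base URL is a parameter, the same code should work against any server exposing this endpoint.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for the /v1/completions endpoint (hypothetical helper)."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(base_url, model, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response shape, {"choices": [{"text": ...}, ...]}.
    return body["choices"][0]["text"]
```

With the server from the previous step running, `complete("http://localhost:8000", "dineth554/legion-coder-8m-10k", "def add(a, b):")` would return the generated continuation.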

Use Docker

docker model run hf.co/dineth554/legion-coder-8m-10k

SGLang

How to use dineth554/legion-coder-8m-10k with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "dineth554/legion-coder-8m-10k" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dineth554/legion-coder-8m-10k",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "dineth554/legion-coder-8m-10k" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "dineth554/legion-coder-8m-10k",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use dineth554/legion-coder-8m-10k with Docker Model Runner:
```
docker model run hf.co/dineth554/legion-coder-8m-10k
```

legion-coder-8m-10k / README.yaml

	# Model Card for Legion Coder 8M
	# YAML Front Matter for Hugging Face Hub

	base_model: dineth554/legion-coder-8m
	library_name: transformers
	license: mit
	pipeline_tag: text-generation
	language:
	- en
	- code
	tags:
	- transformers
	- pytorch
	- safetensors
	- text-generation
	- code-generation
	- python
	- javascript
	- coding
	- programming
	- sagemaker
	- amazon-sagemaker
	- cpu
	- compact
	- efficient
	- nvdya-kit
	- death-legion

	datasets:
	- the-stack-v2

	metrics:
	- perplexity
	- accuracy

	model-index:
	- name: Legion Coder 8M
	  results: []

	inference:
	  parameters:
	    temperature: 0.8
	    top_p: 0.95
	    top_k: 50
	    max_new_tokens: 200

	sagemaker:
	  sdk_version: "2.200.0"
	  instance_type: "ml.m5.large"
	  instance_count: 1
	  container_image: "huggingface-pytorch-inference:2.0.0-transformers4.28.1-cpu-py310-ubuntu20.04-v1.0"

	# Model Details
	model_details:
	  name: Legion Coder 8M
	  version: 1.0.0
	  description: A compact yet powerful 44M parameter transformer model optimized for coding tasks
	  developer: DEATH LEGION
	  powered_by: nvdya-kit
	  architecture: GPT-style Transformer
	  parameters: 44,341,632
	  model_size: 170MB
	  hidden_size: 576
	  num_layers: 13
	  num_heads: 16
	  context_length: 1024
	  vocabulary_size: 16000
	  format: Safetensors
	  precision: float32

	# Training Details
	training_details:
	  optimizer: AdamW
	  learning_rate: 5e-4
	  lr_schedule: cosine_decay
	  batch_size: 4
	  gradient_accumulation: true
	  training_steps: 10000
	  precision: float32

	# Intended Use
	intended_use:
	  primary_use_cases:
	  - Code completion and generation
	  - Function generation from descriptions
	  - Debugging assistance
	  - Code explanation and documentation
	  - Programming concept explanations
	  - Code scaffolding and prototyping
	  target_users:
	  - Software developers
	  - Students learning to code
	  - Data scientists
	  - DevOps engineers
	  - Technical writers

	# Limitations
	limitations:
	- Limited to 1,024 token context window
	- Trained primarily on Python code
	- May generate code that requires review before production use
	- Not suitable for non-coding tasks

	# Ethical Considerations
	ethical_considerations:
	- Generated code should be reviewed before deployment
	- May reproduce patterns from training data
	- Not a replacement for human code review
	- Users are responsible for compliance with licenses of generated code

	# Citation
	citation: |
	  @misc{legioncoder2026,
	    title={Legion Coder 8M: A Compact Transformer for Code Generation},
	    author={DEATH LEGION},
	    year={2026},
	    howpublished={\url{https://huggingface.co/dineth554/legion-coder-8m}}
	  }

	# Contact
	contact:
	  developer: DEATH LEGION
	  powered_by: nvdya-kit
	  repository: https://huggingface.co/dineth554/legion-coder-8m

	# Branding
	branding:
	  tagline: MADE WITH BY DEATH LEGION
	  powered_by: nvdya-kit
	  copyright: 2026 DEATH LEGION. All rights reserved.