Instructions for using dineth554/legion-coder-8m-10k with libraries, notebooks, and local apps:
- Libraries
- Transformers
How to use dineth554/legion-coder-8m-10k with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="dineth554/legion-coder-8m-10k")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("dineth554/legion-coder-8m-10k")
    model = AutoModelForCausalLM.from_pretrained("dineth554/legion-coder-8m-10k")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use dineth554/legion-coder-8m-10k with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "dineth554/legion-coder-8m-10k"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "dineth554/legion-coder-8m-10k",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker
docker model run hf.co/dineth554/legion-coder-8m-10k
- SGLang
How to use dineth554/legion-coder-8m-10k with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "dineth554/legion-coder-8m-10k" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "dineth554/legion-coder-8m-10k",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "dineth554/legion-coder-8m-10k" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "dineth554/legion-coder-8m-10k",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
- Docker Model Runner
How to use dineth554/legion-coder-8m-10k with Docker Model Runner:
docker model run hf.co/dineth554/legion-coder-8m-10k
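The Transformers snippet above loads the model but stops short of generating text. The sketch below is one way to run a completion with the sampling defaults that the card's metadata lists under `inference.parameters` (temperature 0.8, top_p 0.95, top_k 50, max_new_tokens 200). The helper names `card_sampling_kwargs` and `generate` are hypothetical, and `do_sample=True` is not in the card; it is added here because Transformers only applies temperature, top_p, and top_k when sampling is enabled.

```python
from typing import Any, Dict

MODEL_ID = "dineth554/legion-coder-8m-10k"


def card_sampling_kwargs() -> Dict[str, Any]:
    """Return the sampling defaults listed in the card's `inference.parameters`."""
    return {
        # Not from the card: enables sampling so the values below take effect.
        "do_sample": True,
        "temperature": 0.8,
        "top_p": 0.95,
        "top_k": 50,
        "max_new_tokens": 200,
    }


def generate(prompt: str) -> str:
    """Generate a completion for `prompt` (hypothetical helper, downloads the model)."""
    # Imported lazily so card_sampling_kwargs() works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID)
    outputs = pipe(prompt, **card_sampling_kwargs())
    # The text-generation pipeline returns a list of dicts with "generated_text".
    return outputs[0]["generated_text"]
```

Calling `generate("def fibonacci(n):")` downloads the weights on first use and returns the prompt followed by the sampled continuation. Since the card lists a 1,024-token context window, the prompt plus the 200 new tokens should stay within that limit.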
| # Model Card for Legion Coder 8M | |
| # YAML Front Matter for Hugging Face Hub | |
| base_model: dineth554/legion-coder-8m | |
| library_name: transformers | |
| license: mit | |
| pipeline_tag: text-generation | |
| language: | |
| - en | |
| - code | |
| tags: | |
| - transformers | |
| - pytorch | |
| - safetensors | |
| - text-generation | |
| - code-generation | |
| - python | |
| - javascript | |
| - coding | |
| - programming | |
| - sagemaker | |
| - amazon-sagemaker | |
| - cpu | |
| - compact | |
| - efficient | |
| - nvdya-kit | |
| - death-legion | |
| datasets: | |
| - the-stack-v2 | |
| metrics: | |
| - perplexity | |
| - accuracy | |
| model-index: | |
| - name: Legion Coder 8M | |
| results: [] | |
| inference: | |
| parameters: | |
| temperature: 0.8 | |
| top_p: 0.95 | |
| top_k: 50 | |
| max_new_tokens: 200 | |
| sagemaker: | |
| sdk_version: "2.200.0" | |
| instance_type: "ml.m5.large" | |
| instance_count: 1 | |
| container_image: "huggingface-pytorch-inference:2.0.0-transformers4.28.1-cpu-py310-ubuntu20.04-v1.0" | |
| # Model Details | |
| model_details: | |
| name: Legion Coder 8M | |
| version: 1.0.0 | |
| description: A compact 44M-parameter transformer model optimized for coding tasks | |
| developer: DEATH LEGION | |
| powered_by: nvdya-kit | |
| architecture: GPT-style Transformer | |
| parameters: 44,341,632 | |
| model_size: 170MB | |
| hidden_size: 576 | |
| num_layers: 13 | |
| num_heads: 16 | |
| context_length: 1024 | |
| vocabulary_size: 16000 | |
| format: Safetensors | |
| precision: float32 | |
| # Training Details | |
| training_details: | |
| optimizer: AdamW | |
| learning_rate: 5e-4 | |
| lr_schedule: cosine_decay | |
| batch_size: 4 | |
| gradient_accumulation: true | |
| training_steps: 10000 | |
| precision: float32 | |
| # Intended Use | |
| intended_use: | |
| primary_use_cases: | |
| - Code completion and generation | |
| - Function generation from descriptions | |
| - Debugging assistance | |
| - Code explanation and documentation | |
| - Programming concept explanations | |
| - Code scaffolding and prototyping | |
| target_users: | |
| - Software developers | |
| - Students learning to code | |
| - Data scientists | |
| - DevOps engineers | |
| - Technical writers | |
| # Limitations | |
| limitations: | |
| - Limited to a 1,024-token context window | |
| - Trained primarily on Python code | |
| - May generate code that requires review before production use | |
| - Not suitable for non-coding tasks | |
| # Ethical Considerations | |
| ethical_considerations: | |
| - Generated code should be reviewed before deployment | |
| - May reproduce patterns from training data | |
| - Not a replacement for human code review | |
| - Users are responsible for ensuring that generated code complies with applicable licenses | |
| # Citation | |
| citation: | | |
| @misc{legioncoder2026, | |
| title={Legion Coder 8M: A Compact Transformer for Code Generation}, | |
| author={DEATH LEGION}, | |
| year={2026}, | |
| howpublished={\url{https://huggingface.co/dineth554/legion-coder-8m}} | |
| } | |
| # Contact | |
| contact: | |
| developer: DEATH LEGION | |
| powered_by: nvdya-kit | |
| repository: https://huggingface.co/dineth554/legion-coder-8m | |
| # Branding | |
| branding: | |
| tagline: MADE BY DEATH LEGION | |
| powered_by: nvdya-kit | |
| copyright: 2026 DEATH LEGION. All rights reserved. | |
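The `model_details` figures can be cross-checked with a little arithmetic. The sketch below assumes the weights are stored as plain float32 (4 bytes per parameter, as the `precision` field suggests) with no other overhead. Under that assumption the listed 44,341,632 parameters come to about 177 MB in decimal units, or about 169 MiB, which is consistent with the stated `model_size: 170MB`. It also shows the per-head dimension implied by `hidden_size` and `num_heads`.

```python
# Values copied from the card's model_details block.
PARAMETERS = 44_341_632
HIDDEN_SIZE = 576
NUM_HEADS = 16

# Assumption: float32 storage, 4 bytes per parameter, no extra file overhead.
BYTES_PER_PARAM = 4


def weight_bytes(n_params: int, bytes_per_param: int) -> int:
    """Raw size of the weights in bytes."""
    return n_params * bytes_per_param


size_bytes = weight_bytes(PARAMETERS, BYTES_PER_PARAM)
size_mb = size_bytes / 10**6    # decimal megabytes
size_mib = size_bytes / 2**20   # binary mebibytes

# The hidden size must divide evenly across attention heads.
head_dim = HIDDEN_SIZE // NUM_HEADS

print(f"{size_bytes} bytes = {size_mb:.1f} MB = {size_mib:.1f} MiB")
print(f"head dimension: {head_dim}")
```

The same calculation gives a rough lower bound on the RAM needed to hold the weights on a CPU instance such as the `ml.m5.large` named in the SageMaker block. Activations and framework overhead come on top of that.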
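The `training_details` block names a `cosine_decay` schedule with a learning rate of 5e-4 over 10,000 steps but does not give the exact formula. A minimal sketch of one common form follows. It assumes no warmup phase and a final learning rate of zero, neither of which the card confirms. The function name `cosine_decay_lr` and the `min_lr` parameter are illustrative.

```python
import math

PEAK_LR = 5e-4        # learning_rate from the card
TOTAL_STEPS = 10_000  # training_steps from the card


def cosine_decay_lr(
    step: int,
    peak_lr: float = PEAK_LR,
    total_steps: int = TOTAL_STEPS,
    min_lr: float = 0.0,  # assumption: the card does not state a floor
) -> float:
    """Learning rate at `step` under half-cosine decay from peak_lr to min_lr."""
    # Clamp so that steps outside [0, total_steps] return the endpoint values.
    step = min(max(step, 0), total_steps)
    progress = step / total_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


for s in (0, 2_500, 5_000, 7_500, 10_000):
    print(s, cosine_decay_lr(s))
```

With these assumptions the rate starts at 5e-4, reaches half of that at step 5,000, and decays to zero at step 10,000. If training used warmup or a nonzero floor, the curve would differ at both ends.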