Text Generation
Transformers
Safetensors
deepseek_v3
conversational
custom_code
text-generation-inference
Instructions to use deepcogito/cogito-671b-v2.1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use deepcogito/cogito-671b-v2.1 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="deepcogito/cogito-671b-v2.1", trust_remote_code=True)
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("deepcogito/cogito-671b-v2.1", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained("deepcogito/cogito-671b-v2.1", trust_remote_code=True)
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- HuggingChat
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use deepcogito/cogito-671b-v2.1 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "deepcogito/cogito-671b-v2.1"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "deepcogito/cogito-671b-v2.1",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
docker model run hf.co/deepcogito/cogito-671b-v2.1
- SGLang
How to use deepcogito/cogito-671b-v2.1 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "deepcogito/cogito-671b-v2.1" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "deepcogito/cogito-671b-v2.1",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "deepcogito/cogito-671b-v2.1" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "deepcogito/cogito-671b-v2.1",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use deepcogito/cogito-671b-v2.1 with Docker Model Runner:
docker model run hf.co/deepcogito/cogito-671b-v2.1
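The curl calls shown for vLLM and SGLang can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes a server is already running at the vLLM address from the example above (`http://localhost:8000`; the SGLang example listens on port 30000 instead). The helper names `build_chat_request` and `ask` are hypothetical and not part of vLLM or SGLang.

```python
import json
import urllib.request

MODEL_ID = "deepcogito/cogito-671b-v2.1"
# Assumption: the vLLM server from the example above is running locally.
BASE_URL = "http://localhost:8000"


def build_chat_request(user_text, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_text, base_url=BASE_URL, timeout=600):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(user_text, base_url=base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]

# Usage (needs a running server):
#   print(ask("What is the capital of France?"))
```

Separating request construction from sending keeps the payload easy to inspect before anything goes over the network.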
{# ==================================================================== #}
{# Deepseek v3 template with enable_thinking and tools support #}
{# ==================================================================== #}
{%- if not enable_thinking is defined %}{% set enable_thinking = false %}{% endif -%}
{%- if not tools is defined %}{% set tools = none %}{% endif -%}
{%- if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif -%}
{# --------------------------- Defaults -------------------------------- #}
{%- set default_prompt = "You are Cogito, an AI assistant created by Deep Cogito, which is an AI research lab based in San Francisco." -%}
{# --------------------------- Collect system prompt -------------------- #}
{%- set ns = namespace(system_prompt='', is_last_user=false, outputs_open=false, first_output=true) -%}
{%- if messages and messages[0].role == 'system' -%}
{%- set raw = messages[0].content -%}
{%- set ns.system_prompt = raw if raw is string else raw[0].text -%}
{%- set messages = messages[1:] -%}
{%- endif -%}
{# --------------------------- Inject deep thinking --------------------- #}
{%- set user_prompt = ns.system_prompt -%}
{%- if enable_thinking -%}
{# Thinking enabled #}
{%- set ns.system_prompt = "Enable deep thinking subroutine.\n\n" ~ default_prompt -%}
{%- if user_prompt -%}
{%- set ns.system_prompt = ns.system_prompt ~ "\n\n" ~ user_prompt ~ "\n\n" -%}
{%- endif -%}
{%- else -%}
{# Thinking disabled #}
{%- set ns.system_prompt = default_prompt -%}
{%- if user_prompt -%}
{%- set ns.system_prompt = ns.system_prompt ~ "\n\n" ~ user_prompt -%}
{%- endif -%}
{%- endif -%}
{# --------------------------- Append tools block ----------------------- #}
{%- if tools is not none -%}
{%- if ns.system_prompt -%}
{%- set ns.system_prompt = ns.system_prompt ~ '
You have the following functions available:
' -%}
{%- else -%}
{%- set ns.system_prompt = 'You have the following functions available:
' -%}
{%- endif -%}
{%- for t in tools -%}
{%- set ns.system_prompt = ns.system_prompt ~ "```json
" ~ (t | tojson(indent=4)) ~ "
```
" -%}
{%- endfor -%}
{%- endif -%}
{{- bos_token -}}{{- ns.system_prompt -}}
{# --------------------------- Iterate conversation --------------------- #}
{%- for m in messages -%}
{# --------------------------- USER ---------------------------------- #}
{%- if m.role == 'user' -%}
{%- set ns.is_last_user = true -%}
{%- set txt = m.content if m.content is string else m.content | selectattr('type','equalto','text') | map(attribute='text') | join('') -%}
{{- "<|User|>" -}}{{- txt -}}{{- "<|Assistant|>" -}}
{%- endif -%}
{# --------------------------- ASSISTANT with TOOL CALLS -------------- #}
{%- if m.role == 'assistant' and m.tool_calls is defined and m.tool_calls -%}
{%- set ns.is_last_user = false -%}
{%- set lead = m.content is string and m.content|trim or (m.content and m.content | selectattr('type','equalto','text') | map(attribute='text') | join('')) or '' -%}
{{- lead -}}{{- "<|tool▁calls▁begin|>" -}}
{%- for call in m.tool_calls -%}
{{- "<|tool▁call▁begin|>" -}}{{- call.type -}}{{- "<|tool▁sep|>" -}}{{- call.function.name -}}
{{- "
```json
" -}}{{- call.function.arguments -}}{{- "
```" -}}{{- "<|tool▁call▁end|>" -}}
{%- if not loop.last -%}{{- "
" -}}{%- endif -%}
{%- endfor -%}
{{- "<|tool▁calls▁end|>" -}}{{- "<|end▁of▁sentence|>" -}}
{%- endif -%}
{# --------------------------- ASSISTANT plain ------------------------ #}
{%- if m.role == 'assistant' and (m.tool_calls is not defined or not m.tool_calls) -%}
{%- set ns.is_last_user = false -%}
{%- set txt = m.content if m.content is string else m.content | selectattr('type','equalto','text') | map(attribute='text') | join('') -%}
{{- txt -}}{{- "<|end▁of▁sentence|>" -}}
{%- endif -%}
{# --------------------------- TOOL output ---------------------------- #}
{%- if m.role == 'tool' -%}
{%- set ns.is_last_user = false -%}
{%- set out_txt = m.content if m.content is string else m.content | selectattr('type','equalto','text') | map(attribute='text') | join('') -%}
{%- if not ns.outputs_open -%}
{{- "<|tool▁outputs▁begin|>" -}}
{%- set ns.outputs_open = true -%}
{%- endif -%}
{{- "<|tool▁output▁begin|>" -}}{{- out_txt -}}{{- "<|tool▁output▁end|>" -}}
{%- if loop.nextitem is defined and loop.nextitem.role == 'tool' -%}
{{- "
" -}}
{%- endif -%}
{%- if loop.nextitem is undefined or loop.nextitem.role != 'tool' -%}
{{- "<|tool▁outputs▁end|>" -}}
{%- set ns.outputs_open = false -%}
{%- endif -%}
{%- endif -%}
{%- endfor -%}
{%- if ns.outputs_open -%}
{{- "<|tool▁outputs▁end|>" -}}
{%- endif -%}
{%- if add_generation_prompt and not ns.is_last_user -%}
{{- "<|Assistant|>" -}}
{%- endif -%}
{%- if add_generation_prompt and enable_thinking -%}
{{- '<think>\n' -}}
{%- endif -%}
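To make the rendering rules of the template above easier to follow, here is a rough pure-Python sketch that mirrors it for messages whose `content` is a plain string. It is an illustration only; the Jinja template remains the source of truth. Several things are assumptions: the function names (`build_system_prompt`, `render_tool_call`, `render_chat`) are hypothetical, `json.dumps(..., indent=4)` only approximates the template's `tojson(indent=4)` filter, `bos_token` is left as a parameter because its real value comes from the tokenizer, and the multi-line string literals are copied as they appear in this listing.

```python
import json

FENCE = "`" * 3  # three backticks, as used inside the template's string literals
DEFAULT_PROMPT = (
    "You are Cogito, an AI assistant created by Deep Cogito, "
    "which is an AI research lab based in San Francisco."
)


def build_system_prompt(user_prompt="", enable_thinking=False, tools=None):
    """Mirror the 'Inject deep thinking' and 'Append tools block' steps."""
    if enable_thinking:
        prompt = "Enable deep thinking subroutine.\n\n" + DEFAULT_PROMPT
        if user_prompt:
            prompt += "\n\n" + user_prompt + "\n\n"
    else:
        prompt = DEFAULT_PROMPT
        if user_prompt:
            prompt += "\n\n" + user_prompt
    if tools is not None:
        prompt += "\nYou have the following functions available:\n"
        for tool in tools:
            # Approximation of the template's tojson(indent=4) filter.
            prompt += FENCE + "json\n" + json.dumps(tool, indent=4) + "\n" + FENCE + "\n"
    return prompt


def render_tool_call(call):
    """Render one tool call; `arguments` is assumed to be a JSON string."""
    fn = call["function"]
    return ("<|tool▁call▁begin|>" + call["type"] + "<|tool▁sep|>" + fn["name"]
            + "\n" + FENCE + "json\n" + fn["arguments"] + "\n" + FENCE
            + "<|tool▁call▁end|>")


def render_chat(messages, add_generation_prompt=False, enable_thinking=False,
                tools=None, bos_token=""):
    """Render a conversation the way the template does (string content only)."""
    user_prompt = ""
    if messages and messages[0]["role"] == "system":
        user_prompt = messages[0]["content"]
        messages = messages[1:]
    out = bos_token + build_system_prompt(user_prompt, enable_thinking, tools)
    is_last_user = False
    for i, m in enumerate(messages):
        content = m.get("content") or ""
        if m["role"] == "user":
            is_last_user = True
            # The assistant marker is emitted right after every user turn.
            out += "<|User|>" + content + "<|Assistant|>"
        elif m["role"] == "assistant":
            is_last_user = False
            calls = m.get("tool_calls")
            if calls:
                out += content.strip() + "<|tool▁calls▁begin|>"
                out += "\n".join(render_tool_call(c) for c in calls)
                out += "<|tool▁calls▁end|>"
            else:
                out += content
            out += "<|end▁of▁sentence|>"
        elif m["role"] == "tool":
            is_last_user = False
            prev_is_tool = i > 0 and messages[i - 1]["role"] == "tool"
            next_is_tool = i + 1 < len(messages) and messages[i + 1]["role"] == "tool"
            if not prev_is_tool:
                out += "<|tool▁outputs▁begin|>"
            out += "<|tool▁output▁begin|>" + content + "<|tool▁output▁end|>"
            # Consecutive tool results share one outputs block, newline-separated.
            out += "\n" if next_is_tool else "<|tool▁outputs▁end|>"
    if add_generation_prompt and not is_last_user:
        out += "<|Assistant|>"
    if add_generation_prompt and enable_thinking:
        out += "<think>\n"
    return out
```

For a system message `Be brief.` followed by a user message `Hi`, with `add_generation_prompt=True` and thinking disabled, the sketch returns the default prompt, a blank line, `Be brief.`, and then `<|User|>Hi<|Assistant|>`. With `enable_thinking=True` the system prompt gains the `Enable deep thinking subroutine.` prefix and the output ends with `<think>` and a newline.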