Responses of Ring-1T available on zenmux.ai often end abruptly after 14-16k tokens without generating a complete answer

#13

by sszymczyk - opened Nov 26, 2025

edited Nov 27, 2025

During my experiments with the model on zenmux.ai I can't get it to use the full 32k-token output range. Usually it abruptly ends its reasoning after 15-16k tokens without producing the final answer ("content" is empty), as in the following stream:

...
data: {"id":"02e50ca72fa24fb68609c507bdfd14e9","model":"inclusionai/ring-1t","choices":[{"delta":{"content":"","role":"assistant","reasoning":" includes"},"index":0}],"created":1764176060,"object":"chat.completion.chunk"}
data: {"id":"02e50ca72fa24fb68609c507bdfd14e9","model":"inclusionai/ring-1t","choices":[{"delta":{"content":"","role":"assistant","reasoning":" all"},"index":0}],"created":1764176060,"object":"chat.completion.chunk"}
data: {"id":"02e50ca72fa24fb68609c507bdfd14e9","model":"inclusionai/ring-1t","choices":[{"delta":{"content":"","role":"assistant","reasoning":" these"},"index":0}],"created":1764176060,"object":"chat.completion.chunk"}
data: {"id":"02e50ca72fa24fb68609c507bdfd14e9","model":"inclusionai/ring-1t","choices":[{"delta":{"content":"","role":"assistant","reasoning":" people"},"index":0}],"created":1764176060,"object":"chat.completion.chunk"}
data: {"id":"02e50ca72fa24fb68609c507bdfd14e9","model":"inclusionai/ring-1t","choices":[{"delta":{"content":"","role":"assistant","reasoning":" from"},"index":0}],"created":1764176060,"object":"chat.completion.chunk"}
data: [DONE]

I don't understand why it responds with [DONE] when it clearly hasn't finished its reasoning yet. In the zenmux.ai logs I can't find any request that exceeded 16k output tokens. Sometimes it starts generating an answer but that also ends abruptly, like this:

data: {"id":"31572c3b473a427ebc0495a8f857a637","model":"inclusionai/ring-1t","choices":[{"delta":{"content":" Isabella","role":"assistant"},"index":0}],"created":1764184475,"object":"chat.completion.chunk"}
data: {"id":"31572c3b473a427ebc0495a8f857a637","model":"inclusionai/ring-1t","choices":[{"delta":{"content":"'s","role":"assistant"},"index":0}],"created":1764184475,"object":"chat.completion.chunk"}
data: {"id":"31572c3b473a427ebc0495a8f857a637","model":"inclusionai/ring-1t","choices":[{"delta":{"content":" ancestor","role":"assistant"},"index":0}],"created":1764184475,"object":"chat.completion.chunk"}
data: {"id":"31572c3b473a427ebc0495a8f857a637","model":"inclusionai/ring-1t","choices":[{"delta":{"content":" (","role":"assistant"},"index":0}],"created":1764184475,"object":"chat.completion.chunk"}
data: {"id":"31572c3b473a427ebc0495a8f857a637","model":"inclusionai/ring-1t","choices":[{"delta":{"content":"Angela","role":"assistant"},"index":0}],"created":1764184475,"object":"chat.completion.chunk"}
data: [DONE]
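
A minimal sketch for checking a captured stream like the two above. It assumes the OpenAI-style streaming convention in which a normally completed response carries a `finish_reason` (such as "stop" or "length") in a chunk before `[DONE]`; whether zenmux.ai follows that convention is not confirmed here, and none of the chunks shown above contain the field. The function name `summarize_stream` is hypothetical.

```python
import json


def summarize_stream(lines):
    """Summarize an OpenAI-style SSE chat stream given as an iterable of text lines."""
    content, reasoning = [], []
    finish_reason = None
    saw_done = False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and elision markers such as "..."
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            content.append(delta.get("content") or "")
            reasoning.append(delta.get("reasoning") or "")
            # Assumption: a finished response sets finish_reason on some chunk.
            finish_reason = choice.get("finish_reason") or finish_reason
    return {
        "content": "".join(content),
        "reasoning": "".join(reasoning),
        "finish_reason": finish_reason,
        "saw_done": saw_done,
        # [DONE] without any finish_reason suggests the stream was cut off.
        "suspect_truncation": saw_done and finish_reason is None,
    }
```

Feeding it the lines of a saved stream returns the concatenated reasoning and content, plus a flag that is set when `[DONE]` arrived without any `finish_reason`.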

What are the recommended request settings to use the whole 32k-token output range? I tried settings like "reasoning": {"effort":"high"}, but that did not help. Setting max_tokens in reasoning also doesn't seem to help. Note that I have max_completion_tokens set to 32000.
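
For reference, the settings described above can be written out as a request body. This is only a sketch of what was tried, not a recommended configuration: the helper name `build_request` and the prompt are hypothetical, the model id is taken from the stream chunks, and `"stream": True` is an assumption based on the streamed output shown earlier.

```python
import json


def build_request(prompt, max_completion_tokens=32000, effort="high"):
    """Build a chat-completions request body with the settings described in the post."""
    return {
        "model": "inclusionai/ring-1t",  # id as it appears in the stream chunks
        "stream": True,  # assumption: the responses above were streamed
        "max_completion_tokens": max_completion_tokens,
        "reasoning": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }


# Hypothetical prompt, used only to show the serialized body.
print(json.dumps(build_request("Example question"), indent=2))
```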

Update: there seems to be a timeout, either in zenmux.ai or at the model provider, that terminates generation abruptly after exactly 10 minutes (600 seconds). Since the model generates about 25 tokens/s, it can produce only about 15k tokens in that time, not the full 32k tokens.
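
The arithmetic behind that estimate, as a small sketch. The 25 tokens/s rate and the 600-second limit are the approximate figures from the update; the function names are hypothetical.

```python
def max_tokens_before_timeout(tokens_per_second, timeout_seconds):
    """Upper bound on tokens generated before a hard timeout cuts the stream."""
    return tokens_per_second * timeout_seconds


def seconds_needed(target_tokens, tokens_per_second):
    """Time required to generate target_tokens at a steady rate."""
    return target_tokens / tokens_per_second


RATE = 25        # tokens per second (approximate, from the post)
TIMEOUT = 600    # seconds (10 minutes)
TARGET = 32000   # max_completion_tokens set in the request

print(max_tokens_before_timeout(RATE, TIMEOUT))  # 15000
print(seconds_needed(TARGET, RATE))              # 1280.0
```

At this rate a full 32000-token response would take 1280 seconds (about 21 minutes), more than twice the observed 600-second limit.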
