Instructions for using infraxa/Qwen3.5-Trading-Agent with libraries and local apps.

Libraries

How to use infraxa/Qwen3.5-Trading-Agent with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="infraxa/Qwen3.5-Trading-Agent")

# Load model directly
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("infraxa/Qwen3.5-Trading-Agent")
model = AutoModelForCausalLM.from_pretrained("infraxa/Qwen3.5-Trading-Agent")

Local Apps

vLLM

How to use infraxa/Qwen3.5-Trading-Agent with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "infraxa/Qwen3.5-Trading-Agent"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "infraxa/Qwen3.5-Trading-Agent",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
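
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `localhost:8000`, and the helper names `build_completion_request` and `complete` are hypothetical, introduced here for illustration only.

```python
import json
import urllib.request

# Assumed defaults, copied from the curl example; adjust for your deployment.
BASE_URL = "http://localhost:8000"
MODEL = "infraxa/Qwen3.5-Trading-Agent"


def build_completion_request(prompt, max_tokens=512, temperature=0.5, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style completion responses carry generated text in choices[i].text.
    return body["choices"][0]["text"]
```

Calling `complete("Once upon a time,")` requires a running server. For a server on another port, pass `base_url`, for example `complete(prompt, base_url="http://localhost:30000")`.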

Use Docker

docker model run hf.co/infraxa/Qwen3.5-Trading-Agent

SGLang

How to use infraxa/Qwen3.5-Trading-Agent with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "infraxa/Qwen3.5-Trading-Agent" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "infraxa/Qwen3.5-Trading-Agent",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "infraxa/Qwen3.5-Trading-Agent" \
        --host 0.0.0.0 \
        --port 30000


Qwen3.5-Trading-Agent

By Infraxa — The Execution Layer for Autonomous Finance

A GRPO-finetuned Qwen3.5-35B-A3B Mixture-of-Experts model, trained on Solana on-chain trading data and prediction market signals. Built for autonomous trade execution, swap routing, and market reasoning.

Model Details

| Parameter | Value |
|---|---|
| Base Model | Qwen3.5-35B-A3B (MoE) |
| Architecture | `Qwen3_5MoeForCausalLM` |
| Total Parameters | ~35B |
| Active Parameters | ~3B per token |
| Experts | 256 total, 8 active per token |
| Hidden Size | 2048 |
| Layers | 40 (30 linear attention + 10 full attention) |
| Context Length | 262,144 tokens |
| Precision | bfloat16 |
| Training Method | GRPO (Group Relative Policy Optimization) |
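
As a rough guide to what these figures mean for hardware, the sketch below estimates the memory needed for the weights alone. It assumes the approximate parameter counts listed above and 2 bytes per bfloat16 value; the KV cache, activations, and framework overhead are not included, so real usage will be higher.

```python
# Back-of-the-envelope weight memory, using the approximate figures listed above.
TOTAL_PARAMS = 35e9    # "~35B" total parameters
ACTIVE_PARAMS = 3e9    # "~3B" active parameters per token
BYTES_PER_PARAM = 2    # bfloat16 uses 16 bits = 2 bytes per value


def weight_bytes(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Bytes needed to store num_params values at the given precision."""
    return num_params * bytes_per_param


total_gb = weight_bytes(TOTAL_PARAMS) / 1e9
active_gb = weight_bytes(ACTIVE_PARAMS) / 1e9

print(f"All weights in bf16: ~{total_gb:.0f} GB")            # ~70 GB
print(f"Weights touched per token: ~{active_gb:.0f} GB")     # ~6 GB
```

All experts generally have to be resident in memory even though only 8 are used for a given token, so the larger number is the one that sizes the deployment.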

Training Data

This model was GRPO-trained on:

- Solana on-chain transaction data: real swap and trade executions across DEXs
- Prediction market data: outcomes, odds, and resolution signals
- Trading run logs: full execution traces including routing, slippage, and settlement

The training objective optimizes for accurate trade reasoning: identifying optimal swap routes, predicting market movements, and generating executable trade instructions.
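
For readers unfamiliar with GRPO, its core step scores each sampled completion relative to the other completions drawn for the same prompt. The sketch below shows only that generic normalization step. The reward values are hypothetical, the choice of population standard deviation is an assumption, and the reward function actually used to train this model is not described in this card.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward by the mean and standard deviation of its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from one prompt.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.3f}")
```

Completions that beat the group average get a positive advantage and are reinforced; those below it get a negative one. No separate value model is needed for this step.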

Intended Use

- Autonomous trading agents on Solana
- Swap execution and routing decisions
- Prediction market analysis and position sizing
- On-chain data interpretation and trade signal generation
- Integration with Infraxa's execution layer for gasless, agent-driven finance

Architecture

Qwen3.5-35B-A3B uses a hybrid attention design that combines linear-attention and full-attention layers in a 3:1 ratio (30 linear and 10 full across 40 layers). The MoE architecture routes each token to 8 of 256 experts, which gives the model high capacity while keeping inference cost low: only ~3B of the ~35B parameters are active per token.
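
The arithmetic behind these numbers can be checked with a short sketch. The layer and expert counts come from the Model Details section; the exact ordering of layers (three linear-attention layers followed by one full-attention layer, repeated) is an assumption made here for illustration and is not stated in this card.

```python
NUM_LAYERS = 40
LINEAR_PER_FULL = 3   # the 3:1 ratio of linear to full attention layers
NUM_EXPERTS = 256
ACTIVE_EXPERTS = 8


def layer_types(num_layers=NUM_LAYERS, linear_per_full=LINEAR_PER_FULL):
    """Assumed layout: every (linear_per_full + 1)-th layer uses full attention."""
    period = linear_per_full + 1
    return ["full" if (i + 1) % period == 0 else "linear" for i in range(num_layers)]


layout = layer_types()
print(layout.count("linear"), layout.count("full"))  # 30 10

# Share of routed experts that process any single token.
expert_fraction = ACTIVE_EXPERTS / NUM_EXPERTS
print(f"1 in {NUM_EXPERTS // ACTIVE_EXPERTS} experts is active per token")  # 1 in 32
```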

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "infraxa/Qwen3.5-Trading-Agent"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="bfloat16",
    device_map="auto",
)

prompt = "Analyze the current SOL/USDC liquidity across Orca, Raydium, and Jupiter. Recommend the optimal swap route for 10,000 USDC."
messages = [{"role": "user", "content": prompt}]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
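
Note that for decoder-only models `generate` returns the prompt tokens followed by the new tokens, so the `print` above shows the prompt as well as the answer. A minimal sketch of separating the two is below; `split_generated` is a hypothetical helper, and plain lists stand in for token ID tensors so the example runs without loading the model.

```python
def split_generated(output_ids, prompt_length):
    """Split one generated sequence into (prompt_ids, new_ids)."""
    return output_ids[:prompt_length], output_ids[prompt_length:]


# Toy example: three prompt tokens followed by two generated tokens.
prompt_ids, new_ids = split_generated([11, 12, 13, 21, 22], prompt_length=3)
print(new_ids)  # [21, 22]

# With the snippet above, the same slicing would look like this (not run here):
#   prompt_length = inputs["input_ids"].shape[-1]
#   _, new_ids = split_generated(outputs[0], prompt_length)
#   print(tokenizer.decode(new_ids, skip_special_tokens=True))
```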

Disclaimer

This model is provided for research and educational purposes only. It is not financial advice. Do not use this model as the sole basis for any trading or investment decisions. Cryptocurrency trading involves substantial risk of loss. The authors and Infraxa assume no liability for any financial losses incurred through the use of this model. Always do your own research and consult a qualified financial advisor before making any investment decisions.

License

Apache 2.0 — same as the base Qwen3.5 model.
