Qwen3.5-Trading-Agent
By Infraxa - The Execution Layer for Autonomous Finance
A GRPO-finetuned Qwen3.5-35B-A3B Mixture-of-Experts model, trained on Solana on-chain trading data and prediction market signals. Built for autonomous trade execution, swap routing, and market reasoning.
Model Details
| Parameter | Value |
|---|---|
| Base Model | Qwen3.5-35B-A3B (MoE) |
| Architecture | Qwen3_5MoeForCausalLM |
| Total Parameters | ~35B |
| Active Parameters | ~3B per token |
| Experts | 256 total, 8 active per token |
| Hidden Size | 2048 |
| Layers | 40 (30 linear attention + 10 full attention) |
| Context Length | 262,144 tokens |
| Precision | bfloat16 |
| Training Method | GRPO (Group Relative Policy Optimization) |
Training Data
This model was GRPO-trained on:
- Solana on-chain transaction data: real swap and trade executions across DEXs
- Prediction market data: outcomes, odds, and resolution signals
- Trading run logs: full execution traces including routing, slippage, and settlement
The training objective optimizes for accurate trade reasoning: identifying optimal swap routes, predicting market movements, and generating executable trade instructions.
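To make the training method concrete, here is a minimal sketch of the group-relative advantage at the core of GRPO: several completions are sampled for the same prompt, each is scored, and each reward is normalized against its own group's mean and standard deviation. The reward values and the `group_relative_advantages` helper are illustrative assumptions. This is not Infraxa's training code or reward function.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Hypothetical helper: completions sampled for the same prompt form a
    group, and each completion's advantage is its reward minus the group
    mean, divided by the group standard deviation.
    """
    mu = mean(rewards)
    # Population std is used here; real implementations may use sample std.
    sigma = pstdev(rewards)
    # eps guards against division by zero when all rewards are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four sampled answers to one trading prompt.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print([round(a, 3) for a in advantages])
```

Completions that score above their group average get a positive advantage and are reinforced. Those below average are discouraged, and no separate value network is needed to estimate a baseline.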
Intended Use
- Autonomous trading agents on Solana
- Swap execution and routing decisions
- Prediction market analysis and position sizing
- On-chain data interpretation and trade signal generation
- Integration with Infraxa's execution layer for gasless, agent-driven finance
Architecture
Qwen3.5-35B-A3B uses a hybrid attention design that combines linear and full attention layers in a 3:1 ratio (30 linear, 10 full). The MoE architecture routes each token to 8 of its 256 experts, which gives the model high capacity while keeping inference costs low: only ~3B of the ~35B parameters are active per token.
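The figures above can be checked with a little arithmetic. The sketch below takes the layer count, attention ratio, and expert counts from the Model Details table and derives the per-type layer counts and the share of experts used per token. The exact ordering of layer types is an assumption (every fourth layer is treated as full attention), since the card states only the ratio.

```python
NUM_LAYERS = 40        # from the Model Details table
LINEAR_PER_FULL = 3    # 3:1 linear-to-full attention ratio
NUM_EXPERTS = 256
ACTIVE_EXPERTS = 8

# ASSUMPTION: every fourth layer uses full attention. The card gives only
# the ratio, not the actual ordering of layer types.
period = LINEAR_PER_FULL + 1
layer_types = [
    "full" if (i + 1) % period == 0 else "linear"
    for i in range(NUM_LAYERS)
]

num_linear = layer_types.count("linear")
num_full = layer_types.count("full")

# Share of routed experts that process any single token.
expert_fraction = ACTIVE_EXPERTS / NUM_EXPERTS

print(num_linear, num_full)  # 30 10
print(f"{expert_fraction:.3%} of experts are active per token")
```

The expert share (8 of 256) is smaller than the parameter share (~3B of ~35B), which is expected: components such as attention and embeddings run for every token regardless of routing.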
How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "infraxa/Qwen3.5-Trading-Agent"

# Load the tokenizer and the bfloat16 weights, placing layers on available devices.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="bfloat16",
    device_map="auto",
)

prompt = "Analyze the current SOL/USDC liquidity across Orca, Raydium, and Jupiter. Recommend the optimal swap route for 10,000 USDC."
messages = [{"role": "user", "content": prompt}]

# Render the chat template to text, then tokenize it for generation.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
# The decoded string contains the prompt followed by the model's reply.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
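If the model is served behind an OpenAI-compatible endpoint instead of being loaded in-process, the same chat message can be sent over HTTP. The sketch below uses only the standard library. It assumes a server such as vLLM is already running at `http://localhost:8000/v1` and supports this architecture. `build_chat_request` and `chat` are hypothetical helper names, not part of any library.

```python
import json
import urllib.request

# ASSUMPTION: an OpenAI-compatible server (for example one started with
# `vllm serve`) is listening on this local address.
BASE_URL = "http://localhost:8000/v1"
MODEL = "infraxa/Qwen3.5-Trading-Agent"


def build_chat_request(prompt, max_tokens=512, temperature=0.5):
    """Build an OpenAI-style chat completion payload (hypothetical helper)."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def chat(prompt):
    """POST the payload to the server and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read())
    return reply["choices"][0]["message"]["content"]


# Inspect the payload without contacting any server.
payload = build_chat_request("Recommend a swap route for 10,000 USDC into SOL.")
print(json.dumps(payload, indent=2))
```

Once a server is running, `chat("...")` returns only the reply text, whereas the in-process example prints the prompt and the reply together.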
Disclaimer
This model is provided for research and educational purposes only. It is not financial advice. Do not use this model as the sole basis for any trading or investment decisions. Cryptocurrency trading involves substantial risk of loss. The authors and Infraxa assume no liability for any financial losses incurred through the use of this model. Always do your own research and consult a qualified financial advisor before making any investment decisions.
License
Apache 2.0, the same license as the base Qwen3.5 model.