Active filters: quark
Quark-NPU-Workshop/Hermes-3-Llama-3.2-3B-awq-g128-int4-asym-bf16-onnx-hybrid
0.8B • Updated • 2
Quark-NPU-Workshop/po-phi3-mini-4k-ins
0.6B • Updated • 1
playable/playable1-int4-bfloat16
1B • Updated • 5 • 1
EmbeddedLLM/Qwen3-VL-235B-A22B-Instruct-FP8-PTPC-Quark
236B • Updated • 1
amd/granite-4.0-h-small-fp8
haoyang-amd/Qwen1.5-MoE-A2.7B-ptpc
14B • Updated • 3
amd/Qwen3-30B-A3B-Thinking-2507-ptpc
31B • Updated • 9 • 1
amd/gpt-oss-20b-MoE-Quant-W-MXFP4-A-FP8-KV-FP8
11B • Updated • 8.06k
amd/gpt-oss-20b-WFP8-AFP8-KVFP8
21B • Updated • 7.34k
amd/Qwen3-VL-235B-A22B-Instruct-ptpc
236B • Updated • 6
amd/DeepSeek-R1-0528-ptpc
671B • Updated • 6
amd/DeepSeek-R1-0528-MTP-PTPC-FP8
684B • Updated • 7
amd/DeepSeek-V3.2-mtp-ptpc
686B • Updated • 70 • 1
amd/Kimi-K2-Thinking-W4A8
Text Generation • Updated • 153
pzhang56/Qwen2.5-1.5B-AMD-FP8
2B • Updated • 2
amd/DeepSeek-V3.2-Speciale-mtp-ptpc
686B • Updated • 11
amd/Qwen3-235B-A22B-Thinking-2507-ptpc
235B • Updated • 6
amd/Kimi-K2-Thinking-MXFP4
550B • Updated • 1.9k
amd/Qwen3-235B-A22B-Instruct-2507-MXFP4
Text Generation • 118B • Updated • 1.28k • 2
amd-quark/internal-testing-qwen3_0.6b-mxfp4-hadamard
0.5B • Updated • 12
amd-quark/internal-testing-qwen3_0.6b-fp8-hadamard
0.8B • Updated • 11
amd-quark/internal-testing-qwen3_0.6b-int8-hadamard
0.8B • Updated • 4
amd-quark/internal-testing-qwen3_0.6b-mxfp4-tuned-orthogonal
0.5B • Updated • 11
amd-quark/internal-testing-qwen3_0.6b-fp8-tuned-orthogonal
0.8B • Updated • 11
EmbeddedLLM/Qwen3-30B-A3B-Instruct-2507-MXFP4
377B • Updated • 2.91k
amd/Qwen3-Coder-480B-A35B-Instruct-MXFP4
Text Generation • 246B • Updated • 27
amd/Kimi-K2-Instruct-0905-MXFP4
550B • Updated • 176 • 1
Text Generation • 116B • Updated • 3.52k • 1