Pi-Lumine 4B: Flow-Matching Action Decoder for Elden Ring

A Pi0.5-style flow-matching action decoder trained on top of a frozen Qwen3.5-4B VLM backbone.

Architecture

  • Base VLM: Qwen/Qwen3.5-4B (frozen; not included, downloaded at runtime)
  • Action Decoder: FiLM-conditioned transformer with cross-attention to VLM hidden states
    • 2 decoder layers, VLM dim 2560 → decoder dim 1024, 8 attention heads
    • Projection layers decouple decoder from VLM hidden size
    • Instruction-conditioned via AdaptiveRMSNorm (FiLM)
    • Sinusoidal time embedding for flow matching
    • ~64M trainable parameters
  • Action Space: 6 steps x 20 dims (4 sticks + 16 buttons per step)
  • Training: Flow-matching objective; at inference, actions are sampled by Euler integration of the learned ODE
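
The sampling procedure implied by the list above can be sketched in NumPy. This is a minimal illustration, not the released inference code: the function names, the embedding width of 64, the number of Euler steps, the convention that t=0 is noise and t=1 is the action chunk, and the ordering of stick and button dimensions are all assumptions. Only the 6 x 20 action shape and the 4 + 16 split come from this card. The `velocity_fn` argument stands in for the trained decoder, which would also receive the VLM hidden states and the instruction conditioning.

```python
import numpy as np

HORIZON, ACTION_DIM = 6, 20  # from this card: 6 steps x 20 dims
STICK_DIMS = 4               # assumed layout: sticks first, then 16 buttons


def sinusoidal_time_embedding(t, dim=64, max_period=10000.0):
    """Embed a scalar flow time t in [0, 1] as sin/cos features (dim is an assumption)."""
    half = dim // 2
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])


def euler_sample(velocity_fn, num_steps=10, rng=None):
    """Integrate dx/dt = v(x, t) from t=0 (Gaussian noise) to t=1 with fixed-step Euler."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal((HORIZON, ACTION_DIM))
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)  # one explicit Euler step
    return x


def split_actions(actions, button_threshold=0.5):
    """Split a (6, 20) chunk into clipped stick axes and thresholded buttons (assumed layout)."""
    sticks = np.clip(actions[:, :STICK_DIMS], -1.0, 1.0)
    buttons = actions[:, STICK_DIMS:] > button_threshold
    return sticks, buttons


# Toy stand-in for the decoder: the velocity that moves x toward a fixed target
# along a straight-line path. A real model would predict this from VLM features.
target = np.zeros((HORIZON, ACTION_DIM))
target[:, :STICK_DIMS] = 0.5  # hold all stick axes at half deflection
target[:, STICK_DIMS] = 1.0   # press the first button on every step


def toy_velocity(x, t):
    return (target - x) / (1.0 - t)


actions = euler_sample(toy_velocity, num_steps=10, rng=np.random.default_rng(0))
sticks, buttons = split_actions(actions)
```

With the toy velocity field the straight-line path is integrated exactly, so the sampled chunk matches `target`. With a learned field, the number of Euler steps trades latency against integration error.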

Files

  • action_decoder.pt: Trained action decoder weights
  • decoder_config.json: Architecture and tokenizer config
  • tokenizer.json / tokenizer_config.json: Tokenizer with special tokens
  • chat_template.jinja: Chat template
  • processor_config.json: Processor config
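
As a quick sanity check after downloading, the file list above can be verified against a local directory. The sketch below uses only the Python standard library. The helper names are hypothetical, and because the keys inside decoder_config.json are not documented in this card, the loader returns the parsed JSON without assuming any of them.

```python
import json
from pathlib import Path

# File names as listed in this card.
EXPECTED_FILES = [
    "action_decoder.pt",
    "decoder_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "chat_template.jinja",
    "processor_config.json",
]


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def load_decoder_config(model_dir):
    """Parse decoder_config.json as a plain dict; no particular keys are assumed."""
    with open(Path(model_dir) / "decoder_config.json", encoding="utf-8") as f:
        return json.load(f)
```

Calling `missing_files` on the download directory before loading weights gives a clearer error than a failed load would.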