LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers Paper • 2507.04404 • Published Jul 6, 2025 • 22
T-LoRA: Single Image Diffusion Model Customization Without Overfitting Paper • 2507.05964 • Published Jul 8, 2025 • 120
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference Paper • 2505.02922 • Published May 5, 2025 • 28
FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios Paper • 2505.03730 • Published May 6, 2025 • 28
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning Paper • 2504.14509 • Published Apr 20, 2025 • 53
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation Paper • 2504.14899 • Published Apr 21, 2025 • 20
WORLDMEM: Long-term Consistent World Simulation with Memory Paper • 2504.12369 • Published Apr 16, 2025 • 35
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Paper • 2504.08736 • Published Apr 11, 2025 • 46
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11, 2025 • 130
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published Apr 10, 2025 • 50
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation Paper • 2504.02542 • Published Apr 3, 2025 • 51
Any2Caption:Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published Mar 31, 2025 • 76
Synthetic Video Enhances Physical Fidelity in Video Synthesis Paper • 2503.20822 • Published Mar 26, 2025 • 16
Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models Paper • 2503.18446 • Published Mar 24, 2025 • 12
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published Mar 24, 2025 • 119
FFN Fusion: Rethinking Sequential Computation in Large Language Models Paper • 2503.18908 • Published Mar 24, 2025 • 19
One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation Paper • 2503.13358 • Published Mar 17, 2025 • 95
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14, 2025 • 146
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation Paper • 2503.10618 • Published Mar 13, 2025 • 19