Heterogeneous Scientific Foundation Model Collaboration Paper • 2604.27351 • Published 5 days ago • 197
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation Paper • 2604.10098 • Published 24 days ago • 78
The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping Paper • 2604.11297 • Published 22 days ago • 141
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models Paper • 2503.16257 • Published Mar 20, 2025 • 28
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache Paper • 2402.02750 • Published Feb 5, 2024 • 5
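Asymmetric (zero-point) uniform quantization, the general technique named in the KIVI title, can be sketched in a few lines. This is an illustrative pure-Python example of the basic scheme, not the paper's implementation (KIVI additionally quantizes key caches per-channel and value caches per-token, in groups):

```python
def asym_quantize(values, bits=2):
    """Asymmetric uniform quantization: map [lo, hi] onto integers 0 .. 2**bits - 1.

    Returns the integer codes plus the (scale, zero_point) needed to dequantize.
    """
    qmax = 2 ** bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    quantized = [min(qmax, max(0, round((v - lo) / scale))) for v in values]
    return quantized, scale, lo

def asym_dequantize(quantized, scale, zero_point):
    """Invert asym_quantize: code * scale + zero_point."""
    return [q * scale + zero_point for q in quantized]

values = [-0.9, -0.2, 0.0, 0.4, 1.3]
codes, scale, zero = asym_quantize(values, bits=2)
restored = asym_dequantize(codes, scale, zero)
# Each restored value lies within half a quantization step of the original.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(values, restored))
```

The asymmetric zero point matters for KV caches because key/value activations are often skewed rather than centered at zero, so a symmetric grid would waste codes on one side of the range.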
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published Jun 10, 2024 • 28
Nemotron Speech Collection Open, state-of-the-art, production-ready enterprise speech models from the NVIDIA Speech research team for ASR, TTS, Speaker Diarization, and S2S • 12 items • Updated 14 days ago • 49
Tokenization in Transformers v5: Simpler, Clearer, and More Modular Article • Published Dec 18, 2025 • 124
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. These models preserve quality comparable to half precision while using 3x less memory. • 29 items • Updated Aug 14, 2025 • 32
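The "3x less memory" figure follows from per-parameter storage arithmetic. A rough back-of-envelope sketch, where the model size and the int4 metadata overhead are illustrative assumptions rather than official numbers:

```python
def weight_memory_gb(num_params_billion, bytes_per_param):
    """Approximate weight-storage memory in GB."""
    return num_params_billion * bytes_per_param

params_b = 27  # e.g. the 27B Gemma 3 variant

bf16_gb = weight_memory_gb(params_b, 2.0)    # half precision: 2 bytes per parameter
int4_gb = weight_memory_gb(params_b, 0.625)  # 4-bit weights + assumed ~25% for scales/zero-points

# 54 GB vs ~17 GB: roughly a 3x reduction, matching the collection's claim.
assert 2.5 < bf16_gb / int4_gb < 3.5
```

The exact ratio depends on group size and which layers stay in higher precision, which is why "3x" is an approximation rather than a fixed constant.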