Quentin Tardif

ntnq

AI & ML interests

None yet

Recent Activity

liked a model about 4 hours ago

zai-org/GLM-4.7-Flash

Text Generation • 31B • Updated about 9 hours ago • 15.2k • 648

upvoted a paper 12 days ago

Scaling Laws for Code: Every Programming Language Matters

Paper • 2512.13472 • Published Dec 15, 2025 • 12

liked a Space 13 days ago

FinePDFs: Liberating 3T of the finest tokens from PDFs

📄

liked a Space about 1 month ago

The Jagged AI Frontier is a Data Frontier

🧭

Why AI capabilities are shaped by data availability

upvoted 2 articles about 1 month ago

Article

Saving Memory Using Padding-Free Transformer Layers during Finetuning

Jun 11, 2024

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

Dec 15, 2025 • 106

liked 2 models about 1 month ago

mistralai/Devstral-Small-2-24B-Instruct-2512

24B • Updated 29 days ago • 275k • 486

EssentialAI/rnj-1-instruct

Text Generation • 8B • Updated 27 days ago • 4.78k • 299

upvoted an article about 2 months ago

Article

Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand

Dec 4, 2025

liked a Space about 2 months ago

Evaluation Guidebook

📝

253

Display benchmark evaluation data for LLMs

upvoted an article about 2 months ago

Article

Continuous batching from first principles

Nov 25, 2025 • 306

upvoted a collection 2 months ago

Olmo 3

Collection

Artifacts for the Olmo 3 release. • 9 items • Updated 28 days ago • 159

upvoted 2 papers 2 months ago

Fantastic Pretraining Optimizers and Where to Find Them

Paper • 2509.02046 • Published Sep 2, 2025 • 13

DoPE: Denoising Rotary Position Embedding

Paper • 2511.09146 • Published Nov 12, 2025 • 95

upvoted an article 3 months ago

Article

What makes good reasoning data

Oct 30, 2025

liked 2 Spaces 3 months ago

The Smol Training Playbook

📚

2.89k

The secrets to building world-class LLMs

Unlocking On-Policy Distillation for Any Model Family

📝

Apply on-policy distillation to any model family

upvoted an article 3 months ago

Article

On the Shifting Global Compute Landscape

Oct 29, 2025

upvoted a paper 3 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 505

liked a model 4 months ago

ServiceNow-AI/Apriel-1.5-15b-Thinker

Image-Text-to-Text • 15B • Updated Oct 6, 2025 • 401 • 463
