nltpt

Team

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

Xiaoqun2023 authored a paper 18 days ago

ProTrain: Efficient LLM Training via Memory-Aware Techniques

clefourrier authored a paper about 1 month ago

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

mryab submitted a paper about 2 months ago

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

View all activity

Xiaoqun2023

authored a paper 18 days ago

ProTrain: Efficient LLM Training via Memory-Aware Techniques

Paper • 2406.08334 • Published Jun 12, 2024

clefourrier

authored a paper about 1 month ago

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Paper • 2603.12180 • Published Mar 12 • 65

alvarobartt

posted an update about 2 months ago

Post

3605

Learn how to deploy Microsoft Research VibeVoice ASR on Microsoft Azure Foundry with Hugging Face to generate rich audio transcriptions with Who, When, and What! 💥

> 🕒 60-minute single-pass processing, no chunking or stitching
> 👤 Customized hotwords to guide recognition on domain-specific content
> 📝 Rich transcription: joint ASR + diarization + timestamping in one pass
> 🌍 50+ languages with automatic detection and code-switching support
> 🤗 Deployed on Microsoft Foundry via an OpenAI-compatible Chat Completions API

https://huggingface.co/docs/microsoft-azure/foundry/examples/deploy-vibevoice-asr

mryab

submitted a paper to Daily Papers about 2 months ago

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

Paper • 2602.21196 • Published Feb 24 • 7

alvarobartt

posted an update 3 months ago

Post

3221

💥 hf-mem v0.4.1 now also estimates KV cache memory requirements for any context length and batch size with the --experimental flag!

uvx hf-mem --model-id ... --experimental will automatically pull the required information from the Hugging Face Hub to include the KV cache estimation, when applicable.

💡 Alternatively, you can also set the --max-model-len, --batch-size and --kv-cache-dtype arguments (à la vLLM) manually if preferred.

1 reply

pcuenq

posted an update 3 months ago

Post

4458

👉 What happened in AI in 2025? 👈

We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!

Play with it here:
2025-ai-timeline/2025-ai-timeline

Here's my personal quarterly TL;DR:

1️⃣ Q1 — Learning to Reason
Deepseek not only releases a top-notch reasoning model, but shows how to train them and compete with closed frontier models. OpenAI debuts Deep Research.

Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)

2️⃣ Q2 — Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.

Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4

3️⃣ Q3 — "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models get gold in Math olympiads and hard benchmarks. OpenAI releases strong open source models and Google releases the much anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.

Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5

4️⃣ Q4 — Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!

Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1 🤯

Credits
🙏 NHLOCAL for the source data https://github.com/NHLOCAL/AiTimeline

🫡 @reach-vb for the original idea, design and recipe

🙌 @ariG23498 and yours truly for compiling and verifying the 2025 edition

🥳 Here's to 2026, wishing it becomes the best year ever for open releases and on-device-first use-cases! 🥂

3 replies

eliebak

submitted a paper to Daily Papers 4 months ago

SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations

Paper • 2512.14080 • Published Dec 16, 2025 • 9

alozowski

authored a paper 4 months ago

YourBench: Easy Custom Evaluation Sets for Everyone

Paper • 2504.01833 • Published Apr 2, 2025 • 23

ariG23498

authored a paper 6 months ago

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published Oct 20, 2025 • 80

merve

posted an update 6 months ago

Post

11491

deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient per vision tokens/performance ratio
> covers 100 languages

4 replies

jfchi

authored 6 papers 6 months ago

Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks

Paper • 2410.18210 • Published Oct 23, 2024

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published Oct 1, 2025 • 60

Shape it Up! Restoring LLM Safety during Finetuning

Paper • 2505.17196 • Published May 22, 2025 • 1

maheshp9

authored 3 papers 6 months ago

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 118

Llama Guard 3 Vision: Safeguarding Human-AI Image Understanding Conversations

Paper • 2411.10414 • Published Nov 15, 2024 • 1

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published Oct 1, 2025 • 60

merve

posted an update 7 months ago

Post

6986

large AI labs open-sourced a ton of models last week 🔥
here's few picks, find even more here merve/sep-16-releases-68d13ea4c547f02f95842f05 🤝
> IBM released a new Docling model with 258M params based on Granite (A2.0) 📝 ibm-granite/granite-docling-258M
> Xiaomi released 7B audio LM with base and instruct variants (MIT) XiaomiMiMo/mimo-audio-68cc7202692c27dae881cce0
> DecartAI released Lucy Edit, open Nano Banana 🍌 (NC) decart-ai/Lucy-Edit-Dev
> OpenGVLab released a family of agentic computer use models (3B/7B/32B) with the dataset 💻 OpenGVLab/scalecua-68c912cf56f7ff4c8e034003
> Meituan Longcat released thinking version of LongCat-Flash 💭 meituan-longcat/LongCat-Flash-Thinking

2 replies

AI & ML interests

Recent Activity

Team members 184

nltpt's activity