- The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines
  Paper • 2408.01050 • Published • 9
- Seesaw: High-throughput LLM Inference via Model Re-sharding
  Paper • 2503.06433 • Published
- PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters
  Paper • 2504.08791 • Published • 139
- Evaluation Guidebook
  📝 269 • Explore LLM benchmark trends over time
Collections including paper arxiv:2504.08791

- Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment
  Paper • 2601.14249 • Published • 11
- Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
  Paper • 2402.07033 • Published • 19
- MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences
  Paper • 2601.07251 • Published • 11
- GameTalk: Training LLMs for Strategic Conversation
  Paper • 2601.16276 • Published • 12

- PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters
  Paper • 2504.08791 • Published • 139
- TTRL: Test-Time Reinforcement Learning
  Paper • 2504.16084 • Published • 120
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
  Paper • 2504.17192 • Published • 123
- Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
  Paper • 2506.16406 • Published • 130

- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
  Paper • 2504.08641 • Published • 6
- PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters
  Paper • 2504.08791 • Published • 139
- Describe Anything: Detailed Localized Image and Video Captioning
  Paper • 2504.16072 • Published • 63
- A Survey of Interactive Generative Video
  Paper • 2504.21853 • Published • 46

- The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines
  Paper • 2408.01050 • Published • 9
- Agent-as-a-Judge: Evaluate Agents with Agents
  Paper • 2410.10934 • Published • 23
- Agent-SafetyBench: Evaluating the Safety of LLM Agents
  Paper • 2412.14470 • Published • 13
- IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems
  Paper • 2501.11067 • Published • 13

- Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
  Paper • 2508.09789 • Published • 5
- MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
  Paper • 2508.13186 • Published • 19
- ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents
  Paper • 2508.04038 • Published • 1
- Prompt Orchestration Markup Language
  Paper • 2508.13948 • Published • 48