REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models (arXiv:2501.03262)
MiniMax-01: Scaling Foundation Models with Lightning Attention (arXiv:2501.08313)
Towards Best Practices for Open Datasets for LLM Training (arXiv:2501.08365)
Qwen2.5-1M Technical Report (arXiv:2501.15383)
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (arXiv:2502.02737)
Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM (arXiv:2502.06635)
Predictive Data Selection: The Data That Predicts Is the Data That Teaches (arXiv:2503.00808)
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback (arXiv:2503.22230)
WorldPM: Scaling Human Preference Modeling (arXiv:2505.10527)
Qwen3 Technical Report (arXiv:2505.09388)
Model Merging in Pre-training of Large Language Models (arXiv:2505.12082)
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning (arXiv:2505.17667)