Collections
Collections including paper arxiv:2502.02737
- Neural Machine Translation by Jointly Learning to Align and Translate
  Paper • 1409.0473 • Published • 7
- Attention Is All You Need
  Paper • 1706.03762 • Published • 110
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 25
- Hierarchical Reasoning Model
  Paper • 2506.21734 • Published • 47

- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 77
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 205
- The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
  Paper • 2303.03915 • Published • 7
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 254

- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 263
- A Survey on Latent Reasoning
  Paper • 2507.06203 • Published • 93
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 19
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
  Paper • 1910.10683 • Published • 16

- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 77
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 205
- YourBench: Easy Custom Evaluation Sets for Everyone
  Paper • 2504.01833 • Published • 22
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 254

- GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
  Paper • 2503.14734 • Published • 6
- Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
  Paper • 2401.02117 • Published • 33
- SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
  Paper • 2506.01844 • Published • 149
- Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
  Paper • 2506.16035 • Published • 89