Papers to Read
Speculative Streaming: Fast LLM Inference without Auxiliary Models
Paper
• 2402.11131
• Published
• 42
Generative Representational Instruction Tuning
Paper
• 2402.09906
• Published
• 54
Chain-of-Thought Reasoning Without Prompting
Paper
• 2402.10200
• Published
• 109
BitDelta: Your Fine-Tune May Only Be Worth One Bit
Paper
• 2402.10193
• Published
• 21
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper
• 2402.13064
• Published
• 50
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models
Paper
• 2402.10986
• Published
• 81
2D Matryoshka Sentence Embeddings
Paper
• 2402.14776
• Published
• 7
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
Paper
• 2403.05313
• Published
• 9
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper
• 2312.00752
• Published
• 150
Long-context LLMs Struggle with Long In-context Learning
Paper
• 2404.02060
• Published
• 37
ReALM: Reference Resolution As Language Modeling
Paper
• 2403.20329
• Published
• 22
ProAgent: Building Proactive Cooperative AI with Large Language Models
Paper
• 2308.11339
• Published
ProAgent: From Robotic Process Automation to Agentic Process Automation
Paper
• 2311.10751
• Published
• 10
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Paper
• 2404.05719
• Published
• 83
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Paper
• 2409.01704
• Published
• 83