Renat (u-brixton)
AI & ML interests: None yet
Organizations
emlnp 2023 tbd
- Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model
  Paper • 2212.09146 • Published • 3
- RaLLe: A Framework for Developing and Evaluating Retrieval-Augmented Large Language Models
  Paper • 2308.10633 • Published • 1
- MemeCap: A Dataset for Captioning and Interpreting Memes
  Paper • 2305.13703 • Published
- Contrastive Learning for Inference in Dialogue
  Paper • 2310.12467 • Published
foundation_models
- Apple Intelligence Foundation Language Models
  Paper • 2407.21075 • Published • 5
- The Llama 3 Herd of Models
  Paper • 2407.21783 • Published • 117
- Nemotron-4 340B Technical Report
  Paper • 2406.11704 • Published
- Gemma 2: Improving Open Language Models at a Practical Size
  Paper • 2408.00118 • Published • 78
monte_carlo_24_best
- Planning Like Human: A Dual-process Framework for Dialogue Planning
  Paper • 2406.05374 • Published
- Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents
  Paper • 2311.00262 • Published
- Strength Lies in Differences! Towards Effective Non-collaborative Dialogues via Tailored Strategy Planning
  Paper • 2403.06769 • Published
- Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning
  Paper • 2305.13660 • Published
emnlp 2023
- DSI++: Updating Transformer Memory with New Documents
  Paper • 2212.09744 • Published • 1
- Where to start? Analyzing the potential value of intermediate models
  Paper • 2211.00107 • Published
- INSTRUCTSCORE: Explainable Text Generation Evaluation with Finegrained Feedback
  Paper • 2305.14282 • Published
- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
  Paper • 2303.16634 • Published • 3
math
- Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning
  Paper • 2402.17457 • Published
- Curvature-Informed SGD via General Purpose Lie-Group Preconditioners
  Paper • 2402.04553 • Published
- TextGrad: Automatic "Differentiation" via Text
  Paper • 2406.07496 • Published • 31
- Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling
  Paper • 2405.14578 • Published • 1
alignment_24_best
- KTO: Model Alignment as Prospect Theoretic Optimization
  Paper • 2402.01306 • Published • 21
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 64
- SimPO: Simple Preference Optimization with a Reference-Free Reward
  Paper • 2405.14734 • Published • 12
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
  Paper • 2408.06266 • Published • 10
sft_24_best
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
  Paper • 2402.04833 • Published • 5
- A Closer Look at the Limitations of Instruction Tuning
  Paper • 2402.05119 • Published • 5
- STaR-GATE: Teaching Language Models to Ask Clarifying Questions
  Paper • 2403.19154 • Published
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
  Paper • 2407.10457 • Published • 24
code_rlcef
- OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs
  Paper • 2504.04030 • Published • 1
- KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
  Paper • 2503.02951 • Published • 33
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
  Paper • 2406.15877 • Published • 48
- Magicoder: Source Code Is All You Need
  Paper • 2312.02120 • Published • 82