yueqin yin

yyqoni

AI & ML interests

None yet

Organizations

yyqoni's collections (1)

DenseRewardRLHF-PPO

This repository contains the released models for our paper "Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model".

yyqoni/Phi-3-mini-4k-instruct-segment-rm-700k
Text Classification • 4B • Updated Jan 8 • 16

yyqoni/Phi-3-mini-4k-instruct-token-rm-700k
Text Classification • 4B • Updated Jan 8 • 12

yyqoni/Phi-3-mini-4k-instruct-bandit-rm-700k
Text Classification • 4B • Updated Jan 8 • 13

yyqoni/rlhflow-llama-3-sft-8b-v2-segment-rm-700k
Text Classification • 8B • Updated Jan 8 • 9
