Self-Alignment with Instruction Backtranslation
Paper: arXiv:2308.06259
7374-course-llm
This model is a LoRA fine-tuned version of NousResearch/Llama-2-7b-hf, trained to predict the instruction (x) given the assistant response (y). It implements the backward-model training step from the paper "Self-Alignment with Instruction Backtranslation" (arXiv:2308.06259).
Training data: we take conversations from timdettmers/openassistant-guanaco and extract pairs of the form:
### Output (y)
<assistant's answer>
### Instruction (x)
<human's original question>
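The pair extraction described above could be sketched roughly as follows. This is a minimal illustration, not the training script used for this model. It assumes each dataset record is a single string whose turns are marked with "### Human:" and "### Assistant:", and it keeps only the first human/assistant exchange of each conversation, which is an assumption about the preprocessing. The helper names `first_turn` and `to_backward_example` are hypothetical.

```python
import re

# Assumption: turns in a record are delimited by these two markers.
_TURN = re.compile(r"### (Human|Assistant):")


def first_turn(text):
    """Return (instruction, response) for the first Human -> Assistant exchange, or None."""
    parts = _TURN.split(text)
    # With one capturing group, split() yields [prefix, role1, body1, role2, body2, ...].
    turns = [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]
    for (role_a, body_a), (role_b, body_b) in zip(turns, turns[1:]):
        if role_a == "Human" and role_b == "Assistant" and body_a and body_b:
            return body_a, body_b
    return None


def to_backward_example(text):
    """Format one record as a backward-model example: response (y) first, then instruction (x)."""
    pair = first_turn(text)
    if pair is None:
        return None  # skip records without a complete exchange
    instruction, response = pair
    return f"### Output (y)\n{response}\n### Instruction (x)\n{instruction}"


if __name__ == "__main__":
    record = "### Human: What is LoRA?### Assistant: A low-rank adapter method."
    print(to_backward_example(record))
```

Placing the response before the instruction means a standard causal language-modeling objective trains the model to generate x conditioned on y. Whether the loss is masked on the response portion is not stated here and is left open in this sketch.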
Base model: NousResearch/Llama-2-7b-hf