train_stsb_42_1767887010

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the stsb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4508
  • Num Input Tokens Seen: 3928080
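
For reference, a minimal sketch of loading this adapter with PEFT on top of the base model (the dtype and device settings are assumptions, and the gated meta-llama base checkpoint must be accessible):

```python
# Minimal loading sketch, assuming access to the gated base model.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Loads the base meta-llama/Meta-Llama-3-8B-Instruct weights and applies
# this adapter on top of them.
model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_stsb_42_1767887010",
    torch_dtype=torch.bfloat16,  # assumed dtype; the card does not specify one
    device_map="auto",           # assumed device placement
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```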

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
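
A minimal sketch of how these settings map onto transformers.TrainingArguments (the output directory is a hypothetical name, and the actual training script is not included in this card, so it may have set additional arguments):

```python
from transformers import TrainingArguments

# Hedged reconstruction of the reported hyperparameters only.
training_args = TrainingArguments(
    output_dir="train_stsb_42_1767887010",  # hypothetical output path
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```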

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:-----:|:---------------:|:-----------------:|
| 0.7515        | 0.5002 | 1294  | 0.6846          | 197040            |
| 0.9202        | 1.0004 | 2588  | 0.5452          | 392608            |
| 0.5493        | 1.5006 | 3882  | 0.5131          | 588592            |
| 0.3837        | 2.0008 | 5176  | 0.4846          | 785728            |
| 0.3415        | 2.5010 | 6470  | 0.4859          | 982048            |
| 0.3315        | 3.0012 | 7764  | 0.4833          | 1178784           |
| 0.4845        | 3.5014 | 9058  | 0.4625          | 1374176           |
| 0.6275        | 4.0015 | 10352 | 0.4571          | 1571952           |
| 0.5573        | 4.5017 | 11646 | 0.4721          | 1768848           |
| 0.5374        | 5.0019 | 12940 | 0.4582          | 1964960           |
| 0.348         | 5.5021 | 14234 | 0.4588          | 2161632           |
| 0.8313        | 6.0023 | 15528 | 0.4508          | 2358288           |
| 0.4394        | 6.5025 | 16822 | 0.4532          | 2554352           |
| 0.6566        | 7.0027 | 18116 | 0.4547          | 2750912           |
| 0.3465        | 7.5029 | 19410 | 0.4567          | 2947664           |
| 0.3684        | 8.0031 | 20704 | 0.4522          | 3144128           |
| 0.4798        | 8.5033 | 21998 | 0.4562          | 3339904           |
| 0.2915        | 9.0035 | 23292 | 0.4530          | 3537024           |
| 0.2465        | 9.5037 | 24586 | 0.4512          | 3733152           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4