BoolQ_Llama-3.2-1B-26t8ytsb

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on the BoolQ yes/no question-answering dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6420
  • Model Preparation Time: 0.0059
  • Mdl: 7746.5496
  • Accumulated Loss: 5369.4990
  • Correct Preds: 2337.0
  • Total Preds: 3270.0
  • Accuracy: 0.7147
  • Correct Gen Preds: 2302.0
  • Gen Accuracy: 0.7040
  • Correct Gen Preds 9642: 1525.0
  • Correct Preds 9642: 1556.0
  • Total Labels 9642: 2026.0
  • Accuracy 9642: 0.7680
  • Gen Accuracy 9642: 0.7527
  • Correct Gen Preds 2822: 768.0
  • Correct Preds 2822: 781.0
  • Total Labels 2822: 1231.0
  • Accuracy 2822: 0.6344
  • Gen Accuracy 2822: 0.6239

The metrics suffixed with 9642 and 2822 report the same statistics broken down by the two answer classes (the suffixes are presumably the tokenizer IDs of the two answer tokens): for example, the per-class correct predictions (1556 + 781) sum to the overall 2337 correct predictions. A usage sketch is shown below.
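
The following sketch shows one way to query the model for a BoolQ-style yes/no answer with Transformers. The prompt template and the choice of answer tokens are assumptions for illustration; the exact formatting used during fine-tuning is not documented in this card.

```python
# Minimal usage sketch. The prompt format below is an assumption, not the
# documented fine-tuning template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/BoolQ_Llama-3.2-1B-26t8ytsb"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = (
    "Passage: The Llama models are released by Meta AI.\n"
    "Question: are llama models released by meta ai?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token scores

# Compare the scores of the two candidate answer tokens (" Yes" vs. " No").
yes_id = tokenizer.encode(" Yes", add_special_tokens=False)[0]
no_id = tokenizer.encode(" No", add_special_tokens=False)[0]
print("Yes" if logits[yes_id] > logits[no_id] else "No")
```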

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 120
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
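
For reference, here is a sketch of how these settings map onto transformers.TrainingArguments. The dataset preprocessing and the rest of the Trainer setup are not documented in this card, so everything beyond the listed hyperparameters (output directory, bf16, single-device batch sizes) is an assumption.

```python
# Sketch mapping the hyperparameters above onto TrainingArguments (Transformers 4.51.x).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-26t8ytsb",  # assumed output directory name
    learning_rate=2e-5,
    per_device_train_batch_size=32,            # assuming a single device, matching the reported sizes
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",                       # AdamW, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
    bf16=True,                                 # assumption, consistent with the published BF16 weights
)
```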

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0059 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.5194 | 1.0 | 3 | 0.7779 | 0.0059 | 3669.7865 | 2543.7022 | 1857.0 | 3270.0 | 0.5679 | 1701.0 | 0.5202 | 670.0 | 760.0 | 2026.0 | 0.3751 | 0.3307 | 1022.0 | 1097.0 | 1231.0 | 0.8911 | 0.8302 |
| 0.2429 | 2.0 | 6 | 0.8811 | 0.0059 | 4156.8448 | 2881.3053 | 2088.0 | 3270.0 | 0.6385 | 1960.0 | 0.5994 | 1723.0 | 1837.0 | 2026.0 | 0.9067 | 0.8504 | 228.0 | 251.0 | 1231.0 | 0.2039 | 0.1852 |
| 0.0895 | 3.0 | 9 | 0.7633 | 0.0059 | 3601.1554 | 2496.1307 | 2252.0 | 3270.0 | 0.6887 | 2036.0 | 0.6226 | 1357.0 | 1527.0 | 2026.0 | 0.7537 | 0.6698 | 670.0 | 725.0 | 1231.0 | 0.5890 | 0.5443 |
| 0.3695 | 4.0 | 12 | 2.0120 | 0.0059 | 9491.7849 | 6579.2040 | 2288.0 | 3270.0 | 0.6997 | 2274.0 | 0.6954 | 1905.0 | 1917.0 | 2026.0 | 0.9462 | 0.9403 | 360.0 | 371.0 | 1231.0 | 0.3014 | 0.2924 |
| 0.0001 | 5.0 | 15 | 1.6420 | 0.0059 | 7746.5496 | 5369.4990 | 2337.0 | 3270.0 | 0.7147 | 2302.0 | 0.7040 | 1525.0 | 1556.0 | 2026.0 | 0.7680 | 0.7527 | 768.0 | 781.0 | 1231.0 | 0.6344 | 0.6239 |
| 0.0 | 6.0 | 18 | 1.8149 | 0.0059 | 8562.1293 | 5934.8158 | 2298.0 | 3270.0 | 0.7028 | 2231.0 | 0.6823 | 1282.0 | 1340.0 | 2026.0 | 0.6614 | 0.6328 | 940.0 | 958.0 | 1231.0 | 0.7782 | 0.7636 |
| 0.0006 | 7.0 | 21 | 1.9537 | 0.0059 | 9216.6870 | 6388.5206 | 2267.0 | 3270.0 | 0.6933 | 2202.0 | 0.6734 | 1241.0 | 1299.0 | 2026.0 | 0.6412 | 0.6125 | 952.0 | 968.0 | 1231.0 | 0.7864 | 0.7734 |
| 0.0 | 8.0 | 24 | 2.0340 | 0.0059 | 9595.7529 | 6651.2691 | 2302.0 | 3270.0 | 0.7040 | 2246.0 | 0.6869 | 1366.0 | 1414.0 | 2026.0 | 0.6979 | 0.6742 | 871.0 | 888.0 | 1231.0 | 0.7214 | 0.7076 |
| 0.0 | 9.0 | 27 | 2.1642 | 0.0059 | 10209.8590 | 7076.9350 | 2299.0 | 3270.0 | 0.7031 | 2241.0 | 0.6853 | 1413.0 | 1464.0 | 2026.0 | 0.7226 | 0.6974 | 819.0 | 835.0 | 1231.0 | 0.6783 | 0.6653 |
| 0.0 | 10.0 | 30 | 2.2394 | 0.0059 | 10564.7680 | 7322.9392 | 2289.0 | 3270.0 | 0.7 | 2231.0 | 0.6823 | 1436.0 | 1485.0 | 2026.0 | 0.7330 | 0.7088 | 786.0 | 804.0 | 1231.0 | 0.6531 | 0.6385 |
| 0.0 | 11.0 | 33 | 2.2853 | 0.0059 | 10781.2871 | 7473.0188 | 2289.0 | 3270.0 | 0.7 | 2239.0 | 0.6847 | 1459.0 | 1500.0 | 2026.0 | 0.7404 | 0.7201 | 771.0 | 789.0 | 1231.0 | 0.6409 | 0.6263 |
| 0.0 | 12.0 | 36 | 2.3170 | 0.0059 | 10930.7882 | 7576.6450 | 2293.0 | 3270.0 | 0.7012 | 2249.0 | 0.6878 | 1474.0 | 1513.0 | 2026.0 | 0.7468 | 0.7275 | 766.0 | 780.0 | 1231.0 | 0.6336 | 0.6223 |
| 0.0 | 13.0 | 39 | 2.3358 | 0.0059 | 11019.6269 | 7638.2233 | 2291.0 | 3270.0 | 0.7006 | 2250.0 | 0.6881 | 1480.0 | 1519.0 | 2026.0 | 0.7498 | 0.7305 | 761.0 | 772.0 | 1231.0 | 0.6271 | 0.6182 |
| 0.0 | 14.0 | 42 | 2.3477 | 0.0059 | 11075.6473 | 7677.0537 | 2288.0 | 3270.0 | 0.6997 | 2249.0 | 0.6878 | 1484.0 | 1521.0 | 2026.0 | 0.7507 | 0.7325 | 756.0 | 767.0 | 1231.0 | 0.6231 | 0.6141 |
| 0.0 | 15.0 | 45 | 2.3567 | 0.0059 | 11118.0345 | 7706.4343 | 2285.0 | 3270.0 | 0.6988 | 2248.0 | 0.6875 | 1483.0 | 1517.0 | 2026.0 | 0.7488 | 0.7320 | 756.0 | 768.0 | 1231.0 | 0.6239 | 0.6141 |
| 0.0 | 16.0 | 48 | 2.3619 | 0.0059 | 11142.3851 | 7723.3128 | 2282.0 | 3270.0 | 0.6979 | 2248.0 | 0.6875 | 1483.0 | 1517.0 | 2026.0 | 0.7488 | 0.7320 | 756.0 | 765.0 | 1231.0 | 0.6214 | 0.6141 |
| 0.0 | 17.0 | 51 | 2.3645 | 0.0059 | 11154.6211 | 7731.7942 | 2292.0 | 3270.0 | 0.7009 | 2256.0 | 0.6899 | 1489.0 | 1524.0 | 2026.0 | 0.7522 | 0.7349 | 758.0 | 768.0 | 1231.0 | 0.6239 | 0.6158 |
| 0.0 | 18.0 | 54 | 2.3710 | 0.0059 | 11185.5857 | 7753.2572 | 2283.0 | 3270.0 | 0.6982 | 2251.0 | 0.6884 | 1485.0 | 1517.0 | 2026.0 | 0.7488 | 0.7330 | 757.0 | 766.0 | 1231.0 | 0.6223 | 0.6149 |
| 0.0 | 19.0 | 57 | 2.3719 | 0.0059 | 11189.8794 | 7756.2333 | 2285.0 | 3270.0 | 0.6988 | 2252.0 | 0.6887 | 1488.0 | 1520.0 | 2026.0 | 0.7502 | 0.7345 | 755.0 | 765.0 | 1231.0 | 0.6214 | 0.6133 |
| 0.0 | 20.0 | 60 | 2.3739 | 0.0059 | 11199.2181 | 7762.7064 | 2287.0 | 3270.0 | 0.6994 | 2255.0 | 0.6896 | 1489.0 | 1520.0 | 2026.0 | 0.7502 | 0.7349 | 757.0 | 767.0 | 1231.0 | 0.6231 | 0.6149 |
| 0.0 | 21.0 | 63 | 2.3731 | 0.0059 | 11195.3841 | 7760.0489 | 2287.0 | 3270.0 | 0.6994 | 2255.0 | 0.6896 | 1491.0 | 1521.0 | 2026.0 | 0.7507 | 0.7359 | 755.0 | 766.0 | 1231.0 | 0.6223 | 0.6133 |
| 0.0 | 22.0 | 66 | 2.3758 | 0.0059 | 11208.0963 | 7768.8604 | 2285.0 | 3270.0 | 0.6988 | 2258.0 | 0.6905 | 1494.0 | 1522.0 | 2026.0 | 0.7512 | 0.7374 | 756.0 | 763.0 | 1231.0 | 0.6198 | 0.6141 |
| 0.0 | 23.0 | 69 | 2.3778 | 0.0059 | 11217.4939 | 7775.3743 | 2284.0 | 3270.0 | 0.6985 | 2255.0 | 0.6896 | 1493.0 | 1521.0 | 2026.0 | 0.7507 | 0.7369 | 753.0 | 763.0 | 1231.0 | 0.6198 | 0.6117 |
| 0.0 | 24.0 | 72 | 2.3792 | 0.0059 | 11224.3777 | 7780.1458 | 2289.0 | 3270.0 | 0.7 | 2258.0 | 0.6905 | 1491.0 | 1522.0 | 2026.0 | 0.7512 | 0.7359 | 758.0 | 767.0 | 1231.0 | 0.6231 | 0.6158 |
| 0.0 | 25.0 | 75 | 2.3799 | 0.0059 | 11227.6572 | 7782.4189 | 2290.0 | 3270.0 | 0.7003 | 2260.0 | 0.6911 | 1493.0 | 1522.0 | 2026.0 | 0.7512 | 0.7369 | 759.0 | 768.0 | 1231.0 | 0.6239 | 0.6166 |
| 0.0 | 26.0 | 78 | 2.3831 | 0.0059 | 11242.5989 | 7792.7757 | 2283.0 | 3270.0 | 0.6982 | 2251.0 | 0.6884 | 1488.0 | 1520.0 | 2026.0 | 0.7502 | 0.7345 | 754.0 | 763.0 | 1231.0 | 0.6198 | 0.6125 |
| 0.0 | 27.0 | 81 | 2.3824 | 0.0059 | 11239.3411 | 7790.5176 | 2287.0 | 3270.0 | 0.6994 | 2259.0 | 0.6908 | 1492.0 | 1520.0 | 2026.0 | 0.7502 | 0.7364 | 758.0 | 767.0 | 1231.0 | 0.6231 | 0.6158 |
| 0.0 | 28.0 | 84 | 2.3854 | 0.0059 | 11253.1701 | 7800.1031 | 2288.0 | 3270.0 | 0.6997 | 2258.0 | 0.6905 | 1494.0 | 1522.0 | 2026.0 | 0.7512 | 0.7374 | 756.0 | 766.0 | 1231.0 | 0.6223 | 0.6141 |
| 0.0 | 29.0 | 87 | 2.3858 | 0.0059 | 11255.3105 | 7801.5867 | 2292.0 | 3270.0 | 0.7009 | 2261.0 | 0.6914 | 1496.0 | 1526.0 | 2026.0 | 0.7532 | 0.7384 | 756.0 | 766.0 | 1231.0 | 0.6223 | 0.6141 |
| 0.0 | 30.0 | 90 | 2.3892 | 0.0059 | 11271.3917 | 7812.7334 | 2285.0 | 3270.0 | 0.6988 | 2260.0 | 0.6911 | 1494.0 | 1520.0 | 2026.0 | 0.7502 | 0.7374 | 757.0 | 765.0 | 1231.0 | 0.6214 | 0.6149 |
| 0.0 | 31.0 | 93 | 2.3900 | 0.0059 | 11274.9614 | 7815.2077 | 2287.0 | 3270.0 | 0.6994 | 2259.0 | 0.6908 | 1493.0 | 1521.0 | 2026.0 | 0.7507 | 0.7369 | 757.0 | 766.0 | 1231.0 | 0.6223 | 0.6149 |
| 0.0 | 32.0 | 96 | 2.3922 | 0.0059 | 11285.5291 | 7822.5327 | 2285.0 | 3270.0 | 0.6988 | 2256.0 | 0.6899 | 1492.0 | 1520.0 | 2026.0 | 0.7502 | 0.7364 | 755.0 | 765.0 | 1231.0 | 0.6214 | 0.6133 |
| 0.0 | 33.0 | 99 | 2.3909 | 0.0059 | 11279.2140 | 7818.1554 | 2287.0 | 3270.0 | 0.6994 | 2262.0 | 0.6917 | 1493.0 | 1518.0 | 2026.0 | 0.7493 | 0.7369 | 760.0 | 769.0 | 1231.0 | 0.6247 | 0.6174 |
| 0.0 | 34.0 | 102 | 2.3927 | 0.0059 | 11287.6169 | 7823.9798 | 2283.0 | 3270.0 | 0.6982 | 2259.0 | 0.6908 | 1495.0 | 1519.0 | 2026.0 | 0.7498 | 0.7379 | 755.0 | 764.0 | 1231.0 | 0.6206 | 0.6133 |
| 0.0 | 35.0 | 105 | 2.3930 | 0.0059 | 11289.1913 | 7825.0711 | 2285.0 | 3270.0 | 0.6988 | 2258.0 | 0.6905 | 1494.0 | 1523.0 | 2026.0 | 0.7517 | 0.7374 | 755.0 | 762.0 | 1231.0 | 0.6190 | 0.6133 |
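
The metric columns appear to be related in a simple way. The following check against the epoch-5 row (the checkpoint whose results are reported at the top of this card) illustrates the apparent relationships; they are inferred from the reported numbers, not taken from the training code.

```python
# Quick consistency check on the epoch-5 row; relationships below are inferred.
import math

correct_preds, total_preds = 2337.0, 3270.0
accumulated_loss_nats = 5369.4990

print(correct_preds / total_preds)          # ~0.7147 -> matches "Accuracy"
print(accumulated_loss_nats / math.log(2))  # ~7746.55 -> matches "Mdl" (loss in bits)
```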

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
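
A local environment can be compared against these versions with a short check (note that the PyTorch package is installed as `torch`):

```python
# Print installed versions of the libraries listed above for comparison.
import importlib.metadata as metadata

for pkg in ("transformers", "torch", "datasets", "tokenizers"):
    print(pkg, metadata.version(pkg))
```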