# BoolQ_Llama-3.2-1B-26t8ytsb
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.6420
- Model Preparation Time: 0.0059
- Mdl: 7746.5496
- Accumulated Loss: 5369.4990
- Correct Preds: 2337.0
- Total Preds: 3270.0
- Accuracy: 0.7147
- Correct Gen Preds: 2302.0
- Gen Accuracy: 0.7040
- Correct Gen Preds 9642: 1525.0
- Correct Preds 9642: 1556.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.7680
- Gen Accuracy 9642: 0.7527
- Correct Gen Preds 2822: 768.0
- Correct Preds 2822: 781.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.6344
- Gen Accuracy 2822: 0.6239
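The reported metrics are internally consistent and can be re-derived from the raw counts. As a sanity check (the interpretation of "Mdl" as the accumulated loss converted from nats to bits is an assumption, not documented in the card):

```python
import math

# Reported counts from the final checkpoint (epoch 5)
correct_preds, total_preds = 2337, 3270
correct_gen_preds = 2302
accumulated_loss_nats = 5369.4990  # summed evaluation loss, in nats

accuracy = correct_preds / total_preds          # ~0.7147, matches the card
gen_accuracy = correct_gen_preds / total_preds  # ~0.7040, matches the card

# "Mdl" appears to be the accumulated loss expressed in bits
# (a minimum-description-length view): nats divided by ln(2).
mdl_bits = accumulated_loss_nats / math.log(2)  # ~7746.55, matches the card
```

The per-label figures check out the same way, e.g. 1556 / 2026 ≈ 0.7680 and 781 / 1231 ≈ 0.6344.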
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
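The hyperparameters above can be expressed as a `transformers.TrainingArguments` configuration. This is a hypothetical reconstruction: `output_dir` and any arguments not listed in the card (gradient accumulation, weight decay, etc.) are assumptions left at their defaults.

```python
from transformers import TrainingArguments

# Sketch of the training configuration implied by the card;
# output_dir is a placeholder, unlisted arguments keep their defaults.
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-26t8ytsb",
    learning_rate=2e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```

Note that with only 3 optimizer steps per epoch (see the table below), a warmup ratio of 0.01 over 100 epochs amounts to roughly the first 3 steps of training.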
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0059 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.5194 | 1.0 | 3 | 0.7779 | 0.0059 | 3669.7865 | 2543.7022 | 1857.0 | 3270.0 | 0.5679 | 1701.0 | 0.5202 | 670.0 | 760.0 | 2026.0 | 0.3751 | 0.3307 | 1022.0 | 1097.0 | 1231.0 | 0.8911 | 0.8302 |
| 0.2429 | 2.0 | 6 | 0.8811 | 0.0059 | 4156.8448 | 2881.3053 | 2088.0 | 3270.0 | 0.6385 | 1960.0 | 0.5994 | 1723.0 | 1837.0 | 2026.0 | 0.9067 | 0.8504 | 228.0 | 251.0 | 1231.0 | 0.2039 | 0.1852 |
| 0.0895 | 3.0 | 9 | 0.7633 | 0.0059 | 3601.1554 | 2496.1307 | 2252.0 | 3270.0 | 0.6887 | 2036.0 | 0.6226 | 1357.0 | 1527.0 | 2026.0 | 0.7537 | 0.6698 | 670.0 | 725.0 | 1231.0 | 0.5890 | 0.5443 |
| 0.3695 | 4.0 | 12 | 2.0120 | 0.0059 | 9491.7849 | 6579.2040 | 2288.0 | 3270.0 | 0.6997 | 2274.0 | 0.6954 | 1905.0 | 1917.0 | 2026.0 | 0.9462 | 0.9403 | 360.0 | 371.0 | 1231.0 | 0.3014 | 0.2924 |
| 0.0001 | 5.0 | 15 | 1.6420 | 0.0059 | 7746.5496 | 5369.4990 | 2337.0 | 3270.0 | 0.7147 | 2302.0 | 0.7040 | 1525.0 | 1556.0 | 2026.0 | 0.7680 | 0.7527 | 768.0 | 781.0 | 1231.0 | 0.6344 | 0.6239 |
| 0.0 | 6.0 | 18 | 1.8149 | 0.0059 | 8562.1293 | 5934.8158 | 2298.0 | 3270.0 | 0.7028 | 2231.0 | 0.6823 | 1282.0 | 1340.0 | 2026.0 | 0.6614 | 0.6328 | 940.0 | 958.0 | 1231.0 | 0.7782 | 0.7636 |
| 0.0006 | 7.0 | 21 | 1.9537 | 0.0059 | 9216.6870 | 6388.5206 | 2267.0 | 3270.0 | 0.6933 | 2202.0 | 0.6734 | 1241.0 | 1299.0 | 2026.0 | 0.6412 | 0.6125 | 952.0 | 968.0 | 1231.0 | 0.7864 | 0.7734 |
| 0.0 | 8.0 | 24 | 2.0340 | 0.0059 | 9595.7529 | 6651.2691 | 2302.0 | 3270.0 | 0.7040 | 2246.0 | 0.6869 | 1366.0 | 1414.0 | 2026.0 | 0.6979 | 0.6742 | 871.0 | 888.0 | 1231.0 | 0.7214 | 0.7076 |
| 0.0 | 9.0 | 27 | 2.1642 | 0.0059 | 10209.8590 | 7076.9350 | 2299.0 | 3270.0 | 0.7031 | 2241.0 | 0.6853 | 1413.0 | 1464.0 | 2026.0 | 0.7226 | 0.6974 | 819.0 | 835.0 | 1231.0 | 0.6783 | 0.6653 |
| 0.0 | 10.0 | 30 | 2.2394 | 0.0059 | 10564.7680 | 7322.9392 | 2289.0 | 3270.0 | 0.7 | 2231.0 | 0.6823 | 1436.0 | 1485.0 | 2026.0 | 0.7330 | 0.7088 | 786.0 | 804.0 | 1231.0 | 0.6531 | 0.6385 |
| 0.0 | 11.0 | 33 | 2.2853 | 0.0059 | 10781.2871 | 7473.0188 | 2289.0 | 3270.0 | 0.7 | 2239.0 | 0.6847 | 1459.0 | 1500.0 | 2026.0 | 0.7404 | 0.7201 | 771.0 | 789.0 | 1231.0 | 0.6409 | 0.6263 |
| 0.0 | 12.0 | 36 | 2.3170 | 0.0059 | 10930.7882 | 7576.6450 | 2293.0 | 3270.0 | 0.7012 | 2249.0 | 0.6878 | 1474.0 | 1513.0 | 2026.0 | 0.7468 | 0.7275 | 766.0 | 780.0 | 1231.0 | 0.6336 | 0.6223 |
| 0.0 | 13.0 | 39 | 2.3358 | 0.0059 | 11019.6269 | 7638.2233 | 2291.0 | 3270.0 | 0.7006 | 2250.0 | 0.6881 | 1480.0 | 1519.0 | 2026.0 | 0.7498 | 0.7305 | 761.0 | 772.0 | 1231.0 | 0.6271 | 0.6182 |
| 0.0 | 14.0 | 42 | 2.3477 | 0.0059 | 11075.6473 | 7677.0537 | 2288.0 | 3270.0 | 0.6997 | 2249.0 | 0.6878 | 1484.0 | 1521.0 | 2026.0 | 0.7507 | 0.7325 | 756.0 | 767.0 | 1231.0 | 0.6231 | 0.6141 |
| 0.0 | 15.0 | 45 | 2.3567 | 0.0059 | 11118.0345 | 7706.4343 | 2285.0 | 3270.0 | 0.6988 | 2248.0 | 0.6875 | 1483.0 | 1517.0 | 2026.0 | 0.7488 | 0.7320 | 756.0 | 768.0 | 1231.0 | 0.6239 | 0.6141 |
| 0.0 | 16.0 | 48 | 2.3619 | 0.0059 | 11142.3851 | 7723.3128 | 2282.0 | 3270.0 | 0.6979 | 2248.0 | 0.6875 | 1483.0 | 1517.0 | 2026.0 | 0.7488 | 0.7320 | 756.0 | 765.0 | 1231.0 | 0.6214 | 0.6141 |
| 0.0 | 17.0 | 51 | 2.3645 | 0.0059 | 11154.6211 | 7731.7942 | 2292.0 | 3270.0 | 0.7009 | 2256.0 | 0.6899 | 1489.0 | 1524.0 | 2026.0 | 0.7522 | 0.7349 | 758.0 | 768.0 | 1231.0 | 0.6239 | 0.6158 |
| 0.0 | 18.0 | 54 | 2.3710 | 0.0059 | 11185.5857 | 7753.2572 | 2283.0 | 3270.0 | 0.6982 | 2251.0 | 0.6884 | 1485.0 | 1517.0 | 2026.0 | 0.7488 | 0.7330 | 757.0 | 766.0 | 1231.0 | 0.6223 | 0.6149 |
| 0.0 | 19.0 | 57 | 2.3719 | 0.0059 | 11189.8794 | 7756.2333 | 2285.0 | 3270.0 | 0.6988 | 2252.0 | 0.6887 | 1488.0 | 1520.0 | 2026.0 | 0.7502 | 0.7345 | 755.0 | 765.0 | 1231.0 | 0.6214 | 0.6133 |
| 0.0 | 20.0 | 60 | 2.3739 | 0.0059 | 11199.2181 | 7762.7064 | 2287.0 | 3270.0 | 0.6994 | 2255.0 | 0.6896 | 1489.0 | 1520.0 | 2026.0 | 0.7502 | 0.7349 | 757.0 | 767.0 | 1231.0 | 0.6231 | 0.6149 |
| 0.0 | 21.0 | 63 | 2.3731 | 0.0059 | 11195.3841 | 7760.0489 | 2287.0 | 3270.0 | 0.6994 | 2255.0 | 0.6896 | 1491.0 | 1521.0 | 2026.0 | 0.7507 | 0.7359 | 755.0 | 766.0 | 1231.0 | 0.6223 | 0.6133 |
| 0.0 | 22.0 | 66 | 2.3758 | 0.0059 | 11208.0963 | 7768.8604 | 2285.0 | 3270.0 | 0.6988 | 2258.0 | 0.6905 | 1494.0 | 1522.0 | 2026.0 | 0.7512 | 0.7374 | 756.0 | 763.0 | 1231.0 | 0.6198 | 0.6141 |
| 0.0 | 23.0 | 69 | 2.3778 | 0.0059 | 11217.4939 | 7775.3743 | 2284.0 | 3270.0 | 0.6985 | 2255.0 | 0.6896 | 1493.0 | 1521.0 | 2026.0 | 0.7507 | 0.7369 | 753.0 | 763.0 | 1231.0 | 0.6198 | 0.6117 |
| 0.0 | 24.0 | 72 | 2.3792 | 0.0059 | 11224.3777 | 7780.1458 | 2289.0 | 3270.0 | 0.7 | 2258.0 | 0.6905 | 1491.0 | 1522.0 | 2026.0 | 0.7512 | 0.7359 | 758.0 | 767.0 | 1231.0 | 0.6231 | 0.6158 |
| 0.0 | 25.0 | 75 | 2.3799 | 0.0059 | 11227.6572 | 7782.4189 | 2290.0 | 3270.0 | 0.7003 | 2260.0 | 0.6911 | 1493.0 | 1522.0 | 2026.0 | 0.7512 | 0.7369 | 759.0 | 768.0 | 1231.0 | 0.6239 | 0.6166 |
| 0.0 | 26.0 | 78 | 2.3831 | 0.0059 | 11242.5989 | 7792.7757 | 2283.0 | 3270.0 | 0.6982 | 2251.0 | 0.6884 | 1488.0 | 1520.0 | 2026.0 | 0.7502 | 0.7345 | 754.0 | 763.0 | 1231.0 | 0.6198 | 0.6125 |
| 0.0 | 27.0 | 81 | 2.3824 | 0.0059 | 11239.3411 | 7790.5176 | 2287.0 | 3270.0 | 0.6994 | 2259.0 | 0.6908 | 1492.0 | 1520.0 | 2026.0 | 0.7502 | 0.7364 | 758.0 | 767.0 | 1231.0 | 0.6231 | 0.6158 |
| 0.0 | 28.0 | 84 | 2.3854 | 0.0059 | 11253.1701 | 7800.1031 | 2288.0 | 3270.0 | 0.6997 | 2258.0 | 0.6905 | 1494.0 | 1522.0 | 2026.0 | 0.7512 | 0.7374 | 756.0 | 766.0 | 1231.0 | 0.6223 | 0.6141 |
| 0.0 | 29.0 | 87 | 2.3858 | 0.0059 | 11255.3105 | 7801.5867 | 2292.0 | 3270.0 | 0.7009 | 2261.0 | 0.6914 | 1496.0 | 1526.0 | 2026.0 | 0.7532 | 0.7384 | 756.0 | 766.0 | 1231.0 | 0.6223 | 0.6141 |
| 0.0 | 30.0 | 90 | 2.3892 | 0.0059 | 11271.3917 | 7812.7334 | 2285.0 | 3270.0 | 0.6988 | 2260.0 | 0.6911 | 1494.0 | 1520.0 | 2026.0 | 0.7502 | 0.7374 | 757.0 | 765.0 | 1231.0 | 0.6214 | 0.6149 |
| 0.0 | 31.0 | 93 | 2.3900 | 0.0059 | 11274.9614 | 7815.2077 | 2287.0 | 3270.0 | 0.6994 | 2259.0 | 0.6908 | 1493.0 | 1521.0 | 2026.0 | 0.7507 | 0.7369 | 757.0 | 766.0 | 1231.0 | 0.6223 | 0.6149 |
| 0.0 | 32.0 | 96 | 2.3922 | 0.0059 | 11285.5291 | 7822.5327 | 2285.0 | 3270.0 | 0.6988 | 2256.0 | 0.6899 | 1492.0 | 1520.0 | 2026.0 | 0.7502 | 0.7364 | 755.0 | 765.0 | 1231.0 | 0.6214 | 0.6133 |
| 0.0 | 33.0 | 99 | 2.3909 | 0.0059 | 11279.2140 | 7818.1554 | 2287.0 | 3270.0 | 0.6994 | 2262.0 | 0.6917 | 1493.0 | 1518.0 | 2026.0 | 0.7493 | 0.7369 | 760.0 | 769.0 | 1231.0 | 0.6247 | 0.6174 |
| 0.0 | 34.0 | 102 | 2.3927 | 0.0059 | 11287.6169 | 7823.9798 | 2283.0 | 3270.0 | 0.6982 | 2259.0 | 0.6908 | 1495.0 | 1519.0 | 2026.0 | 0.7498 | 0.7379 | 755.0 | 764.0 | 1231.0 | 0.6206 | 0.6133 |
| 0.0 | 35.0 | 105 | 2.3930 | 0.0059 | 11289.1913 | 7825.0711 | 2285.0 | 3270.0 | 0.6988 | 2258.0 | 0.6905 | 1494.0 | 1523.0 | 2026.0 | 0.7517 | 0.7374 | 755.0 | 762.0 | 1231.0 | 0.6190 | 0.6133 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1