Pretrained models from the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"
Zayd Muhammad Kawakibi Zuhri PRO
zaydzuhri
AI & ML interests
I really like watching loss go down
Recent Activity
Updated a model 36 minutes ago: zaydzuhri/top-code-7B-4096-batch8x2-steps40000-2-20260114-083636
Published a model 3 days ago: zaydzuhri/top-code-7B-4096-batch8x2-steps40000-2-20260114-083636
Updated a model 8 days ago: zaydzuhri/dsmtp-math-7B-4096-model