Add proper metadata
#101 opened about 13 hours ago
by
osanseviero
This model is so good, but the KV cache ruins it.
#100 opened 1 day ago
by
UniversalLove333
Update README.md
#99 opened 3 days ago
by
hectorruiz9
o-O too good to be true
#98 opened 3 days ago
by
mekasu
Could ya' add GPT-J 6B to Google AI Studio? Completion mode
#97 opened 3 days ago
by
Tralalabs
Weird token after underscore `_//`
1
#96 opened 5 days ago
by
yamikumods
CodeGemma 4 collection
➕ 2
1
#95 opened 5 days ago
by
cbrain
fix(chat_template): Emit multimodal placeholders in tool response content-parts
3 reactions
#94 opened 6 days ago
by
harshaljanjani
[Feature request] Eliminate pre-attention RMSNorm in Gemma 4 via scale invariance + weight folding
#93 opened 6 days ago
by
graefics
Reviews of Gemma 4
1
#92 opened 8 days ago
by
Juanoto2012
Update chat_template.jinja to address JSON Schema shapes that do not expose their meaning through a direct top-level type
6 reactions
#91 opened 8 days ago
by
sigjhl
Possible chat_template.jinja issue: nullable $ref tool schemas are rendered as empty types
1
#87 opened 8 days ago
by
sigjhl
When training models from the gemma4 series using GRPO, an abnormally high grad norm was observed
#84 opened 9 days ago
by
mamazi00
add newlines and thinking tokens to template to avoid having to compute 3 extra tokens per generation in chat completion+reasoning
2 reactions
2
#83 opened 10 days ago
by
quasar-of-mikus
Update README.md
#81 opened 12 days ago
by
hectorruiz9
Gemma 4 models are way too paranoid about dates, any tips?
🔥 3
3
#80 opened 12 days ago
by
Ahugm
Incorrect output in Gemma 4: seeking a solution to the problem ( la la la )
6
#79 opened 12 days ago
by
Lintrarius
Fix chat_template: emit empty <|channel>thought\n<channel|> wrapper for existing asst turns
1
#78 opened 13 days ago
by
flotherxi
[Bug] chat_template: missing <|channel>thought\n<channel|> wrapper for non-thinking SFT / multi-turn
1
#77 opened 13 days ago
by
flotherxi
Thinking erratic at 30000+ context
1
#76 opened 13 days ago
by
JeslynMcKenzie
Multilingual Support List
➕ 1
#75 opened 13 days ago
by
abcdvzz
Will there be a small model like gemma-3-270m?
🔥 1
#74 opened 14 days ago
by
ymcki
Unexpected loss spikes and performance degradation when fine-tuning Gemma 4 (google/gemma-4-31B-it)
1
#73 opened 15 days ago
by
rstaruch
Add ParseBench evaluation results
4
#72 opened 19 days ago
by
boyang-runllama
Will there be a small model for speculative decoding?
3
#71 opened 20 days ago
by
Regrin
Imagen 1 (2022) Should Be Open Sourced
5 reactions
#70 opened 20 days ago
by
Tralalabs
Question about tool-calling order in chat_template.jinja
1
#67 opened 21 days ago
by
json0
gemma-4-31b-it unable to execute tool calling
3
#66 opened 21 days ago
by
Naman2302
Do Gemma 4 models work well?
3
#65 opened 22 days ago
by
Regrin
fix: embed chat_template in tokenizer_config.json
#64 opened 24 days ago
by
NERDDISCO
Infinite loop is not fixed even with Google API
1 reaction
2
#63 opened 24 days ago
by
alexcardo
Chat Template has a bug.
2 reactions
5
#62 opened 24 days ago
by
Reithan
why print rightarrow
3 reactions
5
#61 opened 25 days ago
by
wangtf-Kevin
Can anyone improve the model using the Rys methodology, by duplicating a block of layers?
11
#60 opened 25 days ago
by
Regrin
Strange behaviour of the tokenizer
2
#58 opened 27 days ago
by
andercorral
Good Workflow
2
#57 opened 27 days ago
by
anthoekfj
fix: function calling formatting in chat template
❤️ 2
1
#55 opened 28 days ago
by
RyanMullins
Chat template is so complicated that even Gemma 4 itself has no idea how to parse it
1
#53 opened 28 days ago
by
alexcardo
Hardware requirement
3 reactions
13
#52 opened 28 days ago
by
Charan01
Tokens per Image Parameter?
2
#51 opened 28 days ago
by
buckeye17
Guys please add the MTP to this model
🔥 5
5
#50 opened 28 days ago
by
Narutoouz
Will there be QAT models?
12 reactions
2
#49 opened 29 days ago
by
Regrin
Will Gemma 4 E4B be as encyclopedically well-read as the 12B model?
3
#48 opened 29 days ago
by
Regrin
Create BTS
#47 opened 29 days ago
by deleted
brokersponsor
1
#46 opened 29 days ago
by
Brokersponsor
Update README.md
#45 opened 29 days ago
by
Brokersponsor
Question about math_vision and mmmu_pro evaluation results
1
#44 opened 29 days ago
by
JjjjjZzz
The Gemma 4 model is great. But...
4 reactions
5
#43 opened 29 days ago
by
suitup91