Max model len check
#1 by rekrek - opened
Is this intentional: "max_position_embeddings": 8192,
compared to
"max_position_embeddings": 262144
in the non-GPTQ version?
Thanks
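For reference, a minimal sketch of why this field matters in practice (the two values are taken from the config.json files quoted above; whether the smaller value is intentional is exactly the open question):

```python
import json

# Values quoted above from the two config.json files
gptq_cfg = json.loads('{"max_position_embeddings": 8192}')
base_cfg = json.loads('{"max_position_embeddings": 262144}')

# Inference servers such as vLLM derive their default max model length
# from this field, so the GPTQ checkpoint would cap the usable context
# at 8192 tokens unless the limit is overridden at load time.
shrink_factor = base_cfg["max_position_embeddings"] // gptq_cfg["max_position_embeddings"]
print(shrink_factor)  # → 32
```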
Also, is its file size normal? It looks closer to int8 in size. I prefer AWQ for quantization on vLLM; it seems faster/better.
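As a back-of-the-envelope sanity check on the size question (the group size and per-group overhead here are assumptions for illustration, not values read from this repo's quantize_config, and the 30B parameter count is hypothetical):

```python
def approx_weight_size_gb(num_params: float, bits: int, group_size: int = 128) -> float:
    """Rough on-disk size of group-quantized weights: packed weight bits
    plus assumed per-group overhead (16-bit scale + 16-bit zero point)."""
    overhead_bits_per_weight = 32 / group_size
    return num_params * (bits + overhead_bits_per_weight) / 8 / 1e9

# Hypothetical 30B-parameter model:
print(round(approx_weight_size_gb(30e9, 4), 1))  # 4-bit GPTQ/AWQ → 15.9
print(round(approx_weight_size_gb(30e9, 8), 1))  # int8 for comparison → 30.9
```

If the checkpoint's size is near the 8-bit estimate rather than the 4-bit one, that would support the suspicion above.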
Thanks again