Max model len check

#1
by rekrek - opened

Is this intentional?
"max_position_embeddings": 8192,

compared to
"max_position_embeddings": 262144,

in the non-GPTQ version?
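For reference, the mismatch can be checked quickly; a minimal sketch using only the two `config.json` excerpts quoted above (no model download needed):

```python
import json

# Excerpts of the two config.json files, using the values quoted above.
gptq_config = json.loads('{"max_position_embeddings": 8192}')
full_config = json.loads('{"max_position_embeddings": 262144}')

# The GPTQ config advertises a much shorter context window.
ratio = full_config["max_position_embeddings"] // gptq_config["max_position_embeddings"]
print(ratio)  # 262144 / 8192 = 32
```

If the 8192 value is unintentional, note that serving frameworks such as vLLM typically derive the maximum context length from `max_position_embeddings`, so this build would likely be capped at 8192 tokens unless the config is edited or overridden (e.g. via `--max-model-len`).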

Thanks

Also, is its size normal? It looks more like an int8 checkpoint. I prefer AWQ for quantization on vLLM; it seems faster/better.
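On the size question, a rough back-of-envelope check; a minimal sketch with a hypothetical parameter count (the actual parameter count of this model isn't stated in the thread):

```python
# Rough weight-size estimate for quantized checkpoints.
# 30e9 is a hypothetical placeholder, not this model's actual parameter count.
params = 30e9

int8_gb = params * 1 / 1e9         # int8: 1 byte per weight
gptq_4bit_gb = params * 0.5 / 1e9  # GPTQ 4-bit: 0.5 bytes per weight (ignoring group-scale overhead)

print(f"int8 ~ {int8_gb:.0f} GB, GPTQ 4-bit ~ {gptq_4bit_gb:.0f} GB")
```

If the files on disk are close to the int8 estimate rather than the 4-bit one, that would support the suspicion that this isn't a 4-bit GPTQ export.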

Thanks again
