Max model len check
#1 by rekrek - opened
Is this intentional: "max_position_embeddings": 8192,
compared to
"max_position_embeddings": 262144
in the non-GPTQ version?
Thanks
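For reference, a minimal sketch of why this field matters in practice (the two values are taken from the config.json files quoted above; whether the smaller value is intentional is exactly the open question):

```python
import json

# Values quoted above from the two config.json files
gptq_cfg = json.loads('{"max_position_embeddings": 8192}')
base_cfg = json.loads('{"max_position_embeddings": 262144}')

# Inference servers such as vLLM derive their default max model length
# from this field, so the GPTQ checkpoint would cap the usable context
# at 8192 tokens unless the limit is overridden at load time.
shrink_factor = base_cfg["max_position_embeddings"] // gptq_cfg["max_position_embeddings"]
print(shrink_factor)  # → 32
```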
Also, is its file size normal? It looks closer to int8 in size. I prefer AWQ for quantization on vLLM; it seems faster/better.
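As a back-of-the-envelope sanity check on the size question (the group size and per-group overhead here are assumptions for illustration, not values read from this repo's quantize_config, and the 30B parameter count is hypothetical):

```python
def approx_weight_size_gb(num_params: float, bits: int, group_size: int = 128) -> float:
    """Rough on-disk size of group-quantized weights: packed weight bits
    plus assumed per-group overhead (16-bit scale + 16-bit zero point)."""
    overhead_bits_per_weight = 32 / group_size
    return num_params * (bits + overhead_bits_per_weight) / 8 / 1e9

# Hypothetical 30B-parameter model:
print(round(approx_weight_size_gb(30e9, 4), 1))  # 4-bit GPTQ/AWQ → 15.9
print(round(approx_weight_size_gb(30e9, 8), 1))  # int8 for comparison → 30.9
```

If the checkpoint's size is near the 8-bit estimate rather than the 4-bit one, that would support the suspicion above.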
Thanks again