• 0 Posts
  • 98 Comments
Joined 1 year ago
cake
Cake day: July 1st, 2023

help-circle


















  • The technology for quantisation has improved a lot this past year making very small quants viable for some uses. I think the general consensus is that an 8bit quant will be nearly identical to a full model. Though a 6bit quant can feel so close that you may not even notice any loss of quality.

    Going smaller than that is where the real trade off occurs. 2-3 bit quants of much larger models can absolutely surprise you, though they will probably be inconsistent.

    So it comes down to the task you’re trying to accomplish. If it’s programming related, 6bit and up for consistency with whatever the largest coding model you can fit. If it’s creative writing or something a much lower quant with a larger model is the way to go in my opinion.