• Xylight@lemdro.id · 3 days ago

    That is a thing, and it’s called quantization-aware training (QAT). Some open-weight models like Gemma do it.

    The problem is that you need to re-train the whole model with quantization in the loop, and if you also want a full-quality version on top of that, you need to train even more.

    The result is still less precise, so it’ll still be somewhat worse than full precision, but it noticeably reduces the quality loss compared to quantizing after training.
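
    For anyone curious what that looks like mechanically, here’s a minimal sketch of the fake-quantization trick QAT is built on, assuming PyTorch. The names (fake_quantize, QATLinear) are illustrative, not from any particular library:

        import torch

        def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
            # Simulate low precision in the forward pass: snap each weight
            # to the nearest level of a symmetric integer grid, then scale back.
            qmax = 2 ** (bits - 1) - 1
            scale = w.detach().abs().max().clamp_min(1e-8) / qmax
            w_q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
            # Straight-through estimator: the forward pass uses the quantized
            # values, but the rounding is invisible to the gradient, so the
            # full-precision weights keep updating and learn to sit on spots
            # that survive quantization.
            return w + (w_q - w).detach()

        class QATLinear(torch.nn.Linear):
            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return torch.nn.functional.linear(x, fake_quantize(self.weight), self.bias)

    Training with a wrapper like this is why the full re-train is needed: the weights have to adapt to the quantization grid. At export time you’d store the actual integer weights plus the scale instead of rounding at inference.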