• plinky [he/him]@hexbear.net

    Nvidia could easily switch gears from hyperscalers to "run your DeepSeek locally and keep your data in-house, just buy 8 Blackwells per location." (Although do people even buy GPUs for inference? idk.) (According to a trusted source - a random commenter at TPU - an Epyc can run it at about 5 tokens/s, so that's roughly 20k bucks, but it's unfeasible for multiple users at those speeds.)
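
    For a sense of why a single CPU box doesn't work for multiple users, here's a rough back-of-envelope sketch in Python. It uses only the numbers above (~5 tokens/s total on a roughly $20k Epyc build); the equal per-user split and the 500-token reply length are illustrative assumptions, and real batching behavior would differ.

    ```python
    # Back-of-envelope: how a single Epyc box's throughput divides across users.
    # Numbers from the comment; the reply length is an assumed figure.

    EPYC_COST_USD = 20_000       # rough build cost mentioned above
    TOTAL_TOKENS_PER_S = 5.0     # reported CPU inference throughput
    AVG_RESPONSE_TOKENS = 500    # assumed typical reply length

    for users in (1, 2, 5, 10):
        per_user = TOTAL_TOKENS_PER_S / users    # naive equal split of throughput
        wait_s = AVG_RESPONSE_TOKENS / per_user  # time to finish one full reply
        print(f"{users:>2} concurrent users: {per_user:.2f} tok/s each, "
              f"~{wait_s / 60:.1f} min per {AVG_RESPONSE_TOKENS}-token reply")
    ```

    Even at 2-3 concurrent users you're looking at several minutes per reply, which is basically the "unfeasible for multiusers" point in numbers.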