nvidia could easily switch gears from hyperscalers to "run your deepseek locally and don't leak information, just buy 8 blackwells per location". (although do people even buy GPUs for inference? idk) (according to a trusted source, i.e. a random commenter at TPU, an epyc can run it at 5 tokens/s, so that's roughly 20k bux, but it's unfeasible for multi-user serving at those speeds)
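A quick back-of-envelope sketch of why 5 tokens/s doesn't scale to multiple users: CPU decode throughput is shared, so per-user speed divides by the number of concurrent users. The throughput and response-length figures below are illustrative assumptions, not benchmarks.

```python
# Back-of-envelope: why ~5 tokens/s is unusable for multi-user serving.
# All numbers are assumptions for illustration, not measurements.
TOKENS_PER_SEC = 5.0    # claimed single-epyc decode throughput (hearsay)
RESPONSE_TOKENS = 500   # rough length of one chat reply (assumption)

def seconds_per_response(concurrent_users: int) -> float:
    """Shared decode throughput: per-user tokens/s divides by user count."""
    per_user_tps = TOKENS_PER_SEC / concurrent_users
    return RESPONSE_TOKENS / per_user_tps

for users in (1, 4, 16):
    print(f"{users:>2} users: {seconds_per_response(users):>5.0f} s per reply")
```

Even one user waits ~100 s for a 500-token reply; at 16 users it's over 25 minutes each, which is the whole "unfeasible" point.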