• 6 Posts
  • 724 Comments
Joined 1 year ago
Cake day: March 22nd, 2024






  • One can use ik_llama.cpp to run the dense layers on a 3090/4090 and offload the MoE layers to a Threadripper/EPYC CPU, with full support for its MLA attention scheme, at quite reasonable speeds. In other words, the full DeepSeek is surprisingly usable locally if you shoot for the right setup.

    And now we have something similar from Qwen, at “only” 235B.
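To put rough numbers on that split, here is a minimal back-of-the-envelope sketch. It is not anything from ik_llama.cpp itself: the function name, the expert parameter count, and the bits-per-weight figure are all placeholder assumptions. Only the 235B total comes from the comment above. It counts weights only and ignores the KV cache and activations.

```python
def split_memory_gib(total_params_b, expert_params_b, bits_per_weight):
    """Estimate weight memory for a GPU/CPU split of a MoE model.

    Dense (non-expert) weights are assumed to live on the GPU and
    expert weights in CPU RAM. Parameter counts are in billions.
    Returns (gpu_gib, cpu_gib), weights only.
    """
    if expert_params_b > total_params_b:
        raise ValueError("expert parameters cannot exceed total parameters")
    bytes_per_param = bits_per_weight / 8
    gib_per_billion = 1e9 * bytes_per_param / 2**30
    dense_params_b = total_params_b - expert_params_b
    return dense_params_b * gib_per_billion, expert_params_b * gib_per_billion


# Hypothetical inputs: 235B total (from the comment), with an ASSUMED
# 225B of that sitting in expert layers and an ASSUMED ~4.5 bits per weight.
gpu_gib, cpu_gib = split_memory_gib(235, 225, 4.5)
print(f"GPU (dense): {gpu_gib:.1f} GiB, CPU (experts): {cpu_gib:.1f} GiB")
```

Swap in the real expert/dense breakdown and quantization size of whatever model file you actually download; the point is only that the dense part can be small enough for a single consumer GPU while the experts need a lot of system RAM.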






  • It’s a bit scary because many of those things (Wikipedia, academic piracy) are being threatened and villainized, others (Reddit niches, maybe eventually YouTube) are hemorrhaging useful info, and utilitarian LLMs are simultaneously being vilified and enshittified by opposing political sides.

    Like, with the Qwen3 release, I just realized my internet barometer for “is it any good?” and technical info is totally gone… Reddit and other niches have withered away, Twitter/LinkedIn are pure engagement farms, and I can hardly discuss it anywhere else populated without getting banned as an alleged AI Bro (whom, for the record, I hate with a burning passion). I seriously considered joining WeChat just to see some sane discussion.

    This is true for other fandoms and niches I’m in.

    I hate to sound apocalyptic, but it feels like my information sphere is imploding. The real marker will be when the US government starts taking action against Wikipedia.


  • I found the post to be succinct and coherent.

    Some problems need 2 or 3 paragraphs to even begin to convey them. They could’ve said “the problem isn’t just capitalism,” and that would have been met with vitriol, as it doesn’t convey that the actual article is more nuanced than “anti solar,” that meeting variable power demand with solar supply is a challenge, that at some point one does indeed saturate regional demand for solar to the point that building more plants isn’t productive (which frequent negative prices are an indication of), and so on.

    And if that’s too long and complex, well… I dunno what to tell you.







  • My perspective is that research machine learning had been chugging along reasonably for years, without any fuss, until Altman went against OpenAI’s mandate and commercialized (and marketed) ChatGPT.

    Now it’s enshittified. And ruining shit. Thanks for that.

    One example I often cite is the utter shock in finance land at DeepSeek R1 coming out, when the research/tinkerer community saw that coming miles away.