Who the hell is thinking up these concepts? The news has been running stories on ChatGPT giving dangerous hallucinations and suicide instructions, and mimicking love and attachment.
In short, the only way to wind up with an LLM that’s probably safe for kids would be to start training from zero. Give it absolutely no exposure to anything that wasn’t curated from the start… say, an initial dataset that’s a catalog of Mr. Rogers and Sesame Street scripts. Starting from “everything on the internet” and then trying to restrict it down is a fool’s errand. That’s like trying to make a porn blocker with a blacklist strategy.
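To make the blacklist comparison concrete, here is a minimal sketch of the two filtering strategies. The hostnames and both lists are made up for illustration; no real filter is this simple.

```python
# Hypothetical lists, purely for illustration.
BLOCKLIST = {"known-bad-site.example"}
ALLOWLIST = {"curated-kids-site.example"}


def blocklist_allows(host: str) -> bool:
    """Blacklist strategy: allow everything that is not explicitly listed as bad."""
    return host not in BLOCKLIST


def allowlist_allows(host: str) -> bool:
    """Curated strategy: allow only what was explicitly vetted."""
    return host in ALLOWLIST


# A bad site nobody has catalogued yet slips past the blacklist,
# while the curated list rejects it by default.
unknown_host = "brand-new-bad-site.example"
print(blocklist_allows(unknown_host), allowlist_allows(unknown_host))
```

The blacklist has to know about every bad thing in advance, while the curated list only has to know about the good things, which is the same asymmetry as restricting down from “everything on the internet”.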
That doesn’t work either. The technology itself requires training on absolutely bonkers massive datasets to function the way we expect it to. You can’t just “only train it on Mr. Rogers and Sesame Street” because that won’t result in an LLM anymore; it will simply be a tiny “Mr. Rogers and Sesame Street word model” with such extremely limited capabilities that you won’t even recognize it as an AI or chatbot at all. No matter how much you train it on such a limited dataset, it will never appear any “smarter”. Having read and consumed almost all of humanity’s entire corpus of collective knowledge and fiction, and having near-instant access to everything added to it over time, is the special sauce. That’s what it takes to make it appear “smart”. But it’s not. And if you undo that and restrict it to such a limited dataset, it will never even appear to be smart in the first place.
Even if you expanded it to every “suitable for children” piece of content you can find, assuming such a thing were even possible to define and acquire, there is probably simply not enough of it in the world to train the sort of thing we currently use for chatbots, let alone in a way that lets you be sure it will be safe and suitable for children. Insofar as anyone can be sure about a random typewriter not eventually producing Mein Kampf Children’s Edition. It’s random; it’s just an infinite generator for finite probabilities. The most we could perhaps say is that it would probably be safe, with a very generous statistical amount of certainty. But not actual certainty, because it’s random, and safety is not a fixed target anyway; it moves.
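A rough way to put numbers on “probably safe, but not actually certain”: assume each response independently has some tiny chance p of being unsafe, and ask how likely it is that at least one bad response shows up across n responses. Both the independence assumption and the value of p below are hypothetical, not measurements of any real model.

```python
def prob_at_least_one_unsafe(p_unsafe: float, n_responses: int) -> float:
    """Chance that at least one of n independent responses is unsafe: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_unsafe) ** n_responses


# Hypothetical one-in-a-million chance per response.
p = 1e-6
for n in (1_000, 1_000_000, 100_000_000):
    print(f"{n:>11,} responses -> {prob_at_least_one_unsafe(p, n):.4f}")
```

Under these assumptions, a per-response risk that looks negligible stops looking negligible once enough kids are talking to the thing often enough.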
The whole thing is a fool’s errand. We’re being fed bullshit for profit and to misinform and manipulate us. It was bullshit from day 0. It’s still bullshit. It was never about, and will never be about, anything other than profit and manipulation. The technology itself is very interesting and MAY have real applications with real value, but right now what it’s being sold as and for is almost exclusively pure, high-grade bullshit that the entire economy has started snorting and getting high on, thinking we have finally reached the tech singularity, won capitalism and solved all the world’s problems. They are wrong. We will have to be ready to make sure the consequences of them being so wrong are not inflicted on innocent people, because if we can do that, then the consequences will simply be really funny and satisfying to watch. I’m not holding my breath for that, but a man can dream.
Agreed there too: there probably could never be a large enough pool of safely organized data to actually make a child-safe ChatGPT. But yeah, that’s kind of the key issue, and I agree with it.
Also agreed on the inevitable implosion of generative AI. I think we’re basically hitting the limits of its version of Moore’s law, where there isn’t enough data to even remotely train it much further than it is, and with the mass output of AI data now, we’ll certainly be hitting the extreme repercussions of “incestuous data” very soon. (I.e., the internet is getting flooded with AI slop, AI models are searching the internet, and it won’t be long before the copy-of-a-copy-of-a-copy problem leads to everything going backwards.)
So yeah, agreed, AI was always bullshit from day 1. The greatest flaw was releasing it to the public in its infancy, with its very early, super impressive demos in a form that your average boomer accountant could see and go “that’s amazing”.
There is also the privacy concern (which isn’t new) about toys that upload everything your child says to the company’s servers. There’s a concern about the privacy of your child’s words, but also about the corporation getting recordings of their voice, given all the nefarious purposes a voice recording can be used for these days (surveillance voice recognition, deepfakes, etc.). Plus the toy could be listening and recording at any time.
In the future, local AI models will solve this problem! Then parents will be complaining about how hot the toy is and it’ll get recalled because little kids everywhere kept getting “GPU burns”.
Easy way to add the AI buzzword: just slap together a transcriber, a relay to ChatGPT, and text-to-speech. Middle manager makes a presentation to corporate, and blam!
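For anyone wondering how thin that stack is, here is a sketch of the three stages wired together. Every function here is a hypothetical stand-in: a real toy would call an actual speech-to-text service, a hosted chatbot API, and a text-to-speech engine, none of which are shown.

```python
from typing import Callable


def make_toy_pipeline(
    transcribe: Callable[[bytes], str],
    ask_chatbot: Callable[[str], str],
    speak: Callable[[str], bytes],
) -> Callable[[bytes], bytes]:
    """Chain the three stages: audio in -> text -> chatbot reply -> audio out."""

    def respond(audio: bytes) -> bytes:
        text = transcribe(audio)   # stage 1: speech-to-text
        reply = ask_chatbot(text)  # stage 2: relay to a remote chatbot
        return speak(reply)        # stage 3: text-to-speech

    return respond


# Stub stages so the sketch runs on its own (hypothetical, not real services).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_chatbot(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_speak(text: str) -> bytes:
    return text.encode("utf-8")


toy = make_toy_pipeline(fake_transcribe, fake_chatbot, fake_speak)
print(toy(b"hello"))  # prints b'You said: hello'
```

Note what is missing from the sketch: nothing between the stages checks what the chatbot sends back, and in a real version the child’s audio or transcript has to leave the toy to reach stage 2.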