- cross-posted to:
- news@lemmy.world
This month, the Internet Archive’s Wayback Machine archived its trillionth webpage, and the nonprofit invited its more than 1,200 library partners and 800,000 daily users to join a celebration of the moment. To honor “three decades of safeguarding the world’s online heritage,” the city of San Francisco declared October 22 to be “Internet Archive Day.” The Archive was also recently designated a federal depository library by Sen. Alex Padilla (D-Calif.), who proclaimed the organization a “perfect fit” to expand “access to federal government publications amid an increasingly digital landscape.”
The Internet Archive might sound like a thriving organization, but it only recently emerged from years of bruising copyright battles that threatened to bankrupt the beloved library project. In the end, the fight led to more than 500,000 books being removed from the Archive’s “Open Library.”
“We survived,” Internet Archive founder Brewster Kahle told Ars. “But it wiped out the Library.”
An Internet Archive spokesperson confirmed to Ars that the archive currently faces no major lawsuits and no active threats to its collections. Kahle thinks “the world became stupider” when the Open Library was gutted—but he’s moving forward with new ideas.
Distributed archives seem to be the way forward. It’s much harder to take something down if it’s spread across the globe and not controlled by a single entity.
There are some around. I know of https://annas-archive.org/ at least
It’s also much harder to guarantee preservation with a distributed archive. Example: torrents with 0 seeders.
That’s why you need more people, and why you need to spread the word. The more people and devices are dedicated to the archival process, the safer it is.
So that’s 5 times more overhead to guarantee the safety of the data, which means 5 times the cost, because it’s not like regular people have servers with lots of memory just sitting at home.
I have mixed feelings. I’m glad they survived the lawsuits, and now they can spend their funding on their actual goals rather than it going towards lawyers.
On the other hand, it’s really sad that they had to delete so much of their archive - over half a million books, and a bunch of recordings from their Great 78 Project (which was archiving 300k+ music albums released between ~1900 and 1950). A lot of the things that can’t be archived are eventually going to become lost media.
I really hope that they didn’t actually delete anything, and only removed public access.
And open themselves up to massive penalties? That would be beyond stupid.
I wouldn’t think a library/archive retaining data in an offline form would incur penalties, and I feel like preserving books for the future is the opposite of stupid.
Preserving is important, sure. But if the settlement required them to delete it and they keep an offline backup and this ever gets out, the settlement is voided and it opens up a world of hurt for them.
This is not a debate about the merits of preservation but about legal repercussions for the Internet Archive.
I didn’t know if it did or didn’t. But since you say that’s the case, that sucks and I hate the publishers even more.




