Key Takeaways
- The Internet Archive preserves digital content, historical snapshots of websites, and public domain works to protect our digital history.
- Web pages are disappearing rapidly: roughly 38% of those that existed in 2013 are no longer accessible, which makes archives like the Wayback Machine essential.
- The Internet Archive safeguards more than just websites, including software, books, music archives, documentaries, and interactive experiences.
The Internet Archive, or IA, catalogs and preserves websites from the early days of the Internet. Some of these websites no longer exist, and IA is the only place we can go to get a snapshot of how they looked back then. A hack of this digital museum could mean losing these sites forever.
The Cultural Heritage of Ye Olde Internet
I got my first computer in 1996, but it wasn’t until 2001 that I experienced the internet in all its dial-up glory. I even had a GeoCities website, and if you remember what that is, I hope your back doesn’t hurt you. The Internet in those days felt so much more alive because we had more than just social media sites.
The slow march of time consumes all things, however, and many of those old websites disappeared into the ether. Luckily, someone had the foresight to save these digital pages (the most popular ones, anyway) as a virtual museum: the Internet Archive. The Internet Archive (and the Wayback Machine) allows you to see the Internet as it used to be.
Unfortunately, the Internet Archive had a rough few days in the last month, and was taken offline multiple times. While the site recovered and eventually returned to operation, its vulnerability opened the door to the question: “What happens if we lose the Internet Archive and the Wayback Machine?”
What Does the Internet Archive Protect, Anyway?
Why on earth would anyone want to preserve dead websites? They’re dead for a reason, right? Commercial value aside, these old websites can teach us a lot about the time and place in which they were created. The IA exists to protect several kinds of material, such as:
- Digital Content: IA saves digital content, ensuring that it doesn’t disappear from the internet, even if the website producing that content is no longer around or hosted.
- Historical Snapshots: The Wayback Machine saves periodic snapshots of websites, allowing users to see what a particular site looked like at a given point in time. It’s not just about nostalgia, either. Internet investigators can use these snapshots to verify information that is no longer available on the live web.
- Public Domain Works: IA actively curates and preserves public domain works, including books, films, and music. By digitizing and making these resources available, it promotes cultural heritage and access to knowledge.
By any measure, that’s a lot to keep alive. In many cases, IA is the only place on the Internet where some of this material can still be found. And that’s very concerning, given its vulnerability.
Losing Our Digital History
While it might sound like a paradox, digital history is real, and we’re losing pieces of it daily. According to the BBC, as many as a quarter of all web pages that existed at some point between 2013 and 2023 are no longer accessible. Older pages fare even worse: of those that existed in 2013, roughly 38% are now gone. We can depend on carved tablets to tell us who the worst copper merchant in Ur was, but we can’t do the same with websites because once they’re gone, they’re gone.
That’s where archives like the Wayback Machine are invaluable. For example, if we look at MySpace in the Wayback Machine, we can get a sense of the foundations of social media networks. While Facebook and other social media sites are significantly different today, internet archivists can weigh MySpace’s impact on the development and evolution of these sites.
Do you remember Wikitravel? Don’t worry: most people who didn’t travel before 2012 wouldn’t know about it, but the Wayback Machine remembers. Much of the site’s volunteer community left in 2012 to start Wikivoyage, but in its heyday it served as a proto-TripAdvisor, where people shared their insights and tips on publicly editable wiki pages. Today, Wikitravel’s snapshots show how much digital travel has evolved since then. Losing IA and the Wayback Machine means losing these snapshots and all they could tell us.
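If you would rather look up a snapshot programmatically than click through the calendar view, here is a minimal Python sketch. It assumes the Wayback Machine's public availability endpoint at archive.org/wayback/available and the JSON shape it has been documented to return (an "archived_snapshots" object with a "closest" entry); the function names are hypothetical and exist only for this illustration.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(site: str, timestamp: str) -> str:
    """Build the lookup URL. `timestamp` is a date such as YYYYMMDD."""
    params = urlencode({"url": site, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the URL of the closest archived snapshot, or None if there is none.

    Assumes a response shaped like:
    {"archived_snapshots": {"closest": {"available": true, "url": "..."}}}
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(site: str, timestamp: str, timeout: float = 10.0) -> Optional[str]:
    """Query the endpoint over the network and return the closest snapshot URL."""
    with urlopen(build_query(site, timestamp), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

Calling lookup("myspace.com", "20060101") would ask for the capture nearest to January 1, 2006. Whether a snapshot comes back, and which one, depends on what the archive holds at the time you ask.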
It’s Not Just Websites, Either
Most people might not know it, but IA is also a repository for software, books, and public domain documents. Here’s a cross-section of what else we stand to lose if IA goes down.
- Software: Old operating systems, such as the early versions of UNIX and MS-DOS, are preserved on IA, and anyone can access them and try them out.
- Legacy Applications: Some older readers may remember Lotus 1-2-3, the spreadsheet software. If the IA goes down, we could lose public access to older versions of this software.
- Music Archives: Public domain audio, unique recordings, and historical music collections could go extinct in the blink of an eye.
- Documentaries and Oral Histories: Personal narratives preserved as snapshotted blogs or social media pages would cease to exist, and the voices of the people behind them would be lost to time.
- Interactive Experiences: The 90s and early 00s were filled with interactive experiences tied to web pages. With the new standards of web pages today, we will never see the kind of things we saw back then. If we lose those archived pages, we may never get to re-live 90s computing again.
- Archival Projects: From out-of-print video games to other archival media, we may lose a huge chunk of our history, especially from the late 20th and early 21st centuries.
This isn’t an exhaustive list by any stretch of the imagination, but it touches on most of the things that IA preserves for us. Many of us take history (especially early internet history) for granted. Yet, as the song “Big Yellow Taxi” puts it, you don’t know what you’ve got till it’s gone. And if the IA goes down, much of it could be gone forever.
Learning From the Internet Archive Outage
The recent hack that took down the IA should highlight the importance of preserving our digital history. The Smithsonian discusses all the challenges of archiving data and presents a few solutions, including data security, backups, and audits. Yet, it’s a difficult road for an organization like the Internet Archive to walk. Much of the preservation of our digital past depends on us, the users.
Tomorrow, we could wake up and realize that two decades of our digital history have been erased overnight. I don’t think I ever want to face that reality, nor should you. While the IA is around, we should do what we can to keep it alive. Internet history is just as essential to preserve as archaeological finds, and we should treat those archives with the same respect.