The Ever-Expanding Job of Preserving the Internet's Backpages
A quarter of a century after it began collecting web pages, the Internet Archive is adapting to new challenges. From a report: Within the walls of a beautiful former church in San Francisco's Richmond district, racks of computer servers hum and blink with activity. They contain the internet. Well, a very large amount of it. The Internet Archive, a non-profit, has been collecting web pages since 1996 for its famed and beloved Wayback Machine. In 1997, the collection amounted to 2 terabytes of data. Colossal back then, it would now fit on a $50 thumb drive. Today, the archive's founder Brewster Kahle tells me, the project is on the brink of surpassing 100 petabytes -- approximately 50,000 times larger than in 1997. It contains more than 700 billion web pages. The work isn't getting any easier. Websites today are highly dynamic, changing with every refresh. Walled gardens like Facebook are a source of great frustration to Kahle, who worries that much of the political activity that has taken place on the platform could be lost to history if not properly captured. In the name of privacy and security, Facebook (and others) make scraping difficult.
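As a quick sanity check on the figures quoted above, here is a minimal back-of-the-envelope sketch (assuming decimal units, i.e. 1 petabyte = 1,000 terabytes; the variable names are illustrative, not from the report) confirming the roughly 50,000-fold growth:

```python
# Back-of-the-envelope check of the growth figures quoted in the report.
# Assumes decimal units: 1 PB = 1,000 TB.
ARCHIVE_1997_TB = 2              # collection size in 1997, per the report
ARCHIVE_TODAY_PB = 100           # size the archive is said to be approaching today

archive_today_tb = ARCHIVE_TODAY_PB * 1_000
growth_factor = archive_today_tb / ARCHIVE_1997_TB

print(f"Growth since 1997: roughly {growth_factor:,.0f}x")  # ~50,000x
```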
Read more of this story at Slashdot.
from Slashdot https://ift.tt/inwg4mZ