Yesterday, Search Engine Land featured a post about the Internet Archive’s Wayback Machine now providing access to more than 440 billion archived web pages dating back to 1996.
As I’ve written about before on Search Engine Land and my infoDOCKET blog, the Wayback Machine is an absolutely essential resource for just about every web user.
If you’ve been using Wayback since it debuted, you probably remember that the lag time for material (new pages and updates) to become accessible used to be six months or longer.
However, in the past couple of years this lag has shrunk dramatically.
In Fall 2012, the Wayback Machine introduced a new feature that allows any user to archive, on demand, any publicly accessible web page or PDF that can be crawled.
The Wayback Machine: Your Own Web Archiver
Simply copy and paste the URL of a web page or PDF, and the Wayback crawler will archive and index the material and provide you with a direct URL to it in real time.
You’ll find a box to paste the URL into on the Wayback homepage. It’s labeled “Save Page Now.”
Once the crawling and indexing are complete, a URL to the archived copy will either be provided in a pop-up box or, if you are archiving a PDF file, appear in the location bar.
There is no cost to use this feature, and with it you can be assured the page or PDF you saw will be available at a later date. At the same time, you’ve also helped make the Wayback Machine more comprehensive for all users.
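For readers who would rather script this than paste into the box, here is a minimal Python sketch of the URLs involved. It assumes the commonly used address patterns `https://web.archive.org/save/<url>` for requesting a capture and `https://web.archive.org/web/<timestamp>/<url>` for an archived copy. The function names are hypothetical, and the code only builds the addresses. It does not contact the Internet Archive.

```python
SAVE_ENDPOINT = "https://web.archive.org/save/"   # assumed "Save Page Now" address pattern
ARCHIVE_PREFIX = "https://web.archive.org/web/"   # assumed archived-copy address pattern


def save_page_now_url(target_url: str) -> str:
    """Build the address that asks the Wayback Machine to archive target_url."""
    target_url = target_url.strip()
    if not target_url.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) URL")
    return SAVE_ENDPOINT + target_url


def archived_copy_url(target_url: str, timestamp: str) -> str:
    """Build the direct URL of a capture; timestamp is 14 digits (YYYYMMDDhhmmss)."""
    if not (timestamp.isdigit() and len(timestamp) == 14):
        raise ValueError("expected a 14-digit timestamp")
    return f"{ARCHIVE_PREFIX}{timestamp}/{target_url.strip()}"


if __name__ == "__main__":
    # Illustrative values only; example.com stands in for the page you want saved.
    print(save_page_now_url("http://example.com/report.pdf"))
    print(archived_copy_url("http://example.com/report.pdf", "20131025120000"))
```

Opening the first address in a browser should have the same effect as using the “Save Page Now” box, provided the pattern assumed above still holds.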
By the way, the massive crawling of web material that has built a database of more than 440 billion pages continues. This new on-demand feature is in addition to the regular crawl. (Of course, pages and sites that are password protected, blocked by robots.txt, etc. are not crawled.)
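If you want a rough idea of whether a page is blocked by robots.txt before trying to save it, Python’s standard `urllib.robotparser` module can evaluate the rules. The sketch below is illustrative only: the sample robots.txt content, the `is_crawl_allowed` helper, and the `ia_archiver` user-agent name are assumptions, and the Wayback crawler’s actual behavior may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real site's file lives at /robots.txt.
EXAMPLE_ROBOTS = """\
User-agent: ia_archiver
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ia_archiver" is assumed here as the crawler name to test against.
    for page in ("http://example.com/private/report.pdf",
                 "http://example.com/public/page.html"):
        print(page, is_crawl_allowed(EXAMPLE_ROBOTS, "ia_archiver", page))
```

A page that fails this check is likely to be skipped by crawlers that honor robots.txt, though the check cannot detect password protection or other blocks.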
Finally, here’s info about a bookmarklet that makes adding content to the Wayback Machine even easier and faster.
Thanks to Brewster Kahle and the entire Internet Archive team for this incredibly useful feature.