Posts Tagged ‘wayback’

Scrape Archived Website – Wayback

April 29th, 2011

I need someone to go to the Internet Archive (the new beta) at http://web.archive.org/, copy the files for the site www((dot))sevenswordsmovie((dot))com, and put them in a format that I can re-upload as a website.

It doesn’t have to be perfect, but most of the site is there intact. You’ll need to go back a year or so to find a good copy.
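As a rough sketch of what this job involves (not a finished scraper): archived pages live under URLs of the form http://web.archive.org/web/<timestamp>/<original URL>, and the links inside a saved page carry that same prefix, so it has to be stripped before the files can be re-uploaded. The function names below are made up for illustration, www.example.com is a placeholder for the real domain, and the regular expression assumes the prefix always looks like /web/<digits>/ with an optional two-letter modifier such as im_.

```python
import re
from datetime import datetime

WAYBACK_PREFIX = "http://web.archive.org/web"

# Assumed shape of the prefix added to links inside archived pages, e.g.
# "/web/20100429000000/" or "http://web.archive.org/web/20100429000000im_/".
ARCHIVE_LINK_RE = re.compile(
    r"(?:https?://web\.archive\.org)?/web/\d{1,14}(?:[a-z]{2}_)?/"
)


def snapshot_url(original_url, when):
    """Build the archive URL for a capture at (or near) the given date."""
    stamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


def strip_archive_prefix(html):
    """Remove the archive prefix so links point at the original site again."""
    return ARCHIVE_LINK_RE.sub("", html)


if __name__ == "__main__":
    # "A year or so" before the posting date; the domain is a placeholder.
    print(snapshot_url("http://www.example.com/", datetime(2010, 4, 29)))
    saved = '<a href="/web/20100429000000/http://www.example.com/about.html">About</a>'
    print(strip_archive_prefix(saved))
```

A site that happens to have its own /web/<digits>/ paths would be mangled by this pattern, so the output is worth spot-checking before re-uploading.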

Thanks!

Wayback Machine Clone

February 7th, 2010

This is just a copy of an older project. I don’t know whether it is possible as described.

This project includes both a spider and a simple website for viewing the archived websites. It would be similar to archive.org’s Wayback Machine, and the spider should be able to do exactly what the Wayback Machine’s spider does.

As an example:
http://web.archive.org/web/*/http://www.scriptlance.com

It is easy to see that the archive creates a full copy of each site, including rewriting links and redirects so that they point to the archived copy. Take a good look at their archived set to understand how this works (if you do not already know).
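To make the link rewriting concrete, here is a minimal sketch of how a stored page could be rewritten so that its links reference the archive set. It is not how archive.org actually does it: the function name, the /web/<timestamp>/ path layout, and the regex-based approach are all assumptions for illustration, and a real spider would more likely use a proper HTML parser and also rewrite redirect targets and CSS references.

```python
import re
from urllib.parse import urljoin

# Matches href="..." and src="..." attributes (single or double quoted).
ATTR_RE = re.compile(
    r'(?P<attr>\b(?:href|src)\s*=\s*)(?P<q>["\'])(?P<url>.*?)(?P=q)',
    re.IGNORECASE,
)

# Links that should be left alone because they do not fetch another page.
SKIP_PREFIXES = ("#", "mailto:", "javascript:", "data:")


def rewrite_links(html, page_url, timestamp, archive_prefix="/web"):
    """Point every href/src at the archived copy for the given timestamp.

    archive_prefix is a hypothetical path on the clone's own viewer site.
    """
    def replace(match):
        url = match.group("url")
        if url.startswith(SKIP_PREFIXES):
            return match.group(0)
        # Resolve relative links against the page they were found on.
        absolute = urljoin(page_url, url)
        quote = match.group("q")
        return f'{match.group("attr")}{quote}{archive_prefix}/{timestamp}/{absolute}{quote}'

    return ATTR_RE.sub(replace, html)


if __name__ == "__main__":
    page = '<a href="about.html">About</a> <img src="/img/logo.png">'
    print(rewrite_links(page, "http://www.scriptlance.com/index.html", "20090904000000"))
```

Resolving every link to an absolute URL first means the viewer only needs one lookup key per resource: the timestamp plus the original address.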

I would also like to grab a screenshot of the homepage as well as the site’s Google PageRank (PR).

The interface should be relatively simple, like archive.org’s: you select a date and it pulls up the archived version of the website.
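The date selection could work roughly like this: given the list of dates on which the spider captured a site, pick the capture closest to the date the visitor asked for. This is only a sketch under the assumption that capture times are kept as a sorted list; the function name is hypothetical and the example dates are invented.

```python
from bisect import bisect_left
from datetime import datetime


def nearest_snapshot(snapshots, requested):
    """Return the capture time closest to the requested date.

    snapshots must be a list of datetimes sorted in ascending order.
    Returns None when nothing has been captured yet.
    """
    if not snapshots:
        return None
    # Position where the requested date would be inserted.
    i = bisect_left(snapshots, requested)
    # Only the neighbours on either side can be the closest capture.
    candidates = snapshots[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda s: abs(s - requested))


if __name__ == "__main__":
    # Invented capture dates, for illustration only.
    captures = [datetime(2009, 1, 1), datetime(2009, 9, 4), datetime(2010, 2, 7)]
    print(nearest_snapshot(captures, datetime(2009, 8, 1)))
```

Keeping the list sorted lets the lookup use a binary search, which stays fast even when a site has thousands of captures.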

You should have experience in creating web spiders.

Wayback Machine Clone

September 4th, 2009

This project includes both a spider and a simple website for viewing the archived websites. It would be similar to archive.org’s Wayback Machine, and the spider should be able to do exactly what the Wayback Machine’s spider does.

As an example:
http://web.archive.org/web/*/http://www.scriptlance.com

It is easy to see that the archive creates a full copy of each site, including rewriting links and redirects so that they point to the archived copy. Take a good look at their archived set to understand how this works (if you do not already know).

I would also like to grab a screenshot of the homepage as well as the site’s Google PageRank (PR).

The interface should be relatively simple, like archive.org’s: you select a date and it pulls up the archived version of the website.

You MUST have experience in creating web spiders.
