Spider
I need an extra feature for the sphider.eu script. I want to be able to set it to crawl and index only URLs from a specific TLD. For example, one day I want it to index only .com domains, and the next day only .net domains, and so on.
I also want it to be able to crawl only URLs that contain a specific word, for example only URLs containing the word "net". It would then index only domains such as internet.com, netvibes.com, etc.
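To make the two filters above concrete, here is a rough sketch in Python. It is only an illustration and not Sphider code (Sphider itself is written in PHP). The function names and parameters are made up for this example. It assumes that URLs include a scheme such as http://, and that the word is matched against the host name without its last label, so that "net" does not match every .net domain.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the lower-cased host name of a URL, or "" if there is none."""
    return (urlparse(url).hostname or "").lower()


def url_allowed(url, tld=None, keyword=None):
    """Hypothetical filter: True if the URL passes the TLD and keyword rules.

    tld     -- e.g. "com" or ".com"; None disables the TLD check
    keyword -- e.g. "net"; None disables the keyword check
    """
    host = host_of(url)
    if not host:
        return False

    if tld is not None:
        wanted = "." + tld.lower().lstrip(".")
        if not host.endswith(wanted):
            return False

    if keyword is not None:
        # Assumption: ignore the last label (the TLD) when looking for the word.
        name_part = host.rsplit(".", 1)[0]
        if keyword.lower() not in name_part:
            return False

    return True
```

For example, url_allowed("http://internet.com/", tld="com", keyword="net") passes both checks, while "http://example.net/" fails the .com check.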
It should be possible to set a pause between crawled URLs, for example crawl 1 URL, then wait 3 seconds. It should not use too much of my server's resources, nor those of the server it is crawling.
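The pause between URLs could look roughly like the sketch below. Again this is a Python illustration with made-up names, not Sphider code. The fetch argument stands for whatever function actually downloads a page, and the 3 second default is just the example value from above.

```python
import time


def crawl_politely(urls, fetch, delay_seconds=3.0, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between consecutive requests.

    fetch -- hypothetical callable that downloads one URL
    sleep -- injectable so the pause can be replaced in tests
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results.append((url, fetch(url)))
    return results
```

Crawling one URL at a time with a fixed pause keeps the load low on both servers, at the cost of a slower crawl.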
It has to index the site title, plus the description and keywords from the meta tags.
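As an illustration of the data to be indexed, the sketch below pulls the title, description and keywords out of a page using Python's standard html.parser module. The class and function names are hypothetical, and it assumes the description and keywords are in meta tags with name="description" and name="keywords".

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects the <title> text and the description/keywords meta tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.meta[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_page_info(html):
    """Return a dict with the title, description and keywords of a page."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "description": parser.meta.get("description", ""),
        "keywords": parser.meta.get("keywords", ""),
    }
```

Pages that lack one of these tags simply get an empty string for that field.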



