Archive for the ‘Crawler’ Category

Setting Up Mongodb On Ec2 To Auto-scale

February 19th, 2012 Comments off

I currently run my website on a single server with a single MongoDB instance. I am moving to Amazon EC2 and building a horizontal scale-out model where more web servers are started dynamically. This poses a challenge for how I use MongoDB. I have a crawler/indexer that reads from and writes to the DB, and a web server that also reads from and writes to the DB. I want to change my architecture so that I have 3 server types (crawler, DB, web machines). This task is to build the MongoDB on Amazon EC2 so that it scales horizo…

Categories: Amazon, Crawler

Need A Script To Scrape Urls From A Website

February 17th, 2012 Comments off

I want to spend about $60 on this. I don’t think it will take more than 1.5 hours if you work straight through.

My Goal:

For this page, http://expertpages.com/all_top.htm, I want to quantify how many individual listings are under each category “world wide”. I am not concerned with duplicates. I just want a list of URLs, so I can run a count function in Excel and get an idea of where the most listings are in this directory.

What I need this script to do step by step:

1. Crawl each cate…
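
A minimal sketch of the link-collection step, assuming the listings appear as ordinary `<a href>` links on each category page. The sample markup and the `LinkCollector` / `extract_links` names are illustrative only; the real page layout is not described in the post.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical sample markup, not taken from the real site.
sample = '<a href="/cat/a.htm">A</a> <a href="http://example.com/b.htm">B</a>'
links = extract_links(sample, "http://expertpages.com/all_top.htm")
```

Writing one URL per line to a text file would then be enough to paste into Excel for counting.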

Help Setup Comparison Script

February 17th, 2012 Comments off

We have the kaon price comparison script and need a skilled, experienced programmer who has already worked with the script to help us set up the categories and products.

The website category structure must be set up according to our example.

We need a cron job to automate all settings, meaning a daily update of all merchant feeds and products, or to set up a decent price crawler for this.

We also have the filter options, so these need to be set up too; this can be automated based on keywords in des…

Web Crawler

February 14th, 2012 Comments off

I need a web crawler that will eventually crawl all web sites in the world and index the home page. There should only be one page per url and then it should go to the next URL. The data must be stored in a SQL Server 2008 database so that it can be searched.

I am not sure where to get the starting point for the web crawler, but I would imagine that there would be many places to get urls that can act as the seeds. If you are bidding on this project, please know where you will get the seed w…
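
One way to read the “one page per URL” rule is sketched below: every seed URL is reduced to the home page of its site, and each home page is queued only once. This is an interpretation, not a stated requirement, and the `home_page` / `unique_home_pages` helper names are made up for illustration. Fetching and SQL Server storage are left out.

```python
from urllib.parse import urlsplit


def home_page(url):
    """Reduce any URL to the home page of its site."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc.lower()}/"


def unique_home_pages(urls):
    """Return each home page once, in first-seen order."""
    seen = set()
    ordered = []
    for url in urls:
        home = home_page(url)
        if home not in seen:
            seen.add(home)
            ordered.append(home)
    return ordered


# Placeholder seeds; the post does not say where real seeds come from.
seeds = ["http://Example.com/a", "http://example.com/b?x=1", "http://example.org"]
queue = unique_home_pages(seeds)
```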

Auto Fetch Website

February 14th, 2012 Comments off

I am looking for somebody who can develop a fully operational website (vertical crawler) which crawls selected auto classifieds websites.

So when somebody comes to my site, they should choose the vehicle, model, year, mileage, fuel, etc. to search for, and the site should then return the related results from the defined sites.
Basic information should be displayed and ordered by criteria like price, year,…

If a user clicks on the information, they will be taken to the original website and offer.

Data Scrape Of A Website 2

February 9th, 2012 Comments off

Given this directory of conferences:

http://tinyurl.com/gbrbr

The script must crawl the directory and extract the fields:

Name of conference, subject, location, dates, website and description

The script must be a command-line tool with no user interface. Use curl. Output to a CSV file.
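
A rough sketch of the output step only, assuming one CSV column per requested field. The column names and the sample row are placeholders, and the fetching step (curl, as the post asks) is not shown.

```python
import csv
import io

# Column names are an assumption derived from the field list in the post.
FIELDS = ["name", "subject", "location", "dates", "website", "description"]


def write_conferences(rows, out):
    """Write conference records to a file-like object as CSV with a header."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder record, not real data from the directory.
rows = [
    {
        "name": "Example Conference",
        "subject": "Testing",
        "location": "Example City",
        "dates": "2012-03-01 to 2012-03-03",
        "website": "http://example.com",
        "description": "A placeholder entry, with a comma",
    }
]

buffer = io.StringIO()
write_conferences(rows, buffer)
```

For a real run, `open("conferences.csv", "w", newline="")` could replace the in-memory buffer.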

Data Entry / Web Research Job

February 8th, 2012 Comments off

Do you love to surf the Internet? Then this job may be for you!

I am looking for someone who is:

-An internet pro who knows how to research and crawl such internet resources

-Able to work smart and to identify and seek out relevant sources for such information

-Curious and enjoys data mining and information extraction from the web

-This position is results-driven and pays well. Great compensation for the successful applicant/s

-This is a part-time job, so payment is contracted pe…

Error 404 Problem In Google Webmaster

February 7th, 2012 Comments off

I have a problem with a website where Google Webmaster Tools is showing pages as not found – 404 (Not found) – when the pages actually exist.

The site in question is http://www.merlinpestcontrol.co.uk

Attached are the reports from Webmaster Tools (Crawl Errors, Crawl error Sources).

I need to know:
1. Why I am getting this problem
2. What I need to do to sort it out (now and in the future)
3. What robots text I should add to pages to ensure the site is as accessible as possible to Google.
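
If “robots text” here means a robots.txt file (an assumption; it could also mean robots meta tags), a candidate file can be checked offline with Python’s standard `urllib.robotparser` before it is deployed. The rules and page paths below are hypothetical, and this check does not by itself explain the 404 reports.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, not the site's actual file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical page paths on the site named in the post.
allowed = parser.can_fetch(
    "Googlebot", "http://www.merlinpestcontrol.co.uk/services.html"
)
blocked = parser.can_fetch(
    "Googlebot", "http://www.merlinpestcontrol.co.uk/admin/login"
)
```

With these rules, `allowed` is `True` and `blocked` is `False`, so only the `/admin/` area is hidden from crawlers.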

Need To Copy My 3 Pages Html Website To My 83 Domains

February 1st, 2012 Comments off

Hi, I need a quick job done, as below:

I have a 3-page HTML website

http://graphic-design-perth.com.au

WHAT YOU HAVE TO DO:

1. Copy this website to my other 84 domains (domain info will be provided)

2. Each domain’s title/keywords/description will be different (info will be provided)

3. The first paragraph on the home page will be different on each website (content will be provided)

4. As you can see, I have 4 services: Design, Print, Website, Marketing.
What you have to do is shuffle …

Website Parsing 2

January 29th, 2012 Comments off

We need a reliable developer to parse a large source of data in a few days.
The source is a website similar in structure and nature to tripadvisor.com

We need to crawl all pages like this one:
http://www.tripadvisor.com/Hotel_Review-g32655-d595189-Reviews-Garden_Cottage_B_B-Los_Angeles_California.html

And extract all relevant information (will be provided in the SPEC). The results should be organized in 1 or multiple Excel (xls) files.

Preferably the parser should be done as a separat…

Advanced Property Portal And Information Service

January 28th, 2012 Comments off

Need a full service property information website and portal.

As an information service portal, the site will:
- have dynamic content with users able to create & update content, blogs, discussion forums, groups, etc in real-time
- be able to crawl other sites for property news, data, etc; as well as generate its own data, tools and information resources all arranged in a pre-defined logic
- email and phone alert service to members (e.g. property price change alerts, etc)
- store database o…

Php Crawler

January 26th, 2012 Comments off

I would like a PHP crawling script made.

I need URLs taken and placed into a MySQL database.

Intelligent Crawler / Scraper

January 26th, 2012 Comments off

Below are very specific instructions. Please do read and only bid if you agree.

1. Use any programming language you want.
2. I want the source code.
3. I want to have the algorithm installed and ready to go on my server. Any hosting service is fine as long as I own the domain and hosting. (I prefer bluehost.com or arvixe.com)

Here’s how the crawler should work:
4. The user uploads a CSV file with, for example, 10,000 rows of keywords, one keyword phrase per line
5. The user specifies …

Scrap (crawler) For A Website

January 25th, 2012 Comments off

I need a browser or desktop scraping script for the website http://www.qoc.de/plaintext

I will require a demo for escrow. The smallest bid with a demo wins the project.

Feel free to contact me if you need any further information.

Regards
Timon

Top Videos 2

January 24th, 2012 Comments off

I need a script or set of scripts to get the most viewed youtube/vimeo videos for a certain set of keywords. The script should tell me the number of views of each video and get the description for each video from the website.

Top Videos

January 21st, 2012 Comments off

I need a script or set of scripts to get the most viewed youtube/vimeo videos for a certain set of keywords. The script should tell me the number of views of each video and get the description for each video from the website.

Seo Reseller

January 16th, 2012 Comments off

Hello,
I’m looking for a company to work with in the long term. I posted in June 2011 to find some prices for a company to help me get my clients to the first pages of Google, Yahoo, and Bing. I would like to use my business name as the SEO provider as well, so I would need your company to remain unknown to them. Please let me know your prices and, in detail, what you can do for me. Don’t just place…I can do it!
Thanks for your time and good luck.

ON PAGE OPTIMIZATION
Keyword URL Mapping
Tar…

Crawler Needed Quickly

January 12th, 2012 Comments off

Hello,

I need a SOFTWARE OR SCRIPT THAT CAN CRAWL A SITE AND EXTRACT ALL MOBILE NUMBERS AND EMAILS from IMAGES!

Emails and mobile numbers are shown in images.

It should be anonymous.

Regards

Website Parsing

January 11th, 2012 Comments off

We need a reliable developer to parse a large source of data in a few days.
The source is a website similar in structure and nature to tripadvisor.com

We need to crawl all pages like this one:
http://www.tripadvisor.com/Hotel_Review-g32655-d595189-Reviews-Garden_Cottage_B_B-Los_Angeles_California.html

And extract all relevant information (will be provided in the SPEC). The results should be organized in 1 or multiple Excel (xls) files.

Preferably the parser should be done as a separat…

Crawler Needed

January 11th, 2012 Comments off

Hello,

I need a SOFTWARE OR SCRIPT THAT CAN CRAWL A SITE AND EXTRACT ALL MOBILE NUMBERS AND EMAILS from IMAGES!

Emails and mobile numbers are shown in images.

It should be anonymous.

Regards

Scrap (crawler) For A Flash Website

January 11th, 2012 Comments off

I need a browser or desktop tool to scrape information from a Flash website.

The lowest bid with a demo wins the project.

Regards
Timon

Scrape & Install Products To Oscommerce Site

January 10th, 2012 Comments off

Need products scraped from http://tinyurl.com/6n8x4ut and installed onto my site, but I only need the subcategories inside the cheap jordans category, such as air jordan 1, 2, 3.

Webpage Crawl Or Search. Not Bot

January 9th, 2012 Comments off

We want a tool that searches a specific website for some specific information. These websites are normally very small, with maybe 10 to 20 pages. It is not automated: a user enters a web address and the tool then needs to show the information (see mock-up). We will then use the info and store it in our application. It needs to run on a web server.

Search for the word kvk or k.v.k. or Kamer van Koophandel and retrieve the number behind it
Search for the word BTW or B.T.W. and retrieve the number b…
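
The two searches could be sketched with regular expressions, assuming the number follows the label with only spaces or punctuation in between. The exact number formats are not given in the post, so the patterns and the sample text are guesses for illustration.

```python
import re

# Label variants named in the post, then separators, then a run of digits.
KVK_PATTERN = re.compile(
    r"\b(?:k\.?v\.?k\.?|kamer\s+van\s+koophandel)\W*(\d+)",
    re.IGNORECASE,
)
# Assumption: the BTW number is an unbroken run of letters and digits.
BTW_PATTERN = re.compile(r"\bb\.?t\.?w\.?\W*([A-Z0-9]+)", re.IGNORECASE)


def find_numbers(text):
    """Return the first KvK and BTW numbers found in text, or None."""
    kvk = KVK_PATTERN.search(text)
    btw = BTW_PATTERN.search(text)
    return {
        "kvk": kvk.group(1) if kvk else None,
        "btw": btw.group(1) if btw else None,
    }


# Placeholder text with made-up numbers.
sample = "KvK: 12345678 | BTW: NL123456789B01"
result = find_numbers(sample)
```

A page that writes extra words between the label and the number (for example “BTW nummer”) would need a looser pattern than this one.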

Vusker.com Clone

January 7th, 2012 Comments off

Hi there

I am looking for someone who can clone Vusker.com.
It can be practically the same.

It must have a backend admin
to see galleries, reports, etc.

and an illegal gallery report button on the main site.

It must run on a dedicated server
with PHP and MySQL.
The source is based on the old fusker technology,
so that:
Create a new fusker. There are two ways to do this: a) Enter the URL (web address) of a web page that has direct links to images. This works well with TGP pages

Web Crawler / Web Scraping / Indexing

December 26th, 2011 Comments off

I am looking to extract data from these two websites’ search results:
a) http://bit.ly/tNpujL 707 pages
b) http://bit.ly/ucoVoF 90 pages

The data I need are as below for search results from http://bit.ly/tNpujL :
a) Name of the real estate agent
b) Phone number
c) Email
d) Total listings
e) Locations covered

The data I need are as below for search results from http://bit.ly/ucoVoF :
a) Name of the Realtor
b) Estate agent license
c) Phone number
d) Fax number
e) State
f) Postal…

Scrape Sites For Email Addresses 2

December 21st, 2011 Comments off

I have several thousand sites I want to scrape for email addresses.

I will provide an Excel file with the URLs.

I need an Excel sheet back with all associated email addresses for each URL in a separate column. For example:

URL Email
www.sample.com info@sample.com; sales@sample.com
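
A small sketch of how one output row in that format might be built, assuming the page text has already been downloaded. The regular expression is a common approximation rather than a full address validator, and `emails_row` is a made-up helper name.

```python
import re

# Approximate email pattern; it will miss some valid but unusual addresses.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_row(url, page_text):
    """Return [url, "addr1; addr2"] with duplicates removed, order kept."""
    seen = []
    for address in EMAIL_PATTERN.findall(page_text):
        address = address.lower()
        if address not in seen:
            seen.append(address)
    return [url, "; ".join(seen)]


# Placeholder page text using the sample addresses from the post.
text = "Contact info@sample.com or sales@sample.com. Again: INFO@sample.com"
row = emails_row("www.sample.com", text)
```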

Data Scraper For Android Marketplace

December 21st, 2011 Comments off

We are looking for an experienced programmer that can scrape every app on the Android Marketplace for specific data on the app info page. The end result will be a mysql database or csv file for each app that will show specific detail about the app. Along with that, we want the script to continuously check for new apps (on a daily basis, using cronjobs) and save new apps into the database.

We will give full details in a private message after reviewing your previous work history

Here is wha…

Oscommerce Website Creation

December 18th, 2011 Comments off

Need someone to use the following template and install it on osCommerce. The clone needs to be installed identically, without products.

http://osc3.template-help.com/osc_21466/index.php

I will have the person install 3 osCommerce modules, edit some header & title tags, and edit the checkout process.

Need someone to clone and install all shoes at…. http://tinyurl.com/6wwpbuw (need all the product colors removed from the description and all products appended with a 5-digit item number)

I PREFER…

WordPress Optimization

December 17th, 2011 Comments off

Integrate the WordPress site with Facebook, but Facebook should not load at the same time as the site; its loading should be deferred.

Integrate Skype so that when people click on our Skype name, a Skype call or message is initiated.

The WP index.php page should look like the current HTML destinations page with text boxes and pictures. 4 columns with 6 text boxes per row makes 24 text boxes. Countries should be displayed in alphabetical order.

Each Country Category (Country Packages) should collapse with the latest 5 …

Copyscape.com Search 2

December 15th, 2011 Comments off

This is a simple job for a programmer with the right skills. Our site is written in PHP; although the site displays as ASPX, it is definitely in PHP, and the addresses have been changed using mod_rewrite.

I want a script written that will check every page of our site using copyscape.com

If a page on our site appears to be duplicate content the page will be flagged and able to be viewed on a separate page of our admin system. The script will need to search copyscape in such a way that our …

Coupon Scraper Data Mining

December 10th, 2011 Comments off

This project is for a Coupon scraper of 3 of the largest coupon websites, including retail-me-not.

Scrape Stores

1. Description
2. Store logo/Screenshot
3. Store URL
4. Average Savings amount (If applicable)

Scrape Coupons

1. Coupon Title + actual Coupon Code (If applicable)
2. Store Name
3. Destination/Deep Link URL (Stripped of affiliate ID)
4. Expiration Date, Posted date (if applicable)
5. Tags
6. Categories
7. Coupon Type (printable, web, etc)
8. Average savings amou…
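
Stripping the affiliate ID from a deep link might look like the sketch below, assuming the ID travels in a query-string parameter. The parameter names in `AFFILIATE_PARAMS` are hypothetical; real coupon sites may use different names or embed the ID elsewhere in the URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; adjust to what the scraped links actually use.
AFFILIATE_PARAMS = {"affid", "aff_id", "ref"}


def strip_affiliate_id(url):
    """Return url with any known affiliate query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in AFFILIATE_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Placeholder link, not taken from any real coupon site.
clean = strip_affiliate_id("http://store.example.com/deal?id=42&affid=abc123")
```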

Indeed Total Jobs

December 9th, 2011 Comments off

We need a small crawler, or code using the Indeed API, to get the total number of jobs for a company. All you have to do is provide us with code that will get the number of jobs from the Indeed website.

We don’t care whether you use PHP or another scripting language.

Scrape Sites For Email Addresses

December 8th, 2011 Comments off

I have several thousand sites I want to scrape for email addresses.

I will provide an Excel file with the URLs.

I need an Excel sheet back with all associated email addresses for each URL in a separate column. For example:

URL Email
www.sample.com info@sample.com; sales@sample.com

Cms Based Website Needs Full Seo + Optimization

December 7th, 2011 Comments off

Thank you for taking the time to look at my job request.

My Website: http://www.reflexleague.com

This website uses the “Webspell Content Management System”, which is “php” based, mixed with HTML, CSS, etc.

I am looking to have my website fully set up with SEO and marketing. As part of this process, the website should be made fully valid according to W3C and other markup tools.

Including but not limited to the following:

ON-PAGE:

*Keyword analysis.
*Google Site-map Generation
*canonical issue sol…

Itunes Scraper

December 1st, 2011 Comments off

Requirements:

We need a robot to crawl the iTunes Podcast directory, and then
provide us with the name, website, and email address of every podcast
in the directory. (Where they are available.)

Specification:

1. The main podcast directory is here:
http://itunes.apple.com/us/genre/podcasts-arts/id1301

On the left hand side of this page, you can see all the podcast
Categories. On the right-hand side, there is a list of podcast names
for each letter in the alphabet.

Th…

Need Php Expert In Seo Optimization

December 1st, 2011 Comments off

I need a PHP expert in SEO optimization on my site.

- original content
- 1 year old
- it has a PageRank of 4, based on http://www.prchecker.info
- according to Google Analytics, it keeps getting 404 crawl errors. This type of error occurs constantly.

I truly believe it’s all about the coding of the script.

Please message if you are interested. Thanks

Jobs Aggregator Site

November 29th, 2011 Comments off

Develop an effective jobs aggregator site.

Not looking for RSS-based scripts. Search engines should be able to crawl the job info. The site should be search engine friendly and fully optimized, with at least 100 backlinks from established one-way PR 3 or above static pages/links (must use and guarantee white hat SEO methods only; no backlinks from farms or other shady sites, reputable directories only).
Allow monetization through affiliate programs, contextual ads, banners, etc.

Clean Up And Optimize WordPress Site

November 28th, 2011 Comments off

I would like to rid my site of all errors, including but not limited to:

1. Crawl errors – 404, 302 etc.
2. W3C validation errors
3. Google Webmasters errors
4. Clean up wordpress database (especially taking out deleted plugins)
5. other errors (if any)

At the end of the project there should be no errors whatsoever on the site – this will be confirmed by Google webmasters tool and W3C site.

Programmer is expected to have adequate experience and the only way to prove is reviews relat…

Classified Bot Scraper/crawler

November 15th, 2011 Comments off

1. I need a bot to scrape/crawl ads from 3 sites in approx. 6 or 7 categories: www.cityvibe.com/, http://backpage.com/FemaleEscorts/, and 1 or 2 other sites to be determined.
2. Then post the ads on my classifieds script (via ADMIN).
3. The bot should not post ads that do not have phone numbers or email addresses. The bot should post ads in the appropriate location & category.
The bot should post the most recent ads. The bot should scrape and post pictures along with the ads.
The bot should not repe…

Copyscape.com Search

November 14th, 2011 Comments off

This is a simple job for a programmer with the right skills. Our site is written in PHP; although the site displays as ASPX, it is definitely in PHP, and the addresses have been changed using mod_rewrite.

I want a script written that will check every page of our site using copyscape.com

If a page on our site appears to be duplicate content the page will be flagged and able to be viewed on a separate page of our admin system. The script will need to search copyscape in such a way that our …
