I would like a program written to scrape Google and Google Places.
The program should accept a search phrase and a maximum number of pages to search (defaulted to 10).
The program should then scrape the Google Places search results and extract the following data:
Title
URL
Description
Address
Phone
Ranking Position
# reviews
Star rating (if available)
Google Places Page URL
The program should then scrape the Google search results and extract the following data:
Title
URL
Description
Ranking Position
# reviews
PPC Ad running (Y/N)
The data should be presented in a grid that can be sorted/saved to a csv file.
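To illustrate the sort-and-save requirement, here is a minimal sketch in Python (the language is only an assumption; bidders are free to propose another). The `ResultRow` class, its field names, and the `rows_to_csv` function are hypothetical names used for illustration, modelled on the Google search result fields listed above.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class ResultRow:
    """One row of the results grid (hypothetical field names)."""
    title: str
    url: str
    description: str
    ranking_position: int
    review_count: Optional[int] = None
    ppc_ad_running: bool = False


def rows_to_csv(rows, sort_by="ranking_position"):
    """Sort rows by one column and return them as CSV text."""
    def sort_key(row):
        value = getattr(row, sort_by)
        # Rows with a missing value sort after rows that have one.
        return (value is None, value)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ResultRow)])
    writer.writeheader()
    for row in sorted(rows, key=sort_key):
        record = asdict(row)
        # The brief asks for a Y/N flag rather than True/False.
        record["ppc_ad_running"] = "Y" if row.ppc_ad_running else "N"
        writer.writerow(record)
    return buffer.getvalue()
```

The same function could back a "Save as CSV" button in the grid by writing the returned text to a file the user picks.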
The user should be able to select one or more rows from the grid and request additional details. If additional details are requested then:
The program should then scrape the Google Places URL of the selected item and extract the following data:
Has the listing been claimed?
Categories
# Photos
# Videos
# reviews
Star rating (if available)
The program should then scrape the website URL of the selected item and extract the following data:
Title Tag
Meta keywords
Meta description
H1 Tag
Email addresses
Is there an XML file
Is there a KML file
Facebook link
Twitter link
These results should be updated in the grid.
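As a rough illustration of the website details step, the sketch below pulls the title tag, meta keywords, meta description, first H1, email addresses, and Facebook/Twitter links out of a page's HTML using only the Python standard library. The names `SiteDetailsParser` and `extract_site_details` are hypothetical, the email pattern is deliberately simple, and the XML/KML file checks are left out because they need extra requests to the site.

```python
import re
from html.parser import HTMLParser

# Simple pattern for illustration; it will miss some valid addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class SiteDetailsParser(HTMLParser):
    """Collects a few on-page details while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.details = {
            "title": "", "meta_keywords": "", "meta_description": "",
            "h1": "", "facebook": "", "twitter": "",
        }
        self._capture = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1") and not self.details[tag]:
            self._capture = tag  # only the first title/h1 is kept
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.details["meta_" + name] = attrs.get("content") or ""
        elif tag == "a":
            href = attrs.get("href") or ""
            for site in ("facebook", "twitter"):
                if site + ".com" in href.lower() and not self.details[site]:
                    self.details[site] = href

    def handle_data(self, data):
        if self._capture:
            self.details[self._capture] += data

    def handle_endtag(self, tag):
        if tag == self._capture:
            self.details[tag] = self.details[tag].strip()
            self._capture = None


def extract_site_details(html):
    """Return a dict of details found in one page of HTML."""
    parser = SiteDetailsParser()
    parser.feed(html)
    parser.close()
    details = dict(parser.details)
    details["emails"] = sorted(set(EMAIL_RE.findall(html)))
    return details
```

The returned dict maps directly onto extra grid columns, so the selected rows can be updated in place once the details come back.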
The program should have a configurable wait delay between requests to Google. This is needed to prevent Google from banning the user's IP.
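One possible shape for the configurable delay, sketched in Python under the assumption that every request goes through a single shared object. The `RequestThrottle` class and its 5-second default are hypothetical; the actual delay value would come from the program's settings.

```python
import time


class RequestThrottle:
    """Enforces a minimum gap between consecutive requests."""

    def __init__(self, delay_seconds=5.0, clock=time.monotonic, sleep=time.sleep):
        self.delay_seconds = delay_seconds  # user-configurable setting
        self._clock = clock                 # injectable so it can be tested
        self._sleep = sleep
        self._last_request = None

    def wait(self):
        """Block until the delay has elapsed; return the seconds slept."""
        slept = 0.0
        if self._last_request is not None:
            elapsed = self._clock() - self._last_request
            remaining = self.delay_seconds - elapsed
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last_request = self._clock()
        return slept
```

The scraper would call `throttle.wait()` immediately before each request, so time already spent parsing the previous page counts toward the delay.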
This program needs to run on Windows and be wrapped up into an installer package. All source code should be provided upon project completion.
This program also needs to be written in a modular fashion, as I may want to add additional sources (Yellow Pages, etc.) to scrape data from in the future.
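By "modular" I mean something along these lines: each data source sits behind a common interface, so adding a new one does not touch the rest of the program. This is only a sketch in Python; `Source`, `register`, `SOURCES`, `ExampleSource`, and `run_search` are hypothetical names, and the example source returns placeholder data rather than scraping anything.

```python
from abc import ABC, abstractmethod

SOURCES = {}  # maps a source name to the class that implements it


def register(source_cls):
    """Class decorator that makes a source available by name."""
    SOURCES[source_cls.name] = source_cls
    return source_cls


class Source(ABC):
    """Common interface every data source would implement."""
    name = ""

    @abstractmethod
    def search(self, phrase, max_pages=10):
        """Return a list of result dicts for the search phrase."""


@register
class ExampleSource(Source):
    """Placeholder source; a real one would fetch and parse pages."""
    name = "example"

    def search(self, phrase, max_pages=10):
        return [{"title": f"{phrase} result", "ranking_position": 1,
                 "source": self.name}]


def run_search(source_name, phrase, max_pages=10):
    """Look up a registered source by name and run a search with it."""
    return SOURCES[source_name]().search(phrase, max_pages)
```

Under this layout, a future Yellow Pages source would be one new class with the `@register` decorator, and the grid and CSV export would work with it unchanged.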
When replying, please indicate what experience you have developing scrapers like this, how long this will take you to develop, and what language you will code it in.
Thanks
Steve