
   Web Based Continuous Query Bot

Bidding Time:
30/06/2008 13:14 - 14/07/2008 13:14
Budget:
Maximum $500
Status:
Closed


Job Type:
PHP, XML, MySQL
Description:



This is to be a PHP-driven script.
It would be set to run regularly via a cron job (every minute, hour, or day).
It would collect the desired information and compile it into a MySQL database,
which would be sortable by the various fields.

We need to be able to regularly collect information about corporations (LLCs
actually) that are filed with the state of New York. Here is the link to the
database entry query:

http://appsext8.dos.state.ny.us/corp_public/corpsearch.entity_search_entry

Now here is the issue. You can only access one record at a time - unless I am
mistaken there is no way to get a tabular record of all entries, or even
subgroups, and we are talking hundreds of thousands of entries. The important
field that we are interested in is the date of formation, and we will be
interested in two groupings. One would be all entities formed after a certain
date (as of a few years ago). We will also want to get a list of companies newly
formed since the last time (say in the past week, if we set it to cycle through
weekly). So ultimately we would need a script/bot/spider that would essentially
be continuously going through the site. Is this doable? Can you describe in
general terms how you would approach it and what you feel is possible? What
additional information can I provide?
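
As a rough illustration of the two groupings described above, here is a minimal sketch of the storage side. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table name, the column names, and the `dos_id` key are all hypothetical; nothing here reflects the actual layout of the state's records.

```python
import sqlite3
from datetime import date

# Hypothetical schema. sqlite3 stands in for MySQL in this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS entities (
    dos_id        TEXT PRIMARY KEY,  -- assumed unique identifier per entity
    name          TEXT NOT NULL,
    formed_on     TEXT NOT NULL,     -- ISO date, so text order equals date order
    first_seen_on TEXT NOT NULL      -- date the bot first captured the record
)
"""

def save_entity(conn, dos_id, name, formed_on, seen_on):
    """Insert a new record, or refresh the details of one captured earlier."""
    conn.execute(
        "INSERT OR IGNORE INTO entities (dos_id, name, formed_on, first_seen_on) "
        "VALUES (?, ?, ?, ?)",
        (dos_id, name, formed_on.isoformat(), seen_on.isoformat()),
    )
    conn.execute(
        "UPDATE entities SET name = ?, formed_on = ? WHERE dos_id = ?",
        (name, formed_on.isoformat(), dos_id),
    )

def formed_after(conn, cutoff):
    """Return (dos_id, name, formed_on) rows formed strictly after cutoff, oldest first."""
    rows = conn.execute(
        "SELECT dos_id, name, formed_on FROM entities "
        "WHERE formed_on > ? ORDER BY formed_on",
        (cutoff.isoformat(),),
    )
    return rows.fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    save_entity(conn, "X1", "EXAMPLE ONE LLC", date(2001, 5, 4), date(2008, 7, 1))
    save_entity(conn, "X2", "EXAMPLE TWO LLC", date(2008, 6, 27), date(2008, 7, 1))
    # Grouping 1: everything formed after a fixed date.
    print(formed_after(conn, date(1998, 12, 31)))
    # Grouping 2: everything formed since the previous weekly run.
    print(formed_after(conn, date(2008, 6, 24)))
```

Both groupings reduce to the same query with a different cutoff: a fixed date for the first, and the date of the previous run for the second.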
We would also want to be able to generate XML files of any record at will.
I am assuming this would be done with a cron-driven web query, but I am open to
whatever is the most efficient way to accomplish this. Please let me know
whatever information I can provide.
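
For the XML requirement, one possible shape is sketched below using Python's standard xml.etree.ElementTree module. The `entity` root element and the field names in the example are assumptions made for illustration; the real field list would depend on what the state's record pages expose.

```python
import xml.etree.ElementTree as ET

def record_to_xml(record):
    """Serialize one captured record (a dict of field name to value) as an XML string.

    Field names must be valid XML element names (no spaces, not starting with a digit).
    """
    root = ET.Element("entity")  # hypothetical root element name
    for field, value in record.items():
        child = ET.SubElement(root, field)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    # Hypothetical record; special characters such as & are escaped automatically.
    print(record_to_xml({"name": "SMITH & JONES LLC", "formed_on": "2001-05-04"}))
```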
Additional Info (Added 7/1/2008 at 15:20 EST):

OK - let me add some clarification:
1.) The number of records in the New York Corporations DB will be in the
millions, not hundreds of thousands.
2.) Even at one query per second it would take 1-2 years, perhaps more. I need
to get it down to one month or less. We will need to have multiple programs
running and an algorithm to construct intelligent queries. Please be prepared to
detail how you would approach this.
3.) It does not have to be PHP-driven; that was just an assumption on my part. I
want whatever is the best, most efficient way to extract this information.
4.) Here are the ultimate parameters: Must run on a Linux server. The records I
want are as follows: Entity Type: DOMESTIC LIMITED LIABILITY COMPANY or FOREIGN
LIMITED LIABILITY COMPANY, with an initial DOS Filing date after December 31,
1998, and the LLC must be active. The records must be captured in a MySQL
database, with full PHP admin for searching/sorting/editing and XML file
generation. Must be able to extract the required information in one month or
less, and run continuously thereafter for an indefinite period.
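
To make points 2 and 4 concrete, here is a small sketch of the arithmetic and of the record filter. The function names are hypothetical, and the status string "ACTIVE" is an assumption about how the site labels an active entity. The query total in the example is derived only from the estimate in point 2 (about 1.5 years at one query per second); it is not a measured figure.

```python
import math
from datetime import date

SECONDS_PER_DAY = 86_400

WANTED_TYPES = {
    "DOMESTIC LIMITED LIABILITY COMPANY",
    "FOREIGN LIMITED LIABILITY COMPANY",
}
FILING_CUTOFF = date(1998, 12, 31)

def workers_needed(total_queries, days, queries_per_second_per_worker=1.0):
    """Number of parallel workers required to finish total_queries within days."""
    required_rate = total_queries / (days * SECONDS_PER_DAY)
    return math.ceil(required_rate / queries_per_second_per_worker)

def is_wanted(entity_type, filed_on, status):
    """Apply the point 4 filter: LLC type, filed after 1998-12-31, still active."""
    return (
        entity_type in WANTED_TYPES
        and filed_on > FILING_CUTOFF
        and status == "ACTIVE"  # assumed label for an active entity
    )

if __name__ == "__main__":
    # 1.5 years of one query per second is 547.5 days * 86,400 s = 47,304,000 queries.
    total = int(1.5 * 365 * SECONDS_PER_DAY)
    print(workers_needed(total, days=30))  # prints 19
    print(is_wanted("DOMESTIC LIMITED LIABILITY COMPANY", date(2003, 2, 1), "ACTIVE"))
```

Under those assumptions, the one-month target works out to roughly 19 workers each sustaining one query per second, which is why multiple programs running in parallel are needed.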

Operating System:
N/A
Database System:
N/A
© 2004-2008 ProjectsList.biz