Crawler

April 10th, 2006
Posted in .NET, C/C++, Windows

I need a piece of software to add to our crawler to identify spam websites, link farm spam pages, doorway pages, and pages that falsely inflate their rankings. It would need to be written in C++ and would have to be compilable in Visual Studio .NET 2003. I would need the source code so we can make adjustments and add it to our crawler.

Only applicants with experience in this type of crawler, please. If you have not done this type of work before, please do not apply. I need this as soon as possible. We also have several other projects like this. Please post samples.
(Budget: $100-300, Keywords: .NET, C/C++, Windows)







Related Projects

Demo Crawler–By Merrows on Oct 7–Max Bid: $500


A demo crawler to crawl 4 pages on a test website. The crawler will crawl the 4 pages and read the .asp page. The first page has 3 links. The linked-to pages have three pieces of data as follows: Na (more...) Keywords: Microsoft Windows, Language Specific, Requirements, Operating Systems / Platforms, C#, Misc (software related), Software Related (Includes Websites)

Web Crawler Fix


I have a web crawler and I need it to grab keywords from the pages themselves, not the meta tags. All keywords must be taken from the page and put into the database. Let me know if you need more details.

Web Crawler


Need a crawler to crawl the web and rank everything it grabs. It would need to grab only the title and description meta tags; for keywords, it needs to query the page and grab keywords from the page itself, NOT the tags. The web crawler needs to grab tons of results, grab keywords from the pages, and rank them.

Google adwords crawler by programmingbids


A crawler for Google AdWords that can populate a database of keyword prices for a given set of keywords. (Budget: $250-750, Jobs: Windows)

E-Mail Crawler


We are looking for an email crawler to collect emails from web pages/URLs. This email crawler must have the following functionality:
1) We must be able to submit one or more web pages that it should collect emails from.
2) The program must use all possible threads on the system, and it must not lock the main application thread, so that we can still add more URLs etc. to collect emails from.
3) The crawler must collect all emails from the specified URL/domain and ONLY emails from the same URL/domain. This means that if it finds two links on the first page in

C# web crawler


I would like a web crawler written in C#.
Requirements:
- based on a specific list of websites, be able to crawl the entire content of the sites
- need some strategy to permit the crawling to scale to a decent level using a single machine; so I would expect the crawler to use multithreading or asynchronous I/O to reach processing speeds of at least 5 pages/second
Some other questions:
- do you have a strategy for handling dynamic pages? I.e. crawling a site in which most of the content is hidden behind a form?
- what is your approach for making sure the crawler doesn't

Web crawler to spider search engine results


I need a Windows XP program to do the following:
1. User types in a search term and file extension, and selects whether to crawl only within the search result domain or to go through every site linked to within the tree
2. Code will then perform a Google search on the search term that is input by the user
3. For each resulting URL, follow all links on every page from that URL down (only within the same domain as the search result or not, as specified by the user) and record which page contains a given file type (and how many files of the file

FlashPaper crawler–By David Brick on Sep 7–Max Bid: Open to fair suggestions


Develop a crawler that will crawl a FlashPaper document at maximum resolution and save all pages on disk. Example of document is http://www.monitor.si/listalnik.php?datum=200702. Crawler should: (more...) Keywords: Flash, Microsoft Windows, Requirements, Operating Systems / Platforms, Misc (software related), Web Services, Software Related (Includes Websites)

Web Crawler ! by programmingbids


What I need done: I need a web crawler that is able to scrape information from websites. The information will vary, so the crawler will need to have the ability to be used on different sites for slightly different uses... (Budget: $30-250)

Crawler


I need a fast search engine crawler to crawl the web and find websites that fit a simple algorithm (I will assist the programmer with details) and store the --> web addresses <-- NOT CONTENT that fit the simple algorithm in a database or text file. The web crawler should have many threads and should be able to crawl many websites every day. (When submitting your bid, please specify the number of pages the crawler will do per day.) I should also be able to specify if I want a specific page to be crawled. I need this ASAP so be

Ad Sense Crawler


We need a multithreaded crawler, in either C# or C++, that we can add to our existing crawler to search for any Google ads in our db or when spidering the web. It would have to be able to find all instances of the ads, including googlesyndication.com, the JavaScript for AdSense, and any other words or scripts used. We would need the source so we can include it in our current crawler. We would also like it to include a search for all Yahoo sponsored ads, MSN ads, and Ask ads if possible. Need it ASAP.

SSL Crawler by stevewduncan


I would like a web crawler that could visit top-level domains and parse information from their SSL certificates. For example, the crawler would go to https://www.SAMPLE.com and determine that there... (Budget: $1500-3000)

Yellow Pages Crawler


Hi, I need someone to make an email crawler for the Swedish yellow pages. This should be done in PHP, Visual C++ or C#, Delphi, Java or Python. http://gulasidorna.eniro.se/ The program/script should work like this:
1. First I enter "Rubrik" (title) and "Område" (area), click Crawl.
2. All search hits should be crawled and the crawled data saved into an XML file.
3. The emails should be in text format. In Eniro they are in image format, so they need to be converted back to text, using some so...

Page crawler


I need a FAST crawler to crawl a site and gather meteorological information and insert it into a database at regular intervals (about every 10 min or so). There are, I believe, about 2000 pages to crawl and about 10 pieces of data to collect from each page (I have the list ready). In addition, one field in the database will be needed for time, and one time zone will need to be collected from another NTS server. This database will be made available to some client software that will be developed after this is done. If good work is done on this

Improvements to crawler by PatrickKahuna


I need to improve an existing crawler so that it will vary query speed by time of day through the use of a programmable table. In addition, the crawler must work on multiple sites. (Budget: $30-250, Jobs: .NET, PHP, Python)

Web Crawler Required by rockerstech


Hello Bidder, I am looking for a web crawler that will fulfill my requirements:
- The crawler must crawl a list of websites, bring data from them, and store it in a database
- I can... (Budget: $30-250, Jobs: Java, PHP)

Crawler


Hi, I need a crawler able to index/reindex pages and download content, making an XML file for each page. Here are the main requirements:
* Can be scheduled
* The agent can accept multiple crawl start locations per web site
* Support for robots.txt
* Forbidden strings in URL (for example do not follow ?, %, or keyword)
* Can leave domain / do not leave domain
* Max pages per domain (user input)
* The agent can support exclusions of files beyond that of the server's standard robots.txt
* Specify how man...

web Crawler/Extractor by akmm


Web Extractor/Crawler: Need a developer with experience with web extractors/crawlers, preferably already having a script. If you already have something similar, that would be great. We need to develop a crawler... (Budget: $30-250, Jobs: ASP, C/C++, Java, Javascript, Website Design)

Google And Yahoo Crawler Clone - Budget Over $2,500+


Hello, we are currently seeking professionals who can build a very extensive clone of the Google bot and the Yahoo Slurp bot. The crawler needs to be VERY extensive, and we will be running it 24 hours per day, 7 days per week on 4 different dedicated high-end servers to create our very own search index. We do not have a certain requirement as to how many sites the crawler must crawl per day, but we eventually want a database that is very comparable to Yahoo’s and Google’s. By our own research, Google currently has over billions and

Crawler III–By Merrows on Oct 2–Max Bid: $2,000


To develop or enhance an existing C# web crawler to handle .asp sites. The target site is http://www.hudsonreed.co.uk. The crawler will get the images and da (more...) Keywords: Microsoft Windows, Database, Language Specific, Requirements, Operating Systems / Platforms, ASP, C#, Misc (software related), Microsoft Access, ASP .NET, Software Related (Includes Websites)
