What is your philosophy?
Our goal is to provide an end-to-end web scraping service that allows our clients to focus on their business model. Because the Internet evolves so rapidly and the means of obtaining information from it are so volatile, a company doing this in-house would need to worry about hiring individuals, writing scripts, purchasing generic software, and learning how to use and reconfigure that software almost monthly.
Do I have to install anything on my servers or configure anything?
No. All of our web scraping programs run from our data centers. Our experienced team of programmers finds and scripts the best way to crawl and scrape data from the site. All you have to do is give us the targeted website and we'll get you the data.
Another programming company told me that scraping website abc.com is impossible for various reasons. Can you do it?
We'd sure like to try. Much of our business comes from clients that have been turned away by other programming companies for various reasons. If we can't extract it, there is never a charge to you. It is a rare occasion that we have not been able to harvest the data from a targeted website.
We have dozens of sites we need scraped. Can you handle projects that large?
Yes. Although a majority of our clients require us to scrape only one or two websites or limited amounts of data, our technology, infrastructure, and team are built to scale for enterprise projects. A few of our projects have required us to scrape 100+ websites and over 40 million records.
How are the web scraping costs determined?
Costs are based on the scope and frequency of the data, the volume of data, and the complexity of the target website. There is no setup fee, and there are no hidden charges beyond what is quoted.
After I give you a site, how long does it take to get the data?
In most cases, we will have set up the scraper within 2-3 business days, and the data will be available the next business day. For complex sites or sites with a large amount of data, it could take up to a week to scrape the data.
How frequently can you provide me with the data?
The frequency depends on the underlying structure of the target website. Some responsive sites allow us to read over 200,000 pages per day, while others allow no more than a few thousand per day. We would need to review the project before we can reach a definite conclusion.
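To make the arithmetic concrete, here is a rough sketch of how a page rate translates into turnaround time. The function name and the 3,000 pages per day figure (standing in for "a few thousand") are assumptions for illustration only; real estimates depend on a review of the project.

```python
import math


def estimate_days(total_pages: int, pages_per_day: int) -> int:
    """Return the number of whole days needed to read total_pages
    at a steady rate of pages_per_day (rounded up)."""
    if pages_per_day <= 0:
        raise ValueError("pages_per_day must be positive")
    return math.ceil(total_pages / pages_per_day)


# Hypothetical site with one million pages.
fast_site_days = estimate_days(1_000_000, 200_000)  # responsive site
slow_site_days = estimate_days(1_000_000, 3_000)    # assumed slow site
```

At these assumed rates, the same one million pages would take 5 days on the responsive site and 334 days on the slow one, which is why the refresh frequency has to be settled per project.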
What are the various formats in which you can deliver the data to me? And how?
We can deliver the scraped data in any format you need, such as MS Access, an MS-SQL backup file, Microsoft Excel, a CSV (comma- or tab-separated) file, XML, or a MySQL script.
We can send you files through e-mail for small amounts of data (less than 10MB), or make them available on our FTP servers or push them to an FTP server you specify.
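As an illustration of two of the formats listed above, the following sketch serializes the same records as CSV (comma- or tab-separated) and as XML using only the Python standard library. The record fields, the function names, and the `records`/`record` element names are made up for the example and do not describe our actual delivery layout.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_csv(records, delimiter=","):
    """Serialize a list of dicts as delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records):
    """Serialize a list of dicts as a flat XML document."""
    root = ET.Element("records")  # hypothetical element names
    for record in records:
        node = ET.SubElement(root, "record")
        for key, value in record.items():
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample data.
sample = [{"name": "Widget", "price": "9.99"}]
csv_text = to_csv(sample)
tsv_text = to_csv(sample, delimiter="\t")
xml_text = to_xml(sample)
```

Passing `delimiter="\t"` gives the tab-separated variant of the same file.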
Can a website block scraping?
Yes, a website can block scraping. There are a number of ways to do this, such as adding an image verification (CAPTCHA) system before results are displayed, or monitoring traffic and blocking the IP addresses from which the requests are coming.
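To show what traffic monitoring can look like from the website's side, here is a minimal sketch of a per-IP sliding-window counter. It is one possible approach under assumed thresholds, not a description of how any particular site works; the class name, limits, and IP addresses are all hypothetical.

```python
from collections import defaultdict, deque


class IpRateMonitor:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip, now):
        """Return True if the request is allowed, False if it is blocked."""
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Assumed policy: 3 requests per 10 seconds from a single address.
monitor = IpRateMonitor(max_requests=3, window_seconds=10.0)
decisions = [monitor.allow("203.0.113.5", t) for t in (0.0, 1.0, 2.0, 3.0)]
```

With this assumed policy, the fourth request inside the window is refused, while requests from a different address are counted separately.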
Can you get around blocking?
Yes, we can get around blocking. We've been improving the core technology for more than six years and have learned to overcome many issues that most newer and off-the-shelf solutions have not.
Can you scrape non-English websites?
Yes. We have scraped Spanish, Chinese, German, and other non-English sites as well.
Are you able to scrape the comparison shopping engines (CSEs) such as Google product search?
Yes. We have scraped similar sites for a few of our clients.
Do you have limits on the types of sites/data that you will collect?
We will not consider any projects that target websites related to gambling, lotteries, or pornography, or that otherwise contain "adult" or illegal content. We reserve the right to refuse any scraping project at any time.
Is there any difference between Data Extraction, Web Scraping, Web Harvesting, Data Mining etc.?
When the terms are used to mean automating the task of manually copying and pasting information from a website, they are all the same. Web scraping involves using computer software to simulate a human browsing the web.
Is web scraping legal?
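As a small illustration of automating the copy-and-paste step, the sketch below reads the cells of an HTML table into rows, using only the Python standard library. The HTML snippet and its values are made up, the class and function names are hypothetical, and a real scraper would also need to fetch the page and cope with much messier markup.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row being read, if any
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def extract_rows(html):
    """Return the table cells in html as a list of rows of strings."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up page fragment with made-up values.
page = (
    "<table>"
    "<tr><td>USD</td><td>1.00</td></tr>"
    "<tr><td>EUR</td><td>0.92</td></tr>"
    "</table>"
)
rows = extract_rows(page)
```

The result is the same information a person would have copied by hand from the page, now in a structure that can be saved or processed further.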
When it comes to scraping public information, there is generally no legal issue. There is nothing illegal about grabbing exchange rates from remote sites or scraping thousands or even millions of documents, movie files, and PDFs from other sites. Some websites, however, restrict web scraping in their terms of use, and to this day the legality of web scraping remains ambiguous in such cases. The Danish Maritime and Commercial Court (Copenhagen) has found that web scraping is not in conflict with the database directive of the European Union. Within the United States, many web scraping cases have been dismissed. However, in 2008, an Irish airline filed a suit against a website that was scraping its ticket availability information to sell tickets. The courts have yet to release a verdict in this case.