Web Data Extractor Pro 3.10

License: Free Trial
File size: 8.31 MB
User rating: 4.3/5 (15 votes)

Web Data Extractor Pro (WDE Pro) is a web scraping tool designed for gathering various types of data in bulk. It can harvest URLs, phone and fax numbers, and email addresses, as well as meta tag information and body text. A special feature of WDE Pro is custom extraction of structured data. This high-speed, multithreaded program works in three ways: by submitting keywords to search engines, by spidering a website, or by processing a list of URLs from a file. You can also allow it to follow external links from the original pages and to go as deep into the URL paths as you need, so a crawl can extend well beyond the starting sites. By searching through multiple layers of websites, Web Data Extractor is well suited to harvesting structured information and specific data types related to the keywords you provide.
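The spidering behaviour described above (follow links to a chosen depth, optionally including external links, and collect email addresses along the way) can be illustrated with a short Python sketch. This is not WDE Pro's code or API. The names `crawl`, `fetch`, `max_depth`, `follow_external` and the `PAGES` sample data are all hypothetical, and the email pattern is a deliberately simple assumption. Pages are served from an in-memory dictionary so the example runs without network access.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Simplified email pattern (an assumption; real-world parsers handle more cases).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=2, follow_external=False):
    """Breadth-first crawl from start_url, at most max_depth links deep.

    fetch(url) must return the page HTML as a string, or None if unavailable.
    Returns (visited_urls, emails) as two sets.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    emails = set()
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        emails.update(EMAIL_RE.findall(html))
        if depth >= max_depth:
            continue  # depth limit reached: extract data but follow no links
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link in seen:
                continue
            if not follow_external and urlparse(link).netloc != start_host:
                continue  # skip links that leave the starting site
            seen.add(link)
            queue.append((link, depth + 1))
    return seen, emails


# Hypothetical sample site, held in memory instead of fetched over HTTP.
PAGES = {
    "http://example.test/": '<a href="/about">About</a> <a href="http://other.test/">Other</a>',
    "http://example.test/about": 'Contact: info@example.test <a href="/team">Team</a>',
    "http://example.test/team": "lead@example.test",
    "http://other.test/": "sales@other.test",
}

urls, emails = crawl("http://example.test/", PAGES.get, max_depth=1)
print(sorted(urls))
print(sorted(emails))
```

Raising `max_depth` or setting `follow_external=True` widens the crawl, which mirrors the trade-off in the description: deeper and broader scans find more data but visit many more pages.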

VERSION HISTORY

  • Version 3.10 posted on 2020-01-06
    Significantly improved the email address parser; Updated the list of user agents; Added the "Retry non-extracted URLs" and "Enhanced Human factor" options in Connection for more effective work with target websites; Added the "Check each X minutes" and "Renew after it has read Y number of links" options in Proxy Servers for more effective work with proxies; Made many improvements based on customer reviews
  • Version 3.9 posted on 2018-12-30
    Removed outdated/broken links from the list of search engines, which increased the speed of the software in "Search Engines" mode; Significantly improved the email address parser, especially for emails hidden by JavaScript (JS); Improved the option to import your own proxy servers from CSV files; Improved work with HTTPS websites; Improved performance when working with large URL lists; Improved the "Cookie Capture" option; Made various minor fixes and improvements based on customer feedback
  • Version 3.8 posted on 2017-12-29
    Added the ability to load and extract information from PDF files; Added the ability to load the license file directly from the UI form once the trial period has expired (while the trial is still active, the license file can be loaded from the Options -> About form instead); Significantly improved work through proxy servers; Improved the parser of encoded JS emails; Added the "Re-start URL" context menu item to the "Bad URLs" list; Improved work with the software's internal data repository; Added the ability to delete a session along with all its data and service files. The software also automatically compresses its internal repository to reduce the required disk space; Added the "Initial Referrer" text field to the UI. Some websites display different information depending on which external site the visitor arrives from, and the "Initial Referrer" field lets you specify the web address of such a site; Made various minor changes and improvements based on customer feedback
  • Version 3.7 posted on 2017-02-28
    Improved the "Search Engines" mode; Improved the "Remove HTML Tags" and "Page must contain the following text to extract data" filters; Added the "Use country IP filter" filter, which excludes results from servers that are not located (by geolocation) in the country selected in the "Search Engines" option; Significantly improved the email parser and the Custom Builder parser; Made general improvements in data detection and extraction; Made various minor changes and improvements based on customer feedback
  • Version 3.6 posted on 2016-08-22
    Added the "Get redirected URL" checkbox to the "Custom Data Editor" form to extract URLs (e.g. website addresses) that are presented through a redirect; Added the "Mark Non-Responding Proxies Like Inactive Automatically" checkbox. If a proxy server is found to be bad (not working) during a session, it is automatically marked as inactive and is no longer used in that session; Added the new "Use single line merge" option to merge data into a single string. For example, you can export t-shirt colors like: "T-Shirt", "Black, Yellow, Red, Green"; Significantly improved loading of public proxy servers from the Internet; Improved the "Human Factor" option; Improved the parser for email addresses hidden by JS; Improved the option for passing the Google CAPTCHA when searching for data via Google; Made various minor changes and improvements based on customer feedback
  • Version 2.0 posted on 2012-08-29
    Reworked the algorithm for determining the scan depth; Added resilience to physical damage of the database; Improved stream control, which has a positive impact on overall performance; Improved work with large keyword lists in "Search Engines" mode

Program Details