We created a Chrome extension that uses AI to detect tabular or listing-type data on web pages. Such data can be scraped into a CSV or Excel file, with no coding skills required. The extension can also click “Next” page links or buttons and collect data from multiple pages into one file. It runs entirely in the user’s browser and does not send data to Web Robots. We tested it on popular sites including Yelp, Amazon, eBay, Best Buy, Craigslist, Walmart, Etsy, Home Depot, and Yellow Pages, and it worked on all of them.
How to use it:
- Open the first page of listing results (products, directory, etc.) in your browser.
- Activate the extension
- The extension will guess where your data is. If you are not happy with the result, use the “Try another table” button to guess again.
- Download a CSV or Excel file from the first page if that is all you need. Otherwise, click Locate “Next” button and mark the “Next” link or button on the website.
- Click “Start crawling” to crawl through multiple pages of the website. The extension will show statistics on what is being collected.
- Download Excel or CSV file at any time during the crawl.
- Clean up the Excel or CSV file. It will most likely contain some unwanted extra fields that were extracted from the page, and the column names will most likely need to be renamed as well.
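The clean-up step can also be scripted instead of done by hand in a spreadsheet. The following is a minimal sketch, assuming a downloaded CSV whose header row contains generic column names; the names `col-1`, `col-2`, `col-3` and the helper `clean_csv` are hypothetical and are not produced or provided by the extension.

```python
import csv
import io


def clean_csv(text, keep):
    """Return CSV text containing only the columns listed in `keep`.

    `keep` maps an original column name to the new name it should get.
    Columns not listed in `keep` are dropped.
    """
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(keep.values()), lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Copy only the wanted fields, under their new names.
        writer.writerow({new: row.get(old, "") for old, new in keep.items()})
    return out.getvalue()


# Hypothetical export: two useful columns and one unwanted extra field.
raw = "col-1,col-2,col-3\nWidget,9.99,ad-banner\nGadget,19.50,ad-banner\n"
cleaned = clean_csv(raw, {"col-1": "name", "col-2": "price"})
print(cleaned)
```

For a real file, the text would be read from the downloaded CSV, and the mapping would list whichever columns are worth keeping.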
Try another table – the AI guesses an alternative table if the initial guess was not what you wanted.
Locate “Next” button – press this and mark the location of the “Next” button or link on the website. This will be used to scrape data from multiple pages into one file.
Crawl delay – time in seconds to wait before going to the next page. The default value is 1 second. It can be increased when pages load information dynamically.
CSV and XLSX – file download buttons. They become active as soon as any data is found.
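To illustrate how the “Next” button and the crawl delay work together, here is a rough sketch of a paginated crawl loop. It is not the extension's actual implementation; the function `crawl`, its parameters, and the in-memory `SITE` dictionary standing in for real pages are all assumptions made for the example.

```python
import time


def crawl(first_page, fetch, delay_seconds=1.0, max_pages=100, sleep=time.sleep):
    """Follow "Next" links starting at first_page and merge rows from every page.

    `fetch(page)` must return (rows, next_page), where next_page is None
    on the last page. `delay_seconds` plays the role of the crawl delay.
    """
    rows = []
    page = first_page
    visited = 0
    while page is not None and visited < max_pages:
        page_rows, next_page = fetch(page)
        rows.extend(page_rows)
        visited += 1
        if next_page is not None:
            # Wait before moving on, giving dynamic pages time to load.
            sleep(delay_seconds)
        page = next_page
    return rows


# Hypothetical two-page site: each entry is (rows on the page, next page).
SITE = {
    "page1": ([{"name": "A"}, {"name": "B"}], "page2"),
    "page2": ([{"name": "C"}], None),
}

rows = crawl("page1", SITE.__getitem__, delay_seconds=0)
print(rows)
```

The `max_pages` cap is a safeguard added in this sketch so that a site whose “Next” link never ends cannot keep the loop running forever.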