ProWebScraper has its own Selector for choosing the elements you want to scrape from a website. You simply click each element on the page to select and extract it.
In some cases, the elements you want to extract from the page are hidden or cannot be selected in the Selector.
In those cases, you can use an alternative: a manual XPath or CSS selector that selects those elements on the page.
Here are a few cases where a manual XPath might be necessary (or at least a lot more elegant). These include when...
- you can't select a specific element in the Selector
- you need data, such as latitude and longitude, from a map where it isn't readily available
- you need data from hidden elements
- you need data from a drop-down list or an undisplayed tab
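The cases above share one property: the data is present in the page markup even though it cannot be clicked. The sketch below illustrates this outside ProWebScraper, using Python's standard-library parser on a made-up page fragment. The element ids (`phone`, `size`) and their values are assumptions for illustration only. The leading `.` in each path is required by Python's ElementTree; in an XPath field you would normally write the path starting with `//`.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, kept well-formed so the standard-library
# XML parser accepts it. The ids and values are invented for this sketch.
PAGE = """
<html>
  <body>
    <div id="phone" style="display:none">555-0100</div>
    <select id="size">
      <option value="s">Small</option>
      <option value="m">Medium</option>
    </select>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# A hidden element is still part of the markup, so an XPath can reach it.
hidden_phone = root.find(".//div[@id='phone']").text

# Every option of a drop-down list, including those not currently displayed.
sizes = [option.text for option in root.findall(".//select[@id='size']/option")]

print(hidden_phone)
print(sizes)
```

Because XPath addresses elements by their position and attributes in the markup rather than by what is visible on screen, the same approach applies to undisplayed tabs.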
If you are not familiar with XPath, you will want to start with the ultimate guide to XPath.
How to set manual XPath in scraper
To use XPath to select an element:
- Go to Selector → create a new column by clicking the Add New Column button, then click the Column Options dropdown arrow and select Use Manual XPath.
- The Use Manual XPath dialogue box appears.
- Enter Manual path / XPath: enter the XPath of the element you want to select.
- Enter attribute: enter the name of the attribute whose value you want to extract from the selected element (this field is optional).
- Cancel: Closes the dialogue box without saving your XPath.
- Apply: Applies your XPath.
The following examples demonstrate possible ways to select elements using manual XPath.
Example: Scraping Hidden Latitude/Longitude
This example demonstrates how to use a manual XPath to extract hidden latitude and longitude values from a web page.
- Enter the following details in the Use Manual XPath dialogue box:
- Enter XPath: //div[@id="property-map"]
- Enter attribute: data-lat
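To see what this XPath and attribute pair selects, here is a minimal sketch using Python's standard library on a made-up fragment. Only the `property-map` id and the `data-lat` attribute come from the example above; the coordinate values and the `data-lng` attribute name are assumptions, so check the actual page source for the attribute that holds the longitude. The leading `.` in the path is required by Python's ElementTree and is not part of the XPath you enter in the dialogue box.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the coordinate values and the data-lng attribute
# name are assumptions made for this sketch.
PAGE = """
<html>
  <body>
    <div id="property-map" data-lat="40.7128" data-lng="-74.0060"></div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# XPath from the example: //div[@id="property-map"]
map_div = root.find(".//div[@id='property-map']")

# The attribute field names which attribute value to return
# instead of the element's (here empty) text.
latitude = map_div.get("data-lat")
longitude = map_div.get("data-lng")  # assumed attribute name

print(latitude, longitude)
```

In the scraper this would likely correspond to two columns that use the same XPath, one per attribute.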