Web scraping is a powerful technique to extract information from websites and analyze them automatically. Although you may manually do this, it can be a tedious and time-consuming task. Web scraping tools make the process faster and more efficient, all the while costing less.
Interestingly, Google Sheets has the potential to be your one-stop web scrapping tool, thanks to its IMPORTXML function. With IMPORTXML, you may easily scrape data from web pages and use it for analysis, reporting, or any other data-driven tasks.

The IMPORTXML Function in Google Sheets
Google Sheets provides a built-in function called IMPORTXML, which lets you import data from web formats such as XML, HTML, RSS, and CSV. This function can be a game-changer if you want to collect data from websites without resorting to complex coding.
Here’s the basic syntax of IMPORTXML:
XPath (XML Path Language) is a language used to navigate XML documents, including HTML—allowing you to specify the location of data within an HTML structure. Understanding XPath queries is essential to using IMPORTXML properly.
Understanding XPath
XPath provides various functions and expressions to navigate and filter data within an HTML document. A comprehensive XML and XPath guide is beyond this article’s scope, so we’ll settle for some essential XPath concepts:
How to Extract XPath From a Website
So far, you know the IMPORTXML syntax, you know the website’s URL, and you know which element you want to extract. But how do you get the element’s XPath?
You don’t have to know a website’s structure by heart to extract its data with IMPORTXML. In fact, every browser has a nifty tool that lets you instantly copy any element’s XPath.

The Inspect Element toollets you extract the XPath from website elements. Here’s how:
Now that you’ve got all you need, it’s time to see IMPORTXML in action and scrape some links.

How to Scrape Links From a Website With IMPORTXML
You can use IMPORTXML to scrape all sorts of data from websites. This includes links, videos, images, and almost any element of the website. Links are one of the most prominent elements in web analysis, and you can learn a lot about a website just by analyzing the pages it links to.
IMPORTXML lets you quickly scrape links in Google Sheets and then further analyze them using the various functions Google Sheets offers.

1. Scraping All Links
To scrape all links from a webpage, you can use the following formula:
This XPath query selects allhrefattributes ofaelements, effectively extracting all the links on the page.

The formula above scrapes all links in a Wikipedia article.
It’s a good idea to input the web page’s URL in a separate cell and then refer to that cell. This will prevent your formula from getting too long and unwieldy. You can do the same with the XPath query.
2. Scraping All Link Texts
To extract the text of the links along with their URLs, you can use:
This query selects all elements, and you can extract the link text and URLs from the results.
The formula above gets the link texts in the same Wikipedia article.
How to Scrape Specific Links From a Website With IMPORTXML
Sometimes, you may need to scrape specific links based on criteria. For example, you might be interested in extracting links that contain a particular keyword or links that are located in a specific section of the page.
With proper knowledge of XPath, you may pinpoint any element you’re looking for.
1. Scraping Links Containing a Keyword
To scrape links that contain a specific keyword, you can use the contains() XPath function:
This query selects href attributes of elements where the href contains the specified keyword.
The formula above scrapes all links that contain the word record in their text within a sample Wikipedia article.
2. Scraping Links Within a Section
To scrape links from a particular section of a page, you can specify the section’s XPath. For example:
This query selects href attributes of elements within div elements with the class “section.”
Similarly, the formula below selects all links within the div class that have the mw-content-container class:
It’s worth noting that you can use IMPORTXML for more than web scraping. You can use the IMPORT family of functions toimport data tables from websites to Google Sheets.
Although Google Sheets and Excel share most of their functions, the IMPORT family of functions is unique to Google Sheets. You’ll need to consider other methods toimport data from websites to Excel.
Simplify Web Scraping with Google Sheets
Web scraping with Google Sheets and the IMPORTXML function is a versatile and accessible way to collect data from websites.
By mastering XPath and understanding how to create effective queries, you’re able to unlock the full potential of IMPORTXML and gain valuable insights from web resources. So, start scraping and take your web analysis to the next level!