How to scrape JavaScript webpages using Selenium in Python

Render JavaScript webpages by yourself

Lynn G. Kwong
5 min read · Feb 5, 2022

Due to the increasing popularity of modern JavaScript frameworks such as React, Angular, and Vue, more and more websites are now built dynamically with JavaScript. This poses a challenge for web scraping because the content is generated in the browser at runtime, so its HTML markup is not present in the page source returned by the server. Therefore, we cannot scrape these JavaScript webpages directly and need to render them as regular HTML markup first.

In a previous post, we introduced how to scrape JavaScript webpages with ProxyCrawl, a handy web service built for that purpose. However, ProxyCrawl is not free and can be costly if a large number of JavaScript webpages need to be scraped frequently. In this post, we will introduce how to use Selenium to render JavaScript webpages. Selenium is an open-source library for automating web browsers, primarily used for testing web applications. Here, however, we will not use it for frontend testing, but only to render a JavaScript webpage as HTML markup, which can then be used for web scraping.
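As a rough illustration of the idea, the sketch below loads a page in a browser controlled by Selenium and returns the rendered HTML. It is a minimal sketch under stated assumptions, not necessarily the setup used later in this post: the helper names `render_html`, `make_headless_chrome`, and `scrape_demo` are made up for this example, and it assumes Selenium 4 with Chrome and a matching driver available on the machine.

```python
def render_html(url, driver):
    """Load `url` in the given WebDriver and return the rendered HTML.

    `driver` can be any object with a `get()` method and a `page_source`
    attribute, which is what Selenium WebDriver instances provide.
    """
    driver.get(url)            # the browser fetches the page and runs its JavaScript
    return driver.page_source  # HTML of the DOM as the browser currently sees it


def make_headless_chrome():
    """Create a headless Chrome driver (assumes Chrome and its driver are installed)."""
    # Imported here so that render_html() can be reused without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # run Chrome without opening a window
    return webdriver.Chrome(options=options)


def scrape_demo():
    """Hypothetical entry point: render the demo site and return its HTML."""
    driver = make_headless_chrome()
    try:
        return render_html("http://quotes.toscrape.com/js/", driver)
    finally:
        driver.quit()  # always close the browser, even if loading fails
```

Calling `scrape_demo()` should return markup that can be passed to an HTML parser. Pages that load their content asynchronously may also need an explicit wait before `page_source` is read.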

The demo site to be used in this tutorial is http://quotes.toscrape.com/js/. If you open this website, right-click on the webpage and select “View page source”, you can only see some JavaScript code and not the…


Written by Lynn G. Kwong

I’m a Software Developer (https://youtube.com/@kwonglynn) keen on sharing thoughts, tutorials, and solutions for the best practice of software development.
