How to scrape JavaScript webpages using ProxyCrawl in Python

Learn a simple way to scrape JavaScript webpages

Lynn G. Kwong
5 min read · Feb 4, 2022

With the growing popularity of modern JavaScript frameworks such as React, Angular, and Vue, more and more websites are built dynamically with JavaScript. This poses a challenge for web scraping: the content is generated in the browser at run time, so the HTML markup we want is not present in the page source returned by the server. We therefore cannot scrape these JavaScript webpages directly and need to render them into regular HTML markup first. In this article, we will show how to render JavaScript webpages using ProxyCrawl, a handy web service that helps with scraping them.

Image by gTheMesh on Pixabay.

The demo site used in this tutorial is http://quotes.toscrape.com/js/. If you open this website, right-click on the page, and select “View page source”, you will see only JavaScript code rather than the rendered HTML markup. Luckily, on this site the data is embedded in the <script> tag. On many websites, however, especially those built with Angular, the source contains little or no data, and you must render the page before you can scrape it.
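The "View page source" check can also be done in Python. The following is a minimal sketch, not code from this tutorial: `SAMPLE_SOURCE` is a hand-written stand-in for the kind of source such a page serves, and the `var data = [...]` layout, the `quote` class name, and the helper names are assumptions made for illustration. The real page may be structured differently.

```python
import json
import re
from html.parser import HTMLParser

# Hand-written stand-in for an unrendered JavaScript page (an assumption,
# not a copy of the demo site's actual source).
SAMPLE_SOURCE = """
<html><body>
<script>
    var data = [
        {"text": "A sample quote.", "author": {"name": "Some Author"}, "tags": ["demo"]}
    ];
    for (var i in data) {
        document.write("<div class='quote'>" + data[i].text + "</div>");
    }
</script>
</body></html>
"""


class QuoteDivCounter(HTMLParser):
    """Counts real <div class="quote"> elements in the static markup.

    HTMLParser treats the body of <script> as plain text, so markup that
    only appears inside JavaScript strings is not counted.
    """

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "quote" in classes:
            self.count += 1


def count_quote_divs(html):
    """Return how many quote elements exist before any JavaScript runs."""
    counter = QuoteDivCounter()
    counter.feed(html)
    return counter.count


def extract_embedded_data(html):
    """Return the JSON array assigned to `var data`, or None if absent."""
    match = re.search(r"var\s+data\s*=\s*(?=\[)", html)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever JavaScript follows it.
    data, _ = json.JSONDecoder().raw_decode(html, match.end())
    return data


if __name__ == "__main__":
    print("Static quote elements:", count_quote_divs(SAMPLE_SOURCE))
    print("Embedded data:", extract_embedded_data(SAMPLE_SOURCE))
```

To try this on a live page, pass the downloaded source (for example, from `urllib.request.urlopen(url).read().decode()`) to the two helpers instead of `SAMPLE_SOURCE`. When the embedded-data shortcut returns `None`, the page has to be rendered before it can be scraped.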


I’m a Software Developer (https://medium.com/@lynn-kwong) keen on sharing thoughts, tutorials, and solutions for the best practice of software development.