Unit 3: Finishing the Web Crawler

In the previous units we learned how to extract the first link from a webpage, and then all of its links. In this unit we will write a web crawler that follows those extracted links. The main new topic for Unit 3 is structured data. By the end of this unit, you will have finished building a simple web crawler.

We will approach structured data through lists, one of the most widely used data structures. Lists will help us do much of the necessary book-keeping. We will discuss the structure of lists in Python and some common operations on them, and then put these features to use to complete the web crawler. So let's get started with lists.
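
As a preview of the kind of book-keeping lists make possible, here is a minimal sketch, not the crawler this unit builds. It assumes a tiny made-up "web" stored in a dictionary (FAKE_WEB) instead of real pages, and the names crawl, to_crawl and crawled are chosen only for illustration. One list holds the pages still to visit and another holds the pages already visited.

```python
# Basic list operations.
links = ["a.html", "b.html"]
links.append("c.html")          # add an element at the end
first = links[0]                # indexing starts at 0
count = len(links)              # number of elements
seen_b = "b.html" in links      # membership test

# Hypothetical stand-in for the web: page -> links found on that page.
FAKE_WEB = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
}

def crawl(seed):
    """Follow links starting from seed, using two lists for book-keeping."""
    to_crawl = [seed]   # pages we still need to visit
    crawled = []        # pages we have already visited
    while to_crawl:
        page = to_crawl.pop()            # take the most recently added page
        if page not in crawled:
            for link in FAKE_WEB.get(page, []):
                if link not in to_crawl:
                    to_crawl.append(link)
            crawled.append(page)
    return crawled

print(crawl("a.html"))
```

Checking `page not in crawled` before visiting is what keeps the loop from running forever when pages link back to each other, as a.html and c.html do here.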
