Writing a crawler

Let's look at the following structure, which is very similar to what we had when discussing the Iterator design pattern:

Page(Container(Image(),
               Link(),
               Image()),
     Table(),
     Link(),
     Container(Table(),
               Link()),
     Container(Image(),
               Container(Image(),
                         Link())))

The Page is a container for other HtmlElements, but it is not an HtmlElement itself. A Container holds other containers, tables, links, and images. An Image holds its link in the src attribute, whereas a Link uses the href attribute.
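The relationships described above could be modeled roughly as follows. This is only a sketch: the class names come from the text, but the constructor shapes, the use of vararg, and the placeholder default URLs are assumptions made for illustration.

```kotlin
// Assumed element model; the text names these types but does not show their definitions.
interface HtmlElement

// A Container may hold other containers, tables, links, and images.
class Container(vararg val elements: HtmlElement) : HtmlElement

// Placeholder default values (assumptions) so that Image() and Link() can be called without arguments.
class Image(val src: String = "http://some.image") : HtmlElement
class Link(val href: String = "http://some.link") : HtmlElement

class Table : HtmlElement

// Page holds HtmlElements but deliberately does not implement HtmlElement.
class Page(vararg val elements: HtmlElement)
```

Because Page is not an HtmlElement, it cannot be nested inside a Container, which keeps it at the root of the tree.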

We start by creating a function that will receive the root of our object tree, a Page in this case, and return a list of all available links:

fun collectLinks(page: Page): List<String> {
    // No need for intermediate variable there
    return LinksCrawler().run ...
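The listing above is cut off, so the body of LinksCrawler is not visible here. As a rough illustration of what such a function needs to do, the following sketch walks the tree recursively and gathers every src and href value. The names LinksCrawlerSketch and collectAllLinks, the element model, and the placeholder URLs are all hypothetical, and the type-check approach shown is one possible technique rather than necessarily the one the chapter goes on to use.

```kotlin
// Assumed element model; the text names these types but does not show their definitions.
interface HtmlElement
class Container(vararg val elements: HtmlElement) : HtmlElement
class Image(val src: String = "http://some.image") : HtmlElement
class Link(val href: String = "http://some.link") : HtmlElement
class Table : HtmlElement
class Page(vararg val elements: HtmlElement)

// Hypothetical crawler: visits the tree depth-first and collects src/href values.
class LinksCrawlerSketch {
    private val links = mutableListOf<String>()

    fun run(page: Page): List<String> {
        page.elements.forEach { visit(it) }
        return links.toList() // return a read-only copy of the collected links
    }

    private fun visit(element: HtmlElement) {
        when (element) {
            is Container -> element.elements.forEach { visit(it) } // recurse into children
            is Image -> links.add(element.src)
            is Link -> links.add(element.href)
            else -> Unit // Table (and anything else) carries no link
        }
    }
}

// No intermediate variable is needed: create the crawler and run it in one expression.
fun collectAllLinks(page: Page): List<String> = LinksCrawlerSketch().run(page)
```

Called with the Page structure shown earlier, collectAllLinks would visit every Image and Link regardless of nesting depth. A drawback of this approach is that the crawler must know every element type up front, so adding a new link-bearing element means editing the when branch.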
