Chapter 4. Browser-Agnostic Features

This chapter reviews those features of Selenium WebDriver that are interoperable across different web browsers. In this group, a relevant multipurpose feature is executing JavaScript. Also, the Selenium WebDriver API allows configuring timeouts for page and script loading. Another convenient feature is taking screenshots of the browser screen, or only the portion corresponding to a given element. We can also manage different aspects of the controlled browser using WebDriver, such as browser size and position, history, or cookies. In addition, WebDriver provides various assets for controlling specific web elements, such as dropdown lists (i.e., HTML select fields and data lists), navigation targets (i.e., windows, tabs, frames, and iframes), or dialog boxes (i.e., alerts, prompts, confirmations, and modal dialogs). Finally, we discover how to handle local and session data using web storage, implement event listeners, and use the exceptions provided by the Selenium WebDriver API.

Executing JavaScript

JavaScript is a high-level programming language supported by all major browsers. We can use JavaScript on the client side of web applications for a wide variety of operations, such as DOM manipulation, user interaction, handling requests and responses from remote servers, or working with regular expressions, among many other functions. Luckily for test automation, Selenium WebDriver allows injecting and executing arbitrary pieces of JavaScript. To that aim, the Selenium WebDriver API provides the interface JavascriptExecutor. Table 4-1 introduces the available public methods in this interface grouped into three categories: synchronous, pinned, and asynchronous scripts. The following subsections provide more details and illustrate their use through different examples.

Table 4-1. JavascriptExecutor methods
| Category | Method | Return | Description |
|---|---|---|---|
| Synchronous scripts | executeScript(String script, Object... args) | Object | Execute JavaScript code on the current page. |
| Pinned scripts | pin(String script) | ScriptKey | Attach a piece of JavaScript to a WebDriver session. The pinned scripts can be used multiple times while the WebDriver session is alive. |
| Pinned scripts | unpin(ScriptKey key) | void | Detach a previously pinned script from the WebDriver session. |
| Pinned scripts | getPinnedScripts() | Set<ScriptKey> | Collect all pinned scripts (each one identified by a unique ScriptKey). |
| Pinned scripts | executeScript(ScriptKey key, Object... args) | Object | Call a previously pinned script (identified by its ScriptKey). |
| Asynchronous scripts | executeAsyncScript(String script, Object... args) | Object | Execute JavaScript code (typically an asynchronous operation) on the current page. The difference from executeScript() is that scripts executed with executeAsyncScript() must explicitly signal their termination by invoking a callback function. By convention, this callback is injected into the script as its last argument. |

Any driver object that inherits from the class RemoteWebDriver also implements the JavascriptExecutor interface. Therefore, when using a major browser (e.g., ChromeDriver, FirefoxDriver, etc.) declared using the generic WebDriver interface, we can cast it to JavascriptExecutor as shown in the following snippet. Then, we can use the executor (using variable js in the example) to invoke the methods presented in Table 4-1.

WebDriver driver = new ChromeDriver();
JavascriptExecutor js = (JavascriptExecutor) driver;

Synchronous Scripts

The method executeScript() of a JavascriptExecutor object allows executing a piece of JavaScript in the context of the current web page in a WebDriver session. The invocation of this method (in Java) blocks the control flow until the script terminates. Therefore, we typically use this method for executing synchronous scripts in a web page under test. The method executeScript() accepts two arguments:

String script

Mandatory JavaScript fragment to be executed. This code is executed in the body of the current page as an anonymous function (i.e., a JavaScript function without a name).

Object... args

Optional arguments for the script. These arguments must be one of the following types: number, boolean, string, WebElement, or a List of these types (otherwise, WebDriver throws an exception). These arguments are available in the injected script using the arguments built-in JavaScript variable.

When the script returns some value (i.e., the code contains a return statement), the Selenium WebDriver executeScript() method also returns a value in Java (otherwise, executeScript() returns null). The possible returned types are:

WebElement

When returning an HTML element

Double

For decimals

Long

For nondecimal numbers

Boolean

For boolean values

List<Object>

For arrays

Map<String, Object>

For key-value collections

String

For all other cases

The situations that require executing JavaScript with Selenium WebDriver are very heterogeneous. The following subsections review two cases where Selenium WebDriver does not provide built-in features, so we need to use JavaScript to automate them: scrolling a web page and handling a color picker in a web form.

Scrolling

As explained in Chapter 3, Selenium WebDriver allows impersonating different mouse actions, including click, right-click, or double-click, among others. Nevertheless, scrolling down or up a web page is not possible using the Selenium WebDriver API. Instead, we can achieve this automation easily by executing a simple JavaScript line. Example 4-1 shows a basic example using a practice web page (see the URL of this page in the first line of the test method).

Example 4-1. Test executing JavaScript to scroll down a given number of pixels
@Test
void testScrollBy() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/long-page.html"); // <1>
    JavascriptExecutor js = (JavascriptExecutor) driver; // <2>

    String script = "window.scrollBy(0, 1000);";
    js.executeScript(script); // <3>
}
1

Open a practice web page containing very long text (see Figure 4-1).

2

Cast the driver object to JavascriptExecutor. We will use the variable js to execute JavaScript in the browser.

3

Execute a piece of JavaScript code. In this case, we call the JavaScript function scrollBy() to scroll the document by a given amount (in this case, 1,000 px down). Notice that this fragment does not use return, and therefore, we do not receive any returned object in the Java logic. In addition, we are not passing any argument to the script.

Figure 4-1. Practice web page with long content

Example 4-2 shows another test using scrolling and the same example web page as before. This time, instead of moving a fixed number of pixels, we move the document scroll until the last paragraph in the web page.

Example 4-2. Test executing JavaScript to scroll down to a given element
@Test
void testScrollIntoView() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/long-page.html");
    JavascriptExecutor js = (JavascriptExecutor) driver;
    driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10)); // <1>

    WebElement lastElement = driver
            .findElement(By.cssSelector("p:last-child")); // <2>
    String script = "arguments[0].scrollIntoView();"; // <3>
    js.executeScript(script, lastElement); // <4>
}
1

To make this test robust, we specify an implicit timeout. Otherwise, the test might fail if the page is not entirely loaded when executing the subsequent commands.

2

We locate the last paragraph in the web page using a CSS selector.

3

We define the script to be injected into the page. Notice the script does not return any value, but as a novelty, it uses the first function argument to invoke the JavaScript function scrollIntoView().

4

We execute the previous script, passing the located WebElement as an argument. This element will be the first argument for the script (i.e., arguments[0]).

The last example of scrolling is infinite scroll. This technique enables the dynamic loading of more content when the user reaches the end of the web page. Automating this kind of web page is an instructive use case since it involves different aspects of the Selenium WebDriver API. For example, you can use a similar approach to crawl web pages using Selenium WebDriver. Example 4-3 shows a test using an infinite scroll page.

Example 4-3. Test executing JavaScript in an infinite scroll page
@Test
void testInfiniteScroll() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/infinite-scroll.html");
    JavascriptExecutor js = (JavascriptExecutor) driver;
    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10)); // <1>

    By pLocator = By.tagName("p");
    List<WebElement> paragraphs = wait.until(
            ExpectedConditions.numberOfElementsToBeMoreThan(pLocator, 0));
    int initParagraphsNumber = paragraphs.size(); // <2>

    WebElement lastParagraph = driver.findElement(
            By.xpath(String.format("//p[%d]", initParagraphsNumber))); // <3>
    String script = "arguments[0].scrollIntoView();";
    js.executeScript(script, lastParagraph); // <4>

    wait.until(ExpectedConditions.numberOfElementsToBeMoreThan(pLocator,
            initParagraphsNumber)); // <5>
}
1

We define an explicit wait since we need to pause the test until the new content is loaded.

2

We find the initial number of paragraphs on the page.

3

We locate the last paragraph of the page.

4

We scroll down into this element.

5

We wait until more paragraphs are available on the page.

Color picker

A color picker in HTML is an input type that allows users to select a color by clicking and dragging the cursor using a graphical area. The practice web form contains one of these elements (see Figure 4-2).

Figure 4-2. Color picker in the practice web form

The following code shows the HTML markup for the color picker. Notice that it sets an initial color value (otherwise, the default color is black).

<input type="color" class="form-control form-control-color" name="my-colors"
        value="#563d7c">

Example 4-4 illustrates how to interact with this color picker. Because the Selenium WebDriver API does not provide any asset to control color pickers, we use JavaScript. In addition, this test also illustrates the use of Color, a support class available in the Selenium WebDriver API for working with colors.

Example 4-4. Test executing JavaScript to interact with a color picker
@Test
void testColorPicker() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/web-form.html");
    JavascriptExecutor js = (JavascriptExecutor) driver;

    WebElement colorPicker = driver.findElement(By.name("my-colors")); // <1>
    String initColor = colorPicker.getAttribute("value"); // <2>
    log.debug("The initial color is {}", initColor);

    Color red = new Color(255, 0, 0, 1); // <3>
    String script = String.format(
            "arguments[0].setAttribute('value', '%s');", red.asHex());
    js.executeScript(script, colorPicker); // <4>

    String finalColor = colorPicker.getAttribute("value"); // <5>
    log.debug("The final color is {}", finalColor);
    assertThat(finalColor).isNotEqualTo(initColor); // <6>
    assertThat(Color.fromString(finalColor)).isEqualTo(red);
}
1

We locate the color picker by name.

2

We read the initial value of the color picker (it should be #563d7c).

3

We define a color to work with using the following RGBA components: red=255 (maximum value), green=0 (minimum value), blue=0 (minimum value), and alpha=1 (maximum value, i.e., fully opaque).

4

We use JavaScript to change the value selected in the color picker. Alternatively, we can change the selected color by invoking the statement colorPicker.sendKeys(red.asHex());.

5

We read the resulting value of the color picker (it should be #ff0000).

6

We assert that the final color differs from the initial value and, in the following line, that it equals the expected red.
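
The two color values handled in this test (#563d7c and #ff0000) use the #rrggbb notation, in which each pair of hexadecimal digits encodes one of the red, green, and blue components. As a rough illustration of that notation only, the following plain-Java sketch converts between RGB components and the hexadecimal form. It does not use Selenium, and it is not meant to mirror how the Color support class is implemented; the class HexColorSketch and its methods toHex() and fromHex() are hypothetical names introduced for this sketch.

```java
import java.util.Arrays;

class HexColorSketch {

    // Formats RGB components (0-255 each) as a lowercase #rrggbb string.
    static String toHex(int red, int green, int blue) {
        return String.format("#%02x%02x%02x", red, green, blue);
    }

    // Parses a #rrggbb string back into its red, green, and blue components.
    static int[] fromHex(String hex) {
        int rgb = Integer.parseInt(hex.substring(1), 16);
        return new int[] { (rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF };
    }

    public static void main(String[] args) {
        System.out.println(toHex(255, 0, 0)); // #ff0000
        System.out.println(Arrays.toString(fromHex("#563d7c"))); // [86, 61, 124]
    }
}
```

Seen this way, the initial value of the practice color picker corresponds to red=86, green=61, and blue=124.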

Pinned Scripts

Selenium WebDriver 4 allows you to pin scripts. This feature enables attaching JavaScript fragments to a WebDriver session, assigning a unique key to each snippet, and executing these snippets on demand (even on different web pages). Example 4-5 shows a test using pinned scripts.

Example 4-5. Test executing JavaScript as pinned scripts
@Test
void testPinnedScripts() {
    String initPage = "https://bonigarcia.dev/selenium-webdriver-java/";
    driver.get(initPage);
    JavascriptExecutor js = (JavascriptExecutor) driver;

    ScriptKey linkKey = js
            .pin("return document.getElementsByTagName('a')[2];"); // <1>
    ScriptKey firstArgKey = js.pin("return arguments[0];"); // <2>

    Set<ScriptKey> pinnedScripts = js.getPinnedScripts(); // <3>
    assertThat(pinnedScripts).hasSize(2); // <4>

    WebElement formLink = (WebElement) js.executeScript(linkKey); // <5>
    formLink.click(); // <6>
    assertThat(driver.getCurrentUrl()).isNotEqualTo(initPage); // <7>

    String message = "Hello world!";
    String executeScript = (String) js.executeScript(firstArgKey, message); // <8>
    assertThat(executeScript).isEqualTo(message); // <9>

    js.unpin(linkKey); // <10>
    assertThat(js.getPinnedScripts()).hasSize(1); // <11>
}
1

We attach a JavaScript fragment to locate an element in the web page. Notice that we could do the same with the standard WebDriver API. Nevertheless, we use this approach for demo purposes.

2

We attach another piece of JavaScript that returns whatever we pass to it as a first parameter.

3

We read the set of pinned scripts.

4

We assert the number of pinned scripts is as expected (i.e., 2).

5

We execute the first pinned script. As a result, we get the third link in the web page as a WebElement in Java.

6

We click on this link, which should correspond to the practice web form link. As a result, the browser should navigate to that page.

7

We assert the current URL is different from the initial one.

8

We execute the second pinned script. Notice that it is possible to run the pinned script even though the page has changed in the browser (since the script is attached to the session and not to a single page).

9

We assert the returned message is as expected.

10

We unpin one of the scripts.

11

We verify the number of pinned scripts is as expected (i.e., 1 at this point).

Asynchronous Scripts

The method executeAsyncScript() of the JavascriptExecutor interface allows executing asynchronous JavaScript in the context of a web page using Selenium WebDriver. In the same way as executeScript(), explained previously, executeAsyncScript() executes an anonymous function with the provided JavaScript code in the body of the current page. The execution of this function blocks the Selenium WebDriver control flow. The difference is that in executeAsyncScript(), we must explicitly signal the script termination by invoking a done callback. This callback is injected into the executed script as the last argument (i.e., arguments[arguments.length - 1]) of the corresponding anonymous function. Example 4-6 shows a test using this mechanism.

Example 4-6. Test executing asynchronous JavaScript
@Test
void testAsyncScript() {
    driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
    JavascriptExecutor js = (JavascriptExecutor) driver;

    Duration pause = Duration.ofSeconds(2); // <1>
    String script = "const callback = arguments[arguments.length - 1];"
            + "window.setTimeout(callback, " + pause.toMillis() + ");"; // <2>

    long initMillis = System.currentTimeMillis(); // <3>
    js.executeAsyncScript(script); // <4>
    Duration elapsed = Duration
            .ofMillis(System.currentTimeMillis() - initMillis); // <5>
    log.debug("The script took {} ms to be executed", elapsed.toMillis());
    assertThat(elapsed).isGreaterThanOrEqualTo(pause); // <6>
}
1

We define a pause time of 2 seconds.

2

We define the script to be executed. In the first line, we define a constant for the callback (i.e., the last script argument). After that, we use the JavaScript function window.setTimeout() to pause the script execution for a given amount of time.

3

We get the current system time (in milliseconds).

4

We execute the script. If everything works as expected, the test execution blocks in this line for two seconds (as defined in step 1).

5

We calculate the time required to execute the previous line.

6

We assert the elapsed time is as expected (typically, some milliseconds above the defined pause time).

Tip

You can find an additional example that executes an asynchronous script in “Notifications”.

Timeouts

Selenium WebDriver allows specifying three types of timeouts. We can use them by invoking the method manage().timeouts() in the Selenium WebDriver API. The first timeout is the implicit wait, already explained in “Implicit Wait” (as part of waiting strategies). The other options are page loading and script loading timeouts, explained next.

Page Loading Timeout

The page loading timeout provides a time limit to interrupt a navigation attempt. In other words, this timeout limits the time in which a web page is loaded. When this timeout (which has a default value of 300 seconds) is exceeded, an exception is thrown. Example 4-7 shows an example of this timeout. As you can see, this piece of code is a dummy implementation of a negative test. In other words, it checks unexpected conditions in the system under test (SUT).

Example 4-7. Test using a page loading timeout
@Test
void testPageLoadTimeout() {
    driver.manage().timeouts().pageLoadTimeout(Duration.ofMillis(1)); // <1>

    assertThatThrownBy(() -> driver
            .get("https://bonigarcia.dev/selenium-webdriver-java/"))
                    .isInstanceOf(TimeoutException.class); // <2>
}
1

We specify the minimum possible page loading timeout, which is one millisecond.

2

We load a web page. This invocation (implemented as a Java lambda) will fail since it is impossible to load that web page in less than one millisecond. For this reason, we use the AssertJ method assertThatThrownBy to verify that the lambda throws a TimeoutException.

Note

You can play with this test by removing the timeout declaration (i.e., step 1). If you do that, the test will fail since an exception is expected but not thrown.

Script Loading Timeout

The script loading timeout provides a time limit to interrupt a script that is being evaluated. This timeout has a default value of 30 seconds. Example 4-8 shows a test using a script loading timeout.

Example 4-8. Test using a script loading timeout
@Test
void testScriptTimeout() {
    driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
    JavascriptExecutor js = (JavascriptExecutor) driver;
    driver.manage().timeouts().scriptTimeout(Duration.ofSeconds(3)); // <1>

    assertThatThrownBy(() -> {
        long waitMillis = Duration.ofSeconds(5).toMillis();
        String script = "const callback = arguments[arguments.length - 1];"
                + "window.setTimeout(callback, " + waitMillis + ");"; // <2>
        js.executeAsyncScript(script);
    }).isInstanceOf(ScriptTimeoutException.class); // <3>
}
1

We define a script timeout of three seconds. This means that a script lasting for more than that time will throw an exception.

2

We execute an asynchronous script that pauses the execution for five seconds.

3

The script execution time is greater than the configured script timeout, resulting in a ScriptTimeoutException. Again, this example is a negative test, i.e., designed to expect this exception.
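
To build intuition for how an asynchronous script and the script timeout interact, the following sketch models the same control flow in plain Java. It is only an analogy and does not use Selenium: a CompletableFuture stands in for the done callback, a scheduled task stands in for window.setTimeout(), and an IllegalStateException stands in for ScriptTimeoutException. The class AsyncScriptAnalogy and the method runAndWait() are hypothetical names introduced for this sketch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class AsyncScriptAnalogy {

    // Plays the role of executeAsyncScript(): blocks the caller until the
    // simulated script invokes its callback, or gives up after timeoutMillis.
    static String runAndWait(long scriptMillis, long timeoutMillis) {
        ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<String> done = new CompletableFuture<>();
        try {
            // Analogous to window.setTimeout(callback, scriptMillis)
            timer.schedule(() -> { done.complete("finished"); },
                    scriptMillis, TimeUnit.MILLISECONDS);
            // Analogous to the script timeout configured for the session
            return done.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException("script timeout", e);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("unexpected failure", e);
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // The callback arrives well before the timeout.
        System.out.println(runAndWait(100, 2000));
        try {
            // The callback would arrive after the timeout has elapsed.
            runAndWait(2000, 100);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second call mirrors Example 4-8: the simulated script needs longer than the configured limit, so the caller receives an exception instead of a result.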

Screenshots

Selenium WebDriver is used mainly to carry out end-to-end functional testing of web applications. In other words, we use it to verify that web applications behave as expected by interacting with their user interface (i.e., using a web browser). This approach is very convenient to automate high-level user scenarios, but it also presents different difficulties. One of the main challenges in end-to-end testing is to diagnose the underlying cause of a failed test. Supposing the failure is legitimate (i.e., not induced by a poorly implemented test), the root cause might be diverse: the client side (e.g., incorrect JavaScript logic), the server side (e.g., internal exception), or the integration with other components (e.g., inadequate access to the database), among other reasons. One of the most pervasive mechanisms used in Selenium WebDriver for failure analysis is making browser screenshots. This section presents the mechanisms provided by the Selenium WebDriver API.

Tip

“Failure Analysis” reviews framework-specific techniques for detecting when a test has failed, so that we can then apply different failure analysis techniques, such as screenshots, recordings, and log gathering.

Selenium WebDriver provides the interface TakesScreenshot for making browser screenshots. Any driver object inheriting from RemoteWebDriver (see Figure 2-2) also implements this interface. Thus, we can cast a WebDriver object that instantiates one of the major browsers (e.g., ChromeDriver, FirefoxDriver, etc.) as follows:

WebDriver driver = new ChromeDriver();
TakesScreenshot ts = (TakesScreenshot) driver;

The interface TakesScreenshot provides a single method, getScreenshotAs(OutputType<X> target), to take screenshots. The parameter OutputType<X> target determines the screenshot type and the returned value. Table 4-2 shows the available alternatives for this parameter.

Table 4-2. OutputType parameters
| Parameter | Description | Return | Example |
|---|---|---|---|
| OutputType.FILE | Make a screenshot as a PNG file (located in a temporary system directory) | File | File screenshot = ts.getScreenshotAs(OutputType.FILE); |
| OutputType.BASE64 | Make a screenshot in Base64 format (i.e., encoded as an ASCII string) | String | String screenshot = ts.getScreenshotAs(OutputType.BASE64); |
| OutputType.BYTES | Make a screenshot as a raw byte array | byte[] | byte[] screenshot = ts.getScreenshotAs(OutputType.BYTES); |

Tip

The method getScreenshotAs() allows making screenshots of the browser viewport. In addition, Selenium WebDriver 4 allows creating full-page screenshots using different mechanisms (see “Full-page screenshot”).

Example 4-9 shows a test for taking a browser screenshot in PNG format. Example 4-10 shows another test for creating a screenshot as a Base64 string. The resulting screenshot is shown in Figure 4-3.

Example 4-9. Test making a screenshot as a PNG file
@Test
void testScreenshotPng() throws IOException {
    driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
    TakesScreenshot ts = (TakesScreenshot) driver;

    File screenshot = ts.getScreenshotAs(OutputType.FILE); // <1>
    log.debug("Screenshot created on {}", screenshot);

    Path destination = Paths.get("screenshot.png"); // <2>
    Files.move(screenshot.toPath(), destination, REPLACE_EXISTING); // <3>
    log.debug("Screenshot moved to {}", destination);

    assertThat(destination).exists(); // <4>
}
1

We take a screenshot of the browser screen as a PNG file.

2

This file is located in a temporary folder by default, so we move it to a new file called screenshot.png (in the root project folder).

3

We use standard Java to move the screenshot file to the new location.

4

We use assertions to verify that the target file exists.

Figure 4-3. Browser screenshot of the practice site index page
Example 4-10. Test making a screenshot as Base64
@Test
void testScreenshotBase64() {
    driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
    TakesScreenshot ts = (TakesScreenshot) driver;

    String screenshot = ts.getScreenshotAs(OutputType.BASE64); // <1>
    log.debug("Screenshot in base64 "
          + "(you can copy and paste it into a browser navigation bar to watch it)\n"
          + "data:image/png;base64,{}", screenshot); // <2>
    assertThat(screenshot).isNotEmpty(); // <3>
}
1

We take a screenshot of the browser screen in Base64 format.

2

We append the prefix data:image/png;base64, to the Base64 string and log it in the standard output. You can copy and paste this resulting string in a browser navigation bar to display the picture.

3

We assert that the screenshot string has content.

Note

Logging the screenshot in Base64 as presented in the previous example could be very useful for diagnosing failures when running tests in CI servers in which we do not have access to the file system (e.g., GitHub Actions).
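
When such a Base64 string is recovered from a log, it can be turned back into an image with the standard Java library. The following sketch shows one way to do it, under some assumptions: it uses only java.util.Base64 (no Selenium), the class Base64ScreenshotSketch and its methods are hypothetical names, and the sample bytes are just the eight-byte PNG signature standing in for a real screenshot.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64ScreenshotSketch {

    // Builds the data URL that a browser can render from its navigation bar.
    static String toDataUrl(String base64Png) {
        return "data:image/png;base64," + base64Png;
    }

    // Recovers the raw image bytes, e.g., from a string copied out of a CI log.
    static byte[] toBytes(String base64Png) {
        return Base64.getDecoder().decode(base64Png);
    }

    public static void main(String[] args) {
        // Stand-in for a real screenshot: just the eight-byte PNG signature.
        byte[] fakeImage = { (byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n' };
        String base64 = Base64.getEncoder().encodeToString(fakeImage);

        System.out.println(toDataUrl(base64));
        // The decoded bytes match the original ones.
        System.out.println(Arrays.equals(fakeImage, toBytes(base64)));
    }
}
```

The byte array returned by toBytes() could then be written to disk (for instance, with Files.write()) to obtain a regular PNG file.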

WebElement Screenshots

The WebElement interface extends the TakesScreenshot interface. This way, it is possible to make partial screenshots of the visible content of a given web element. (See Example 4-11.) Notice that this test is very similar to the previous one using PNG files, but in this case, we invoke the method getScreenshotAs() directly using a web element. Figure 4-4 shows the resulting screenshot.

Example 4-11. Test making a partial screenshot as a PNG file
@Test
void testWebElementScreenshot() throws IOException {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/web-form.html");

    WebElement form = driver.findElement(By.tagName("form"));
    File screenshot = form.getScreenshotAs(OutputType.FILE);
    Path destination = Paths.get("webelement-screenshot.png");
    Files.move(screenshot.toPath(), destination, REPLACE_EXISTING);

    assertThat(destination).exists();
}
Figure 4-4. Partial screenshot of the practice web form

Window Size and Position

The Selenium WebDriver API allows manipulating browser size and position very easily using the Window interface. This type is accessible from a driver object using the following statement. Table 4-3 shows the available methods in this interface, and Example 4-12 shows a basic test of this feature.

Window window = driver.manage().window();
Table 4-3. Window methods
| Method | Return | Description |
|---|---|---|
| getSize() | Dimension | Get the current window size. It returns the outer window dimension, not just the viewport (i.e., the visible area of a web page for end users). |
| setSize(Dimension targetSize) | void | Change the current window size (again, its outer dimension, and not the viewport). |
| getPosition() | Point | Get the current window position (relative to the upper left corner of the screen). |
| setPosition(Point targetPosition) | void | Change the current window position (again, relative to the screen’s upper left corner). |
| maximize() | void | Maximize the current window. |
| minimize() | void | Minimize the current window. |
| fullscreen() | void | Fullscreen the current window. |

Example 4-12. Test reading and changing the browser size and position
@Test
void testWindow() {
    driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
    Window window = driver.manage().window();

    Point initialPosition = window.getPosition(); // <1>
    Dimension initialSize = window.getSize(); // <2>
    log.debug("Initial window: position {} -- size {}", initialPosition,
            initialSize);

    window.maximize(); // <3>

    Point maximizedPosition = window.getPosition();
    Dimension maximizedSize = window.getSize();
    log.debug("Maximized window: position {} -- size {}", maximizedPosition,
            maximizedSize);

    assertThat(initialPosition).isNotEqualTo(maximizedPosition); // <4>
    assertThat(initialSize).isNotEqualTo(maximizedSize);
}
1

We read the window position.

2

We read the window size.

3

We maximize the browser window.

4

We verify that the maximized position (and size, in the following line) differs from that of the original window.

Browser History

Selenium WebDriver allows manipulating the browser history through the Navigation interface. The following statement illustrates how to access this interface from a WebDriver object. Using this interface is quite simple. Table 4-4 shows its public methods, and Example 4-13 shows a basic example. Notice that this test navigates into different web pages using these methods, and at the end of the test, it verifies the web page URL is as expected.

Navigation navigation = driver.navigate();

The Shadow DOM

As introduced in “The Document Object Model (DOM)”, the DOM is a programming interface that allows us to represent and manipulate a web page using a tree structure. The shadow DOM is a feature of this programming interface that enables the creation of scoped subtrees inside the regular DOM tree. The shadow DOM allows the encapsulation of a DOM subtree (called the shadow tree, as represented in Figure 4-5) that can specify different CSS styles from the original DOM. The node in the regular DOM to which the shadow tree is attached is called the shadow host. The root node of the shadow tree is called the shadow root. As represented in Figure 4-5, the shadow tree is flattened into the original DOM in a single composed tree to be rendered in the browser.

Figure 4-5. Schematic representation of the shadow DOM
Note

The shadow DOM is part of the standard suite (together with HTML templates or custom elements) that allows the implementation of web components (i.e., reusable custom elements for web applications).

The shadow DOM allows the creation of self-contained components. In other words, the shadow tree is isolated from the original DOM. This feature is useful for web design and composition, but it can be challenging for automated testing with Selenium WebDriver (since the regular location strategies cannot find web elements within the shadow tree). Luckily, Selenium WebDriver 4 provides a WebElement method that allows access to the shadow DOM. Example 4-14 demonstrates this use.

Example 4-14. Test reading the shadow DOM
@Test
void testShadowDom() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/shadow-dom.html"); // <1>

    WebElement content = driver.findElement(By.id("content")); // <2>
    SearchContext shadowRoot = content.getShadowRoot(); // <3>
    WebElement textElement = shadowRoot.findElement(By.cssSelector("p")); // <4>
    assertThat(textElement.getText()).contains("Hello Shadow DOM"); // <5>
}
1

We open the practice web page that contains a shadow tree. You can inspect the source code of this page to check the JavaScript method used to create a shadow tree.

2

We locate the shadow host element.

3

We get the shadow root from the host element. As a result, we get an instance of SearchContext, an interface implemented by WebDriver and WebElement, that allows us to find elements using the methods findElement() and findElements().

4

We find the first paragraph element in the shadow tree.

5

We verify the text content of the shadow element is as expected.

Warning

This feature of the W3C WebDriver specification is recent at the time of this writing, and therefore might not be implemented in all drivers (e.g., chromedriver, geckodriver). For instance, it is available starting with version 96 of both Chrome and Edge.

Cookies

HTTP 1.x is a stateless protocol, meaning that the server does not track the user state. In other words, web servers do not remember users across different requests. The cookies mechanism is an extension to HTTP that allows tracking users by sending small pieces of text called cookies from server to client. These cookies must be sent back by clients, and this way, servers remember their clients. Cookies allow you to maintain web sessions or personalize the user experience on the website, among other functions.

Web browsers allow managing the browser cookies manually. Selenium WebDriver enables an equivalent manipulation, but programmatically. The Selenium WebDriver API provides the methods shown in Table 4-5 to accomplish this. They are accessible through the manage() method of a WebDriver object.

Table 4-5. Cookies management methods
| Method | Return | Description |
|---|---|---|
| addCookie(Cookie cookie) | void | Add a new cookie |
| deleteCookieNamed(String name) | void | Delete an existing cookie by name |
| deleteCookie(Cookie cookie) | void | Delete an existing cookie by instance |
| deleteAllCookies() | void | Delete all cookies |
| getCookies() | Set<Cookie> | Get all cookies |
| getCookieNamed(String name) | Cookie | Get a cookie by name |

As this table shows, the Cookie class provides an abstraction to a single cookie in Java. Table 4-6 summarizes the methods available in this class. In addition, this class has several constructors, which positionally accept the following parameters:

String name

Cookie name (mandatory)

String value

Cookie value (mandatory)

String domain

Domain in which the cookie is visible (optional)

String path

Path in which the cookie is visible (optional)

Date expiry

Cookie expiration date (optional)

boolean isSecure

Whether the cookie requires a secure connection (optional)

boolean isHttpOnly

Whether this cookie is an HTTP-only cookie, i.e., the cookie is not accessible through a client-side script (optional)

String sameSite

Whether this cookie is a same-site cookie, i.e., the cookie is restricted to a first-party or same-site context (optional)

The following examples show different tests managing web cookies with the Selenium WebDriver API. These examples use a practice web page that shows the site cookies on the GUI (see Figure 4-6):

Figure 4-6. Practice web page for web cookies
Example 4-15. Test reading existing cookies
@Test
void testReadCookies() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/cookies.html");

    Options options = driver.manage(); 1
    Set<Cookie> cookies = options.getCookies(); 2
    assertThat(cookies).hasSize(2);

    Cookie username = options.getCookieNamed("username"); 3
    assertThat(username.getValue()).isEqualTo("John Doe"); 4
    assertThat(username.getPath()).isEqualTo("/");

    driver.findElement(By.id("refresh-cookies")).click(); 5
}
1

We get the Options object used to manage cookies.

2

We read all the cookies available on this page. It should contain two cookies.

3

We read the cookie with the name username.

4

The value of the previous cookie should be John Doe.

5

The last statement does not affect the test. We invoke this command to check the cookies in the browser GUI.

Example 4-16. Test adding new cookies
@Test
void testAddCookies() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/cookies.html");

    Options options = driver.manage();
    Cookie newCookie = new Cookie("new-cookie-key", "new-cookie-value"); 1
    options.addCookie(newCookie); 2
    String readValue = options.getCookieNamed(newCookie.getName())
            .getValue(); 3
    assertThat(newCookie.getValue()).isEqualTo(readValue); 4

    driver.findElement(By.id("refresh-cookies")).click();
}
1

We create a new cookie.

2

We add the cookie to the current page.

3

We read the value of the cookie just added.

4

We verify this value is as expected.

Example 4-17. Test editing existing cookies
@Test
void testEditCookie() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/cookies.html");

    Options options = driver.manage();
    Cookie username = options.getCookieNamed("username"); 1
    Cookie editedCookie = new Cookie(username.getName(), "new-value"); 2
    options.addCookie(editedCookie); 3

    Cookie readCookie = options.getCookieNamed(username.getName()); 4
    assertThat(editedCookie).isEqualTo(readCookie); 5

    driver.findElement(By.id("refresh-cookies")).click();
}
1

We read an existing cookie.

2

We create a new cookie reusing the previous cookie name.

3

We add the new cookie to the web page.

4

We read the cookie just added.

5

We verify the cookie has been correctly edited.

Example 4-18. Test deleting existing cookies
@Test
void testDeleteCookies() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/cookies.html");

    Options options = driver.manage();
    Set<Cookie> cookies = options.getCookies(); 1
    Cookie username = options.getCookieNamed("username"); 2
    options.deleteCookie(username); 3

    assertThat(options.getCookies()).hasSize(cookies.size() - 1); 4

    driver.findElement(By.id("refresh-cookies")).click();
}
1

We read all cookies.

2

We read the cookie with the name username.

3

We delete the previous cookie.

4

We verify that the number of cookies is one less than before the deletion.

Dropdown Lists

Dropdown lists are a typical element in web forms. These fields allow users to select one or more elements within an option list. The classical HTML tags used to render these fields are <select> and <option>. As usual, the practice web form contains one of these elements (see Figure 4-7), defined in HTML as follows:

<select class="form-select" name="my-select">
  <option selected>Open this select menu</option>
  <option value="1">One</option>
  <option value="2">Two</option>
  <option value="3">Three</option>
</select>
Figure 4-7. Select field in the practice web form

These elements are widespread in web forms. For this reason, Selenium WebDriver provides a helper class called Select to simplify their manipulation. This class wraps a select WebElement and provides a wide variety of features. Table 4-7 summarizes the public methods available in the Select class. After that, Example 4-19 shows a basic test using this class.

Table 4-7. Select methods
Method Return Description
Select(WebElement element)
Select

Constructor using a WebElement as parameter (it must be a <select> element); otherwise it throws an UnexpectedTagNameException

getWrappedElement()
WebElement

Get wrapped WebElement (i.e., the one used in the constructor)

isMultiple()
boolean

Whether the select element supports selecting multiple options

getOptions()
List<WebElement>

Read all options that belong to the select element

getAllSelectedOptions()
List<WebElement>

Read all selected options

getFirstSelectedOption()
WebElement

Read first selected option

selectByVisibleText(String text)
void

Select all options that match a given displayed text

selectByIndex(int index)
void

Select an option by index number

selectByValue(String value)
void

Select option(s) by value attribute

deselectAll()
void

Deselect all options

deselectByValue(String value)
void

Deselect option(s) by value attribute

deselectByIndex(int index)
void

Deselect by index number

deselectByVisibleText(String text)
void

Deselect options that match a given displayed text

Example 4-19. Test interacting with a select field
@Test
void test() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/web-form.html");

    Select select = new Select(driver.findElement(By.name("my-select"))); 1
    String optionLabel = "Three";
    select.selectByVisibleText(optionLabel); 2

    assertThat(select.getFirstSelectedOption().getText())
            .isEqualTo(optionLabel); 3
}
1

We find the select element by name and use the resulting WebElement to instantiate a Select object.

2

We select one of the options available in this select, using a by-text strategy.

3

We verify the selected option text is as expected.
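To make the three selection strategies of Table 4-7 (by index, by value, and by visible text) more concrete, the following sketch models the options of the practice select field in plain Java. It does not use Selenium: the class SelectModelSketch and its Option type are hypothetical names introduced only for this illustration. Note that an option without a value attribute uses its text as its value, which is why the first option repeats its label.

```java
import java.util.List;

class SelectModelSketch {

    // Hypothetical model of an <option>: its value attribute and visible text.
    static final class Option {
        final String value;
        final String text;

        Option(String value, String text) {
            this.value = value;
            this.text = text;
        }
    }

    // Mirrors the options of the practice form shown earlier.
    static final List<Option> OPTIONS = List.of(
            new Option("Open this select menu", "Open this select menu"),
            new Option("1", "One"),
            new Option("2", "Two"),
            new Option("3", "Three"));

    // Lookup by zero-based position in the list.
    static Option byIndex(int index) {
        return OPTIONS.get(index);
    }

    // Lookup by the value attribute.
    static Option byValue(String value) {
        return OPTIONS.stream().filter(o -> o.value.equals(value))
                .findFirst().orElseThrow();
    }

    // Lookup by the text displayed to the user.
    static Option byVisibleText(String text) {
        return OPTIONS.stream().filter(o -> o.text.equals(text))
                .findFirst().orElseThrow();
    }

    public static void main(String[] args) {
        // All three strategies reach the same option.
        System.out.println(byIndex(3).text);
        System.out.println(byValue("3").text);
        System.out.println(byVisibleText("Three").value);
    }
}
```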

Data List Elements

Another way to implement dropdown lists in HTML is using data lists. Although data lists are very similar to select elements from a graphical point of view, there is a clear distinction between them. On the one hand, select fields display an options list, and users choose one (or several) of the available options. On the other hand, data lists show a list of suggested options associated with an input form (text) field, and users are free to select one of those suggested values or type a custom value. The practice web form contains one of these data lists. You can find its markup in the following snippet and a screenshot in Figure 4-8.

<input class="form-control" list="my-options" name="my-datalist"
        placeholder="Type to search...">
<datalist id="my-options">
  <option value="San Francisco">
  <option value="New York">
  <option value="Seattle">
  <option value="Los Angeles">
  <option value="Chicago">
</datalist>
Figure 4-8. Data list field in the practice web form

Selenium WebDriver does not provide a custom helper class to manipulate data lists. Instead, we need to interact with them as standard input texts, with the distinction that their options are displayed when clicking on the input field. Example 4-20 shows a test illustrating this.

Example 4-20. Test interacting with a data list field
@Test
void testDatalist() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/web-form.html");

    WebElement datalist = driver.findElement(By.name("my-datalist")); 1
    datalist.click(); 2

    WebElement option = driver
            .findElement(By.xpath("//datalist/option[2]")); 3
    String optionValue = option.getAttribute("value"); 4
    datalist.sendKeys(optionValue); 5

    assertThat(optionValue).isEqualTo("New York"); 6
}
1

We locate the input field used for the data list.

2

We click on it to display its options.

3

We find the second option.

4

We read the value of the located option.

5

We type that value in the input field.

6

We assert the option value is as expected.

Navigation Targets

When navigating web pages using a browser, by default, we use a single page corresponding to the URL in the navigation bar. Then, we can open another page in a new browser tab. This second tab can be explicitly opened when a link defines the attribute target, or the user can force navigation to a new tab, typically by using the modifier key Ctrl (or Cmd in macOS) together with a mouse click on a web link. Another possibility is opening web pages in new windows. For this, web pages typically use the JavaScript command window.open(url). Another way of displaying different pages at the same time is using frames and iframes. A frame is an HTML element type that defines a particular area (within a set called a frameset) where a web page can be displayed. An iframe is another HTML element that allows embedding an HTML page into the current one.

Warning

Using frames is not encouraged since these elements have many drawbacks, such as performance and accessibility problems. I explain how to use them through Selenium WebDriver for compatibility reasons. Nevertheless, I strongly recommend avoiding frames on brand-new web applications.

The Selenium WebDriver API provides the interface TargetLocator to deal with the previously mentioned targets (i.e., tabs, windows, frames, and iframes). This interface allows changing the focus of the future commands of a WebDriver object (to a new tab, window, etc.). This interface is accessible by invoking the method switchTo() in a WebDriver object. Table 4-8 describes its public methods.

Table 4-8. TargetLocator methods
Method Return Description
frame(int index)
WebDriver

Change focus to a frame (or iframe) by index number.

frame(String
    nameOrId)
WebDriver

Change focus to a frame (or iframe) by name or id.

frame(WebElement
    frameElement)
WebDriver

Change focus to a frame (or iframe) previously located as a WebElement.

parentFrame()
WebDriver

Change focus to the parent context.

window(String
    nameOrHandle)
WebDriver

Switch the focus to another window, by name or handle. A window handle is a string that uniquely identifies a window or tab.

newWindow(WindowType
    typeHint)
WebDriver

Creates a new browser window (using WindowType.WINDOW) or tab (WindowType.TAB) and switches the focus to it.

defaultContent()
WebDriver

Select the main document (when using iframes) or the first frame on the page (when using a frameset).

activeElement()
WebElement

Get the element currently selected.

alert()
Alert

Change focus to a window alert (see “Dialog Boxes” for further details).

Tabs and Windows

Example 4-21 shows a test where we open a new tab for navigating a second web page. Example 4-22 shows an equivalent case but for opening a new window for the second web page. Notice that the only difference between these examples is the parameter passed to newWindow(): WindowType.TAB versus WindowType.WINDOW.

Example 4-21. Test opening a new tab
@Test
void testNewTab() {
    driver.get("https://bonigarcia.dev/selenium-webdriver-java/"); 1
    String initHandle = driver.getWindowHandle(); 2

    driver.switchTo().newWindow(WindowType.TAB); 3
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/web-form.html"); 4
    assertThat(driver.getWindowHandles().size()).isEqualTo(2); 5

    driver.switchTo().window(initHandle); 6
    driver.close(); 7
    assertThat(driver.getWindowHandles().size()).isEqualTo(1); 8
}
1

We navigate to a web page.

2

We get the current window handle.

3

We open a new tab and change the focus to it.

4

We open another web page (since the focus is in the second tab, the page is opened in the second tab).

5

We verify that the number of window handles at this point is 2.

6

We change the focus to the initial window (using its handle).

7

We close only the current window. The second tab remains open.

8

We verify that the number of window handles now is 1.

Example 4-22. Test opening a new window
@Test
void testNewWindow() {
    driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
    String initHandle = driver.getWindowHandle();

    driver.switchTo().newWindow(WindowType.WINDOW); 1
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/web-form.html");
    assertThat(driver.getWindowHandles().size()).isEqualTo(2);

    driver.switchTo().window(initHandle);
    driver.close();
    assertThat(driver.getWindowHandles().size()).isEqualTo(1);
}
1

This is the only line that differs from the previous example. In this case, we open a new window (instead of a tab) and focus on it.

Frames and Iframes

Example 4-23 shows a test in which the web page under test contains an iframe. Example 4-24 shows the equivalent case but using a frameset.

Example 4-23. Test handling iframes
@Test
void testIFrames() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/iframes.html"); 1

    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    wait.until(ExpectedConditions
            .frameToBeAvailableAndSwitchToIt("my-iframe")); 2

    By pName = By.tagName("p");
    wait.until(ExpectedConditions.numberOfElementsToBeMoreThan(pName, 0)); 3
    List<WebElement> paragraphs = driver.findElements(pName);
    assertThat(paragraphs).hasSize(20); 4
}
1

We open a web page that contains an iframe (see Figure 4-9).

2

We use an explicit wait to wait for the frame and switch to it.

3

We use another explicit wait to pause until the paragraphs contained in the iframe are available.

4

We assert the number of paragraphs is as expected.

Figure 4-9. Practice web page using an iframe
Example 4-24. Test handling frames
@Test
void testFrames() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/frames.html"); 1

    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
    String frameName = "frame-body";
    wait.until(ExpectedConditions
            .presenceOfElementLocated(By.name(frameName))); 2
    driver.switchTo().frame(frameName); 3

    By pName = By.tagName("p");
    wait.until(ExpectedConditions.numberOfElementsToBeMoreThan(pName, 0));
    List<WebElement> paragraphs = driver.findElements(pName);
    assertThat(paragraphs).hasSize(20);
}
1

We open a web page that contains a frameset (see Figure 4-10).

2

We wait for the frame to be available. Note that steps 2 and 3 in this example, taken together, are equivalent to step 2 in Example 4-23.

3

We change the focus to this frame.

Figure 4-10. Practice web page using frames

Dialog Boxes

JavaScript provides different dialog boxes (sometimes called pop-ups) to interact with the user, namely:

Alert

To show a message and wait for the user to press the button OK (only choice in the dialog). For instance, the following code will open a dialog that displays “Hello world!” and waits for the user to press the OK button.

alert("Hello world!");
Confirm

To show a dialog box with a question and two buttons: OK and Cancel. For instance, the following code will open a dialog showing the message “Is this correct?” and prompting the user to click on OK or Cancel.

let correct = confirm("Is this correct?");
Prompt

To show a dialog box with a text message, an input text field, and the buttons OK and Cancel. For example, the following code shows a pop-up displaying “Please enter your name,” an input field in which the user can type, and two buttons (OK and Cancel).

let username = prompt("Please enter your name");

In addition, CSS allows implementing another type of dialog box called modal window. This dialog disables the main window (but keeps it visible) while overlaying a child pop-up, typically showing a message and some buttons. You can find a sample page on the practice web page containing all these dialog boxes (alert, confirm, prompt, and modal). Figure 4-11 shows a screenshot of this page when the modal dialog is active.

Figure 4-11. Practice web page with dialog boxes (alert, confirm, prompt, and modal)

Alerts, Confirms, and Prompts

The Selenium WebDriver API provides the interface Alert to manipulate JavaScript dialogs (i.e., alerts, confirms, and prompts). Table 4-9 describes the methods provided by this interface. Then, Example 4-25 shows a basic test interacting with an alert.

Table 4-9. Alert methods
Method Return Description
accept()
void

To click OK

getText()
String

To read the dialog message

dismiss()
void

To click Cancel (not available in alerts)

sendKeys(String text)
void

To type some string in the input text (only available in prompts)

Example 4-25. Test handling an alert dialog
@Test
void testAlert() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/dialog-boxes.html"); 1
    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(5));

    driver.findElement(By.id("my-alert")).click(); 2
    wait.until(ExpectedConditions.alertIsPresent()); 3
    Alert alert = driver.switchTo().alert(); 4
    assertThat(alert.getText()).isEqualTo("Hello world!"); 5
    alert.accept(); 6
}
1

We open the practice web page that launches dialog boxes.

2

We click on the left button to launch a JavaScript alert.

3

We wait until the alert dialog is displayed on the screen.

4

We change the focus to the alert pop-up.

5

We verify that the alert text is as expected.

6

We click on the OK button of the alert dialog.

We can replace steps 3 and 4 with a single explicit wait statement, as follows (you can find it in a second test in the same class in the examples repository):

Alert alert = wait.until(ExpectedConditions.alertIsPresent());

The next test (Example 4-26) illustrates how to deal with a confirm dialog. Notice this example is quite similar to the previous one, but in this case, we can invoke the method dismiss() to click on the Cancel button available on the confirm dialog. Finally, Example 4-27 shows how to manage a prompt dialog. In this case, we can type a string into the input text.

Example 4-26. Test handling a confirm dialog
@Test
void testConfirm() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/dialog-boxes.html");
    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(5));

    driver.findElement(By.id("my-confirm")).click();
    wait.until(ExpectedConditions.alertIsPresent());
    Alert confirm = driver.switchTo().alert();
    assertThat(confirm.getText()).isEqualTo("Is this correct?");
    confirm.dismiss();
}
Example 4-27. Test handling a prompt dialog
@Test
void testPrompt() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/dialog-boxes.html");
    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(5));

    driver.findElement(By.id("my-prompt")).click();
    wait.until(ExpectedConditions.alertIsPresent());
    Alert prompt = driver.switchTo().alert();
    prompt.sendKeys("John Doe");
    assertThat(prompt.getText()).isEqualTo("Please enter your name");
    prompt.accept();
}

Modal Windows

Modal windows are dialog boxes built with basic CSS and HTML. For this reason, Selenium WebDriver does not provide any specific utility for manipulating them. Instead, we use the standard WebDriver API (locators, waits, etc.) to interact with modal windows. Example 4-28 shows a basic test using the practice web page that contains dialog boxes.

Web Storage

The Web Storage API allows web applications to store data locally in the client file system. This API provides two JavaScript objects:

window.localStorage

To store data permanently

window.sessionStorage

To store data during the session time (data is deleted when the browser tab is closed)

Selenium WebDriver provides the interface WebStorage for manipulating the Web Storage API. Most of the WebDriver types supported by Selenium WebDriver implement this interface: ChromeDriver, EdgeDriver, FirefoxDriver, OperaDriver, and SafariDriver. This way, we can use this feature in these browsers. Example 4-29 demonstrates this use in Chrome. This test uses both types of web storage (local and session).

Example 4-29. Test using web storage
@Test
void testWebStorage() {
    driver.get(
            "https://bonigarcia.dev/selenium-webdriver-java/web-storage.html");
    WebStorage webStorage = (WebStorage) driver; 1

    LocalStorage localStorage = webStorage.getLocalStorage();
    log.debug("Local storage elements: {}", localStorage.size()); 2

    SessionStorage sessionStorage = webStorage.getSessionStorage();
    sessionStorage.keySet()
            .forEach(key -> log.debug("Session storage: {}={}", key,
                    sessionStorage.getItem(key))); 3
    assertThat(sessionStorage.size()).isEqualTo(2);

    sessionStorage.setItem("new element", "new value");
    assertThat(sessionStorage.size()).isEqualTo(3); 4

    driver.findElement(By.id("display-session")).click();
}
1

We cast the driver object to WebStorage.

2

We log the number of elements of local storage.

3

We log the session storage (it should contain two elements).

4

After adding a new element, there should be three elements in the session storage.

Event Listeners

The Selenium WebDriver API allows creating listeners that are notified of events happening in WebDriver and derived objects. In former versions of Selenium WebDriver, this feature was accessible through the class EventFiringWebDriver. This class is deprecated as of Selenium WebDriver 4, and instead, we should use the following:

EventFiringDecorator

Wrapper class for WebDriver and derived objects (e.g., WebElement, TargetLocator, etc.). It allows registering one or more listeners (i.e., WebDriverListener instances).

WebDriverListener

Interface that the listeners registered in the decorator should implement. It supports three types of events:

Before events

Logic inserted just before some event starts

After events

Logic inserted just after some event terminates

Error events

Logic inserted before an exception is thrown

To implement an event listener, first, we should create a listener class. In other words, we need to create a class that implements the WebDriverListener interface. This interface defines all its methods using the default keyword, and therefore, overriding its methods is optional. Thanks to that feature (available as of Java 8), our class needs to implement only the methods we need. There are plenty of listener methods available, for instance, afterGet() (executed after calling the method get() in a WebDriver instance), or beforeQuit() (executed before calling the quit() method in a WebDriver instance), to name a few. My recommendation for checking all these listeners is to use your favorite IDE to discover the possible methods to be overridden/implemented. Figure 4-12 shows the wizard for doing this in Eclipse.

Figure 4-12. WebDriverListener methods in Eclipse
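The role of default methods in a listener interface can be illustrated with a small, Selenium-free sketch. The interface PageListener and the class RecordingListener below are hypothetical names invented for this example. They only mimic the idea behind WebDriverListener: every callback has an empty default body, so an implementing class overrides just the ones it cares about.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener interface: every callback has an empty default body.
interface PageListener {
    default void beforeOpen(String url) { }
    default void afterOpen(String url) { }
    default void beforeClose() { }
}

// Overrides only two callbacks; beforeOpen() keeps its default (no-op) body.
class RecordingListener implements PageListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void afterOpen(String url) {
        events.add("opened " + url);
    }

    @Override
    public void beforeClose() {
        events.add("closing");
    }
}

class DefaultMethodsSketch {
    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        listener.beforeOpen("https://example.com/"); // default no-op
        listener.afterOpen("https://example.com/");
        listener.beforeClose();
        System.out.println(listener.events);
    }
}
```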

Once we have implemented our listener, we need to create the decorator class. There are two ways to do that. If we want to decorate a WebDriver object, we can create an instance of EventFiringDecorator (passing the listener as the argument to the constructor) and then invoke the method decorate() to pass the WebDriver object. For instance:

WebDriver decoratedDriver = new EventFiringDecorator(myListener)
        .decorate(originalDriver);

The second way is to decorate other objects of the Selenium WebDriver API, namely WebElement, TargetLocator, Navigation, Options, Timeouts, Window, Alert, or VirtualAuthenticator. In this case, we need to invoke the method createDecorated() in an EventFiringDecorator object to get a Decorated<T> generic class. The following snippet shows an example using a WebElement as a parameter:

Decorated<WebElement> decoratedWebElement = new EventFiringDecorator(
        listener).createDecorated(myWebElement);
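To give an intuition of what a decorator of this kind does, the following sketch wraps an object in a JDK dynamic proxy that fires before, after, and error notifications around each call. This is not how Selenium implements EventFiringDecorator. The types Browser, CallListener, and DecoratorSketch are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the object being decorated.
interface Browser {
    void open(String url);
}

// Hypothetical listener with before, after, and error callbacks.
interface CallListener {
    default void before(String method) { }
    default void after(String method) { }
    default void onError(String method, Throwable error) { }
}

class DecoratorSketch {

    // Wraps the target in a dynamic proxy that notifies the listener around
    // every call and rethrows the original exception on failure.
    static Browser decorate(Browser target, CallListener listener) {
        return (Browser) Proxy.newProxyInstance(
                Browser.class.getClassLoader(),
                new Class<?>[] { Browser.class },
                (proxy, method, args) -> {
                    listener.before(method.getName());
                    try {
                        Object result = method.invoke(target, args);
                        listener.after(method.getName());
                        return result;
                    } catch (InvocationTargetException e) {
                        // Error event: notified before the exception propagates.
                        listener.onError(method.getName(), e.getCause());
                        throw e.getCause();
                    }
                });
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        CallListener listener = new CallListener() {
            @Override
            public void before(String method) {
                log.add("before " + method);
            }

            @Override
            public void after(String method) {
                log.add("after " + method);
            }
        };
        Browser browser = decorate(url -> log.add("open " + url), listener);
        browser.open("https://example.com/");
        System.out.println(log);
    }
}
```

The calling code keeps using the decorated object through its usual interface, which is the same design choice that lets a decorated WebDriver be used like any other WebDriver object.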

Let’s look at a complete example. First, Example 4-30 shows the class that implements the WebDriverListener interface. Notice this class implements two methods: afterGet() and beforeQuit(). Both methods call takeScreenshot() to take a browser screenshot. All in all, we are collecting browser screenshots just after loading a web page (typically at the beginning of the test) and before quitting (typically at the end of the test). Then, Example 4-31 shows the test that uses this listener.

Example 4-30. Event listener implementing methods afterGet() and beforeQuit()
public class MyEventListener implements WebDriverListener {

    static final Logger log = getLogger(lookup().lookupClass());

    @Override
    public void afterGet(WebDriver driver, String url) { 1
        WebDriverListener.super.afterGet(driver, url);
        takeScreenshot(driver);
    }

    @Override
    public void beforeQuit(WebDriver driver) { 2
        takeScreenshot(driver);
    }

    private void takeScreenshot(WebDriver driver) {
        TakesScreenshot ts = (TakesScreenshot) driver;
        File screenshot = ts.getScreenshotAs(OutputType.FILE);
        SessionId sessionId = ((RemoteWebDriver) driver).getSessionId();
        Date today = new Date();
        SimpleDateFormat dateFormat = new SimpleDateFormat(
                "yyyy.MM.dd_HH.mm.ss.SSS");
        String screenshotFileName = String.format("%s-%s.png",
                dateFormat.format(today), sessionId.toString());
        Path destination = Paths.get(screenshotFileName); 3

        try {
            Files.move(screenshot.toPath(), destination);
        } catch (IOException e) {
            log.error("Exception moving screenshot from {} to {}", screenshot,
                    destination, e);
        }
    }

}
1

We override this method to execute custom logic after loading web pages with the WebDriver object.

2

We override this method to execute custom logic before quitting the WebDriver object.

3

We use a unique name for the PNG screenshots. For that, we get the system date (date and time) plus the session identifier.
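The naming scheme can be tried in isolation. The following sketch reproduces the same date pattern with the java.time API (a substitution for the SimpleDateFormat used in the listener, chosen here only because it makes the timestamp easy to pass as a parameter). The class name ScreenshotNameSketch and the sample session identifier are assumptions for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class ScreenshotNameSketch {

    // Same pattern as in the listener: date, time, and milliseconds.
    static final DateTimeFormatter FORMAT = DateTimeFormatter
            .ofPattern("yyyy.MM.dd_HH.mm.ss.SSS");

    // Combines a timestamp with a session identifier into a PNG file name.
    static String fileName(LocalDateTime timestamp, String sessionId) {
        return String.format("%s-%s.png", FORMAT.format(timestamp), sessionId);
    }

    public static void main(String[] args) {
        // Hypothetical session identifier; a real one comes from the driver.
        System.out.println(fileName(LocalDateTime.now(), "0123abcd"));
    }
}
```

Because the timestamp includes milliseconds, the two screenshots taken within the same session (after loading and before quitting) get different file names.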

Example 4-31. Test using EventFiringDecorator and the previous listener
class EventListenerJupiterTest {

    WebDriver driver;

    @BeforeEach
    void setup() {
        MyEventListener listener = new MyEventListener();
        WebDriver originalDriver = WebDriverManager.chromedriver().create();
        driver = new EventFiringDecorator(listener).decorate(originalDriver); 1
    }

    @AfterEach
    void teardown() {
        driver.quit();
    }

    @Test
    void testEventListener() {
        driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
        assertThat(driver.getTitle())
                .isEqualTo("Hands-On Selenium WebDriver with Java");
        driver.findElement(By.linkText("Web form")).click(); 2
    }

}
1

We create a decorated WebDriver object using an instance of MyEventListener. We use the resulting driver to control the browser in the @Test logic.

2

We click on a web link to change the page. The resulting two screenshots taken in the listener should be different.

WebDriver Exceptions

All the exceptions provided by the WebDriver API inherit from the class WebDriverException and are unchecked (see the following sidebar if you are unfamiliar with this terminology). Figure 4-13 shows these exceptions in Selenium WebDriver 4. As this image shows, there are many different exception types. Table 4-10 summarizes some of the most common causes.

Figure 4-13. Selenium WebDriver exceptions
Table 4-10. Usual WebDriver exceptions and common causes
Exception Description Common causes
NoSuchElementException

Web element not available

  • Invalid locator strategy

  • The element has not been rendered (maybe you need to wait for it)

NoAlertPresentException

Dialog (alert, prompt, or confirm) not available

Trying to perform an action (e.g., accept() or dismiss()) on an unavailable dialog

NoSuchWindowException

Window or tab not available

Trying to switch to an unavailable window or tab

NoSuchFrameException

Frame or iframe not available

Trying to switch to an unavailable frame or iframe

InvalidArgumentException

Incorrect argument when calling some method of the Selenium WebDriver API

  • Bad URL in navigation methods

  • Nonexistent path when uploading files

  • Bad argument type in a JavaScript script

StaleElementReferenceException

The element is stale, i.e., it no longer appears on the page

The DOM gets updated when trying to interact with a previously located element

UnreachableBrowserException

Problem communicating with the browser

  • The connection with the remote browser could not be established

  • The browser died in the middle of a WebDriver session

TimeoutException

Page loading timeout

Some web page takes longer than expected to load

ScriptTimeoutException

Script execution timeout

Some script takes longer than expected to execute

ElementNotVisibleException
ElementNotSelectableException
ElementClickInterceptedException

The element is on the DOM but is not visible/selectable/clickable

  • Insufficient (or nonexistent) wait until the element is displayed/selectable/clickable

  • The page layout (perhaps caused by a viewport change) causes another element to overlay the element we try to interact with
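As stated at the beginning of this section, WebDriver exceptions are unchecked. The following sketch shows what this means in practice with a hypothetical hierarchy: DriverException and MissingElementException are invented names that only mirror the idea of a common unchecked base class, and they are not part of the Selenium API. A method can throw these exceptions without declaring them, and callers can still catch either the specific type or the base type.

```java
// Hypothetical hierarchy mirroring the idea of a common unchecked base class.
class DriverException extends RuntimeException {
    DriverException(String message) {
        super(message);
    }
}

class MissingElementException extends DriverException {
    MissingElementException(String message) {
        super(message);
    }
}

class UncheckedSketch {

    // No "throws" clause is needed because the exception is unchecked.
    static String find(String id) {
        if (id.isEmpty()) {
            throw new MissingElementException("No element with an empty id");
        }
        return "element#" + id;
    }

    // Catching the base type also handles every subclass.
    static String findOrDefault(String id, String fallback) {
        try {
            return find(id);
        } catch (DriverException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(findOrDefault("submit", "none"));
        System.out.println(findOrDefault("", "none"));
    }
}
```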

Summary and Outlook

This chapter provided a comprehensive review of those WebDriver API features interoperable in different web browsers. Among them, you discovered how to execute JavaScript with Selenium WebDriver, with synchronous, pinned (i.e., attached to a WebDriver session), and asynchronous scripts. Then, you learned about timeouts, used to specify a time limit interval for page loading and script execution. Also, you saw how to manage several browser aspects, such as size and position, navigation history, the shadow DOM, and cookies. Next, you discovered how to interact with specific web elements, such as dropdown lists (select and data lists), navigation targets (windows, tabs, frames, and iframes), and dialog boxes (alerts, prompts, confirms, and modals). Finally, we reviewed the mechanism for implementing web storage and event listeners in Selenium WebDriver 4 and the most relevant WebDriver exceptions (and their common causes).

The next chapter continues to expose the features of the Selenium WebDriver API. The chapter explains those aspects specific to a given browser (e.g., Chrome, Firefox, etc.), including browser capabilities (e.g., ChromeOptions, FirefoxOptions, etc.), the Chrome DevTools Protocol (CDP), network interception, mocking geolocation coordinates, the WebDriver BiDirectional (BiDi) protocol, authentication mechanisms, or printing web pages to PDF, among other features.
