1.1 Introduction

1.1.1 Why Read This Book?

This may sound a little odd…

While WebdriverIO is in this book’s title, this isn’t a book about WebdriverIO.

Yes, it does cover the popular test automation framework in-depth. More importantly though, this book is about teaching you how to effectively use UI Test Automation to validate Web App functionality.

The lessons here focus less on how specific WebdriverIO commands work and more on how specific approaches to testing are beneficial or harmful.

And yet, while any test framework would work for this, I did choose WebdriverIO for a specific reason: It’s a mature framework that allows us to spend less time on code and more time on important testing concepts.

Why do I start off the book saying all of this?

Well, over the past decade, website complexity has grown substantially, requiring much more effort when it comes to testing. While test automation is certainly not a recent development, it has grown in popularity lately due to the increased complexity of the sites we build.

In fact, many organizations have full teams dedicated just to testing their site. Quality Assurance (QA) is an important role for any company that wants to take its technology seriously.

However, having humans manually run through test scripts is a time-consuming task.

By automating our tests, we hope to shift the workload from manual labor to speedy CPUs. Without humans and their need to sleep and eat, we can theoretically test our sites on-demand, around the clock.

Wouldn’t it be ideal to have a test suite so effective that your QA team focused solely on keeping it up-to-date?

Unfortunately, that’s not the reality.

I’ve seen, heard, and have been part of many teams that set out after this ideal, only to realize months later that all the effort has provided them little benefit. Yes, they have test automation in place, but it’s constantly breaking and causing endless headaches. It seems they either have tests that run well but don’t validate much or have tests that check for everything but report false errors.

In 2017, I spent the year recording screencasts covering WebdriverIO in-depth. While I covered the details of the framework quite well (or so I was told), I was left with nagging questions of “is knowing the tool valuable enough?”

Can you paint beautiful art while only knowing how to mix colors and hold a paintbrush?

Is understanding WebdriverIO’s ‘addValue’ command enough to write tests that are effective?

I don’t think so.

This time around, I’m focusing on that second part. Yes, it’s important to cover the details of commands and code. More importantly though, you need to see how you can combine all that technology to create a test suite that provides actual value.

In this book, I cover not just what WebdriverIO can do, but specifically how you’ll be using it day-to-day. I’ve built the examples around real-world scenarios that demonstrate how you would actually set things up. I’m not here to only teach you what WebdriverIO does. I’m here to teach you how to approach problems from various angles and come to the right solution.

It takes a little more work on my part, and extra effort on your’s to get started, but the payoff is there, I promise.

With that, let’s get started.

What is User Interface (UI) Test Automation?

I mentioned that test automation isn’t anything new, but I didn’t explain what it is.

Truthfully, test automation is a lot of things. There are a multitude of programs and tools surrounding it, not to mention the various ideologies about automated testing in general (e.g., Test-driven Development vs. Behavior-driven Development).

Aside from that, there are also different types of tests. In this book, we’re focusing solely on UI automation, but there’s also unit, integration, performance, accessibility, usability, you-name-it testing.

All of these are important to know about, and can provide as much or more value than UI testing. So why focus on UI testing?

Well, you don’t really have an option. You can’t skip accessibility testing if you want to have a site that’s accessible. Equally, you can’t skip UI testing if you want to have a site that’s not broken.

Truthfully, we’re doing tons of UI testing. Every time someone loads up a website, they’re testing the UI. When they click that button, does it make that thing happen? When I use my mobile device, does the site fit on my small screen?

If you have a site in production, every user is testing your UI. Hopefully for them (and you), they’re not the first ones to try it out.

UI testing is constantly being done, just in a very labor-intensive way. Front-end developers working on the code will spend half of their day in the browser manually testing if their changes worked. If they’re testing in an older browser, it will be 3/4ths of their day.

UI testing is the simple (hah) task of validating that the code written for the browser, actually works in that browser, along with all the other components that went into that webpage.

UI test automation is a way to convert those manual mouse clicks and keystrokes into coded scripts that we can run on a regular basis.

Let’s Talk Benefits

I’ve alluded to this before, but it’s worth repeating. Here are some of the many real-world benefits of automated UI testing:

  • Jamie, your ace front-end developer, just finished their latest task and is itching to release it. While they were careful to validate that the new functionality works, unfortunately they forgot to test whether that new code broke the zip code widget on the contact page. Lucky for you, the UI automation test caught the error, and a fix was in place before the end of the day.
  • Taylor is an awesome full-stack developer. They’ve got everything about their coding environment fine-tuned so that they can focus solely on pumping out fixes and new features… except for one thing: Taylor always forgets to check their code on slower, outdated computers. That’s okay though, as our UI tests are configured to run on a variety of setups, and it just so happens that it caught that issue when running inside a Windows 8 environment.
  • Casey is a great product manager, but sometimes forgets to clarify the small details. This means developers can make the wrong assumption during development, which is only discovered by Casey during product demos. By adding automated tests to the mix, developers and product managers are pushed to have regular conversations on the desired behavior of certain functionality, resulting in fewer suprises later on.
  • Avery is a superb QA tester. With great attention to detail, Avery always thinks of unique ways in which the website would be seen. Unfortunately, manually testing these edge cases takes a fair amount of time to get in place. By adding automated scripts, Avery is able to programmatically generate the needed data for their tests, and allow them to run these scripts on a regular basis.

These are just a few ways test automation can help. I skipped over many other benefits for the sake of time, but there’s one I omitted on purpose. Many folks claim that UI tests allow you to have fewer developers/testers. While that could be an outcome of automation, I don’t think it’s a valuable argument.

In all the scenarios above, having an extra human around wouldn’t have necessarily helped. That’s because human’s have blind spots, and many of them are the same. Teams have collective blindspots and that’s difficult to get around. We naturally tend to focus on what’s in front of us, forgetting about realities outside our own influence.

UI testing isn’t about replacing humans, but rather augmenting their abilities. We use automation to shore up our limitations, allowing us to focus on our strengths.

Developers waste their time testing on hundreds of different devices, when they could instead let a computer do that part of the work. Your QA team wastes its time running through the same test scenario week after week, when instead they could be thinking of new and undiscovered test cases.

UI automation isn’t about replacing humans with machines, but rather giving us more freedom to work better than machines.

And let’s not forget, UI testing requires a fair amount of work. It’s not magic (although it’s okay to convince management it is). This brings me to an important consideration…

There Are Always Drawbacks

How long do you think it takes to get a set of automated tests up and running? A week… maybe two?

Sure, if all you want to do is test that a page loads properly.

But websites are incredibly complex — the number of features we jam on a page grows each day.

Consider a “simple” homepage. Here are some things you’ll want to test on it:

  • Do all the parts look right on a laptop and desktop computer?
  • Do all the parts look right on a tablet?
  • Do all the parts look right on a phone?
  • Does the site navigation work?
  • Does the “need help” chatbox pop up after five seconds?
  • Does the autocomplete in the site searchbar work?
  • Does the hidden menu show after you click the menu icon?
  • Does the carousel on the homepage rotate correctly?

Okay, I’ll stop there. Hopefully you get my point that there’s a lot to test even on a single page. A basic script that loads a page doesn’t provide much reassurance.

So if you want to test all your functionality on all your pages, you’re going to have to write a lot of code.

Covering all of that ground takes time. Time that could be spent on other, possibly more important, tasks.

And as you’re writing test after test, the website your testing is going to keep changing. New features and fixes will be continuously introduced. The same feature will be tweaked again and again, causing your once-solid test suite to mysteriously start failing.

When the glorious day comes and you’re “done” writing tests, you still have to maintain them. Now, there are ways to write tests to make them more maintainable, and we’ll cover that throughout the examples, but there’s no such thing as a future-proof test. You’re always going to need to update it.

There’s one more important point to consider.

While you’ll breeze through some parts of test writing, expect to sink 80% of your time figuring out how to test that special 20% of your site’s functionality. There are many areas of writing tests that aren’t trivial. It’s not as simple as calling the command to fill out a textbox and click the submit button.

You’ll need to work with databases, integrate with third-party services and see if you can get commands to work in browsers with poor automation support. You’re also going to run into instances where the way the website was coded just doesn’t jibe with test automation.

Animations are a good example of that. How do you test that an animation works? You can fairly easily check the properties before and after an animation, but what about everything in between those two states?

Honestly, you’d need to do a screen recording of every frame of the animation, and compare that screen recording to a previous run to see if they match. I don’t know about you, but that doesn’t sound easy to me.

Simpler Sites for UI Testing

If you find yourself facing a complex site that would require major work to test properly, there are other options to gain some value without too much effort.

Instead of testing a fully working site, you may want to create a variation of your site that represents portions of the real thing. For example, pattern libraries are a popular option out there, especially for larger websites.

In case you’re not familiar with them, pattern libraries are essentially living demos of the components that make up a website’s interface. Mailchimp has a public pattern library if you’d like to see one in action (https://ux.mailchimp.com/patterns).

For instance, a pattern library may contain standalone versions of the following:

  • Site layout components like the main navigation and site footer
  • Components used across multiple pages, like buttons, form inputs, and tabs
  • Style guidelines for simple page elements like links, headings, and lists

Pattern libraries are very helpful for teams, to document how the website should look and act. If you’re tasked with adding a sortable table to your companies site, having a pattern library with a living example of one makes the job simple.

They’re also useful for testers, as it gives us testable examples of individual components isolated from the complexity of the full website.

Skipping Automation is Sometimes the Best Option

Hopefully I haven’t scared you away from UI test automation entirely. I only wanted to get you thinking about the whole picture.

And that picture should include saying, “Maybe we shouldn’t write test automation for that… at least not yet.”

Let’s consider a few things…

New Features Aren’t User Tested

Business is booming and your team is tasked with a brand-new feature idea that management thinks the customer will love. While it’s tempting to say, “Yes, and let’s write tests to go side-by-side with this new feature,” it might not be the right time.

This brand-new feature has never seen the light of day, and the second a customer sees it there’s going to be something they don’t like about it. If a decision is made to rework the concept based on customer feedback, all those tests you’ve written become useless.

There’s no point in writing a test if you haven’t user-tested your site. So before you spend time writing assertions, ensure the assumptions about the user interactions are initially put to the test.

Time Writing Tests Takes Away from Writing Features

You’re not paid to write tests; tests only serve the application they’re testing. If an app is useless, tests won’t help.

If you’re working on a side project for a tool that no one uses, spending time writing tests takes away from time spent on more important tasks, like getting people to use your work.

Users don’t care whether you have good unit tests. There’s no difference between an unused tool and an unused unit tested tool.

Let yourself have untested code. Worry about that problem when it actually becomes one.

Tests Are Only Valuable When You Use Them

Don’t write more tests when you’re not using the ones you already have.

If you have 500 UI tests, but never put in the time to integrate them in your build and deployment process, you have 500 useless tests. Writing 500 more won’t help.

Your tests should run on every code push. They should run before every deploy. Every developer on the team should see that the tests passed or failed.

If that’s not true, you shouldn’t be writing more tests, you should be taking advantage of the tests you already have.

Parts of the Site Might be Better Tested by People

Remember when I said tests shouldn’t replace people, but rather augment their abilities. Well, in the scenario of testing an animation, we were stuck with a really complex solution. What if we just went with manual visual validation of the effect?

It’s okay to have some parts of your site that are too complex for automation. Grab that low-hanging fruit and leave the stuff higher in the tree for a later time when you have a ladder.

Are Tests Worth It Then?

I’ve outlined the benefits and drawbacks of test automation, including reasons to entirely skip some of it.

So how do you ensure you’re getting more benefits than drawbacks? Focus on these two goals:

  • Ensure you’re gaining value out of every test you write
  • Ensure you’re selling that value to those in charge

It’s really easy to get caught up in automation, trying to cover every nook and cranny of the site. But if a feature of your site doesn’t provide much value, how much less value would a test for that feature be?

The site login component breaking… that’s bad. The site’s About page having a typo… not such a big deal. Sure, we’d want to fix it, but it (hopefully) won’t cost the company a lot of money.

By focusing on gaining and selling that value, you can keep yourself honest in your test writing, and focus on what’s important: having a site that’s running smoothly and providing value to the user.

1.1.2 Why Use WebdriverIO?

Back in the late 2000s, I learned of a tool called Selenium that the tester’s on my team were fairly interested in using.

I thought it was a neat idea, but there was one big red flag to me. It required writing the tests in the Java programming language.

I had taken a couple semesters of Java programming in college and actually enjoyed the object-oriented nature of the language. I had to wrap my head around some of the complexities of the language, but overall I found it a useful language to understand.

But thinking about writing automated tests in Java gave me pause. Java is a very verbose language requiring a fair amount of setup and some tedious coding. I just didn’t think test automation was a good fit for it. So I stopped researching the idea in favor of other pending tasks.

Years later, a new tool came out called PhantomJS. It was based in Node.js, and promised the ability to automate browser usage. That definitely perked my interest, and I’ll explain in a minute, but first…

What’s a Node.js?

You may not be familiar with Node.js, so I’ll explain it a little here. If you are familiar, feel free to skip this part.

JavaScript is the coding language of the web. Starting in the mid 1990’s, an early version of JavaScript (originally called LiveScript) was included in the Netscape Navigator browser. Microsoft, eager to match and beat the features of Netscape, saw this new language and decided to add their own version (calling it JScript) to Internet Explorer (IE).

As the browser battle continued, JavaScript and JScript continued to grow in popularity among website authors. I recall my first use of JavaScript was to make a “mouse trail” on my very first website1. My second use was to float an animation of Ralph Wiggum eating glue across the screen of my “from the local police blotter” page.

It was dumb, but boy was it fun to play around with.

Most JavaScript usage for the next decade revolved around either cheesy browser effects, or useful add-ons like drop-down menus and browser-based form field validation. As a front-end developer, learning JavaScript was an important part of your job, although not a critical part like it is today.

In 2009, Ryan Dahl combined Chrome’s JavaScript engine (called V8) with a few new tricks, and have it run entirely outside the browser 2. While the initial idea was to use JavaScript and Node.js to create servers that could better handle high-traffic sites, developers across the globe saw even more power in the tool.

From 2009 to 2019, Node.js has grown tremendously. While development of the tool did stagnate around 2014, a fork of Node.js called io.js kicked those in charge back into gear, and eventually the two tools were combined to create a better future.

And a better future it has become. Node.js has become one of the most popular programming environments out there, and it’s used for everything from servers to development tools, and even desktop applications.

Back to PhantomJS

PhantomJS’s popularity grew as developers realized they now knew how to write code that would automate a browser. Automated testing immediately came to mind, and a sister-tool called CasperJS was created to compliment PhantomJS.

There was only one problem: PhantomJS was a paired-down version of Google Chrome. Sure, it was fast and had a lot of features of a normal browser, but it wasn’t the browser that site visitors would be using.

You could write all the test automation you wanted, but it still wouldn’t catch bugs that only occur in Internet Explorer. Remember, at that time, many websites still needed to support the bug-ridden versions of that browser (7, 8 and 9).

So PhantomJS’s popularity as a testing tool was always limited. The benefit of automated testing in that single non-traditional browser just never seemed worth the cost.

Enter WebdriverIO

In 2015 I found out about a tool called WebdriverCSS. It was a Visual Regression Testing tool used to compare two screenshots of a page and see if they’re visually different from each other. I had tried many tools like this in the past, but this one was unique.

WebdriverCSS was actually a plugin for a library called WebdriverIO. WebdriverIO came with all the great features of PhantomJS (being able to automate a browser through a Node.js script), but had the added benefit of supporting Selenium.

Remember Selenium? That tool from the mid-2000s that I glossed over because I was a little fearful of Java?

WebdriverIO made Selenium approachable to me, and that was literally life-changing (yes, literally). Since investing myself in WebdriverIO, my actual job duties had shifted from primarily front-end development (writing HTML and CSS) to a focus on front-end testing (writing WebdriverIO test scripts).

There were four selling points that convinced me to use WebdriverIO. But before I list my reasons, how about we hear from some other folks:

“Webdriverio is a comprehensive, well documented project with great coverage of the Selenium/Webdriver/Appium specs, as well as loads of very useful helper abstractions. @christian-bromann has been amazing to work with, providing fantastic support and encouraging a helpful community in general.” - Goerge Crawford, who is now a core contributor on the project

“The framework of choice at Oxford University Press, I have been using webdriverio since 2016 and it has made life so much easier, especially now with v5, hats off to @christian-bromann and his crew who maintain and continualy support it, keep up the good work guys.” - Larry G. - Automation Architect

“Give a QA Engineer a web-automation framework and he might automate some tests. Give him WebdriverIO and he will build a full-fledged, robust automation harness in a matter of days. All jokes aside, WebdriverIO was a blessing in disguise for us @Avira. I’ll never look back to other JS-based automation frameworks!” Dan Chivescu, QA Lead

And what about my reasons for choosing WebdriverIO?

WebdriverIO is “Front-end Friendly”

Unlike most other Selenium tools out there, WebdriverIO is written entirely in JavaScript. It’s also not restricted to just Selenium, as support for the Chrome Devtools protocol (which we’ll talk about later) was added in Version 5.13 of WebdriverIO. This means you can use WebdriverIO without installing Java or running Selenium.

Like I said, I always thought browser automation meant figuring out how to get some complex Java app running. There was also the Selenium IDE, but writing tests through page recordings reminded me too much of WYSIWYG web editors like Microsoft FrontPage (you’ll need to look that up if you weren’t doing web development in the early 2000’s).

Instead, WebdriverIO lets me write in a language I’m familiar with, and integrates with the same testing tools that I use for unit tests (e.g., Mocha).

As a developer, the mental switch from writing the functionality to writing the test code requires minimal effort (since it’s all just JavaScript), and I love that.

The other great thing, and this is more to credit WebDriver than WebdriverIO (there is a difference and we’ll talk about it), is that I can use advanced CSS selectors to find elements.

xPath scares me for no good reason. Something about slashes instead of spaces just chills my bones. But I don’t have to learn xPath.

Using WebdriverIO, I simply pass in my familiar CSS selector and it knows exactly what I’m talking about.

I believe front-end developers should write tests for their own code (both unit and UI), and WebdriverIO makes it incredibly easy.

It Has the Power of Selenium

I always felt held back when writing tests in PhantomJS, knowing that it could never validate functionality in popular, but buggy, browsers like IE.

But because WebdriverIO has built-in support for Selenium, I’m able to run my tests in all sorts of browsers.

Selenium is an incredibly robust platform and an industry leader for running browser automation. WebdriverIO stands on the shoulders of giants by piggy-backing on top of Selenium. All the great things about Selenium are available, without the overhead of writing Java-based tests.

It Strives for Simplicity

The commands you use in your WebdriverIO tests are concise and common sense.

What I mean is that WebdriverIO doesn’t make you write code to connect two parts together that are obviously meant for each other.

For example, if I want to click a button via a normal Selenium script, I have to use two commands. One to get the element and another to click it.

Why? It’s obvious that if I want to click something, I’m going to need to identify it.

WebdriverIO simplifies the ‘click’ command by accepting the element selector right in to the command, then converts that in to the two Selenium actions needed. That means instead of writing this:

driver.findElement(By.id('submit')).click();

I can just write this:

$('#submit').click();

It’s so much less mind-numbing repetition when writing tests…

Speaking of simple, I love how WebdriverIO integrates in to Selenium. Instead of creating its own Selenium implementation, it uses the common REST API that Selenium 2.0 provides.

If you haven’t worked with API endpoints before, this may not make sense. Don’t worry, it’s not necessary to understand. But if you’re interested, here’s how it goes.

WebdriverIO sees that you want to run a command (say “getUrl”). It takes that command and converts it into a request to the Selenium server (it would look like “/session/someSessionIdHere/url”). The Selenium server processes the request and returns the result to WebdriverIO, which then returns the found URL to your code.

Most of WebdriverIO is made up of these small commands living in their own separate small file. This means that updates are easier, and integration into cloud Selenium services like Sauce Labs or BrowserStack are incredibly simple.

Too many tools out there try to reinvent the wheel. I’m glad WebdriverIO keeps it simple and uses what is already out there. This, in turn, helps me easily understand what’s going on behind the scenes.

It’s Easily Extendable/Scalable

As someone who has spent a considerable portion of their career working for large organizations, it’s important to me that the tools I’m using are easily extendable.

I’ll have custom needs and will want to write my own functionality. WebdriverIO does a great job at this in two ways:

Custom Commands

There are many commands available by default via WebdriverIO, but there are times when you want to write a custom command just for your application.

WebdriverIO makes this really easy. Just call the “addCommand” function, and pass in your custom steps.

Here’s an example from their docs:

browser.addCommand('getUrlAndTitle', function () {
    // `this` refers to the `browser` scope
    return {
        url: this.getUrl(),
        title: this.getTitle()
    };
});

Now, any time I want both the URL and title in my test, I’ve got a single command available to get that data.

browser.url('http://www.github.com');
const result = browser.getUrlAndTitle();
Page Objects

With the 4.x release of WebdriverIO, they introduced a new pattern for writing Page Objects. For those unfamiliar with the term, Page Objects are a way of representing interactions with a page or component.

Rather than repeating the same selector across your entire test suite for a common page element, you can write a Page Object to reference that component.

Then, in your tests, request what you need from the Page Object and it handles it for you. This helps your tests be more maintainable and easier to read.

They’re more maintainable because updating selectors and actions occur in a single file.

When a simple HTML change to the login page breaks half your tests, you don’t have to find every reference to input[id="username"] in your code. You only have to update the Login Page Object and you’re ready to go again.

They’re easier to read because tests become less about the specific implementation of a page and more about what the page does.

For example, say we need to log in to our website for most of our tests. Without Page Objects, all the tests would begin with:

browser.url('login-page');
browser.setValue('#username', 'testuser');
browser.setValue('#password', 'hunter2');
browser.click('#login-btn');

With Page Objects, that can become as simple as:

LoginPage.open();
LoginPage.login('testuser', 'hunter2');

No reference to specific selectors. No knowledge of URLs. Just self-documenting steps that read out more like instructions than code.

Now, Page Objects aren’t a new idea that WebdriverIO introduced. But the way they’ve set it up to use plain JavaScript objects is brilliant. There is no external library or custom domain language to understand. It’s just JavaScript and a little bit of prototypical inheritance. (We’ll definitely cover Page Objects in more detail later in this book.)

Summing It Up

I wouldn’t call myself a real software tester. I’m far too clumsy to be put in charge of ensuring a bug-free launch.

Yet, I can’t help but love what WebdriverIO provides me, and I’m a fan of what’s going on with the project and its future. Hopefully this book helps you feel the same way.

1.1.3 Technical Details

Versions

Because technology changes fast, it’s good to cover what versions of software I used when creating these exercises.

  • Node.js: v18.12.1
  • WebdriverIO: 7.25.4
  • Java 8 (optional)
  • MongoDB 4.2 (optional)

Node.js Requirements

Check the Webdriver.io website for minimum Node.js version requirements. Normally, WebdriverIO works with the latest few LTS versions of Node.js.

Git Repository

To download code samples for the main part of the book, visit the official git repo.

Where to check for updates/corrections

I’ve written the book using the most recent version of these technologies as I could. However, with an ever changing landscape, updates will need to be made.

You can see a list of changes by visiting the book’s changelog.

Where to find help

I’ve done my best to make the material clear and understandable, but I’m sure to have fallen short in some areas. To get extra help, try one of these three options:

  • This Book’s GitHub Issues Page
  • Gitter - WebdriverIO runs an official chat room for folks seeking help with the tool itself.
  • My personal email - Although I’m usually slow to respond, you can reach out to me at kevin at learnwebdriverio dot com. I’ll do my best to get back to you in a timely manner.

Errata

I’ve worked to ensure that the content of this book is accurate and that the examples actally run. However, I definitely can make mistakes (that’s why I’m a fan of testing after all). Also, technology changes, and what worked when I wrote this doesn’t necessarily work anymore.

If you find errata, please submit an issue on the GitHub repository and I’ll work to get the error resolved.

Technical Knowledge Requirements

What sort of skill level do you need in order to understand the material covered in this book?

While I’ve worked to explain the many concepts introduced through UI testing, I do make the assumption that readers are familiar with the following technologies:

  • HTML (basic understanding of how HTML structure is composed)
  • CSS (basic understanding of how CSS selectors work)
  • JavaScript
  • NodeJS
  • Terminal/Shell commands

In regards to JavaScript and NodeJS, here are some important concepts to understand:

  • Data types: strings, objects, arrays…
  • Functions: how to call and create them
  • Conditionals: if/else, ternary
  • ES6 updates:
    • const & let
    • Array functions: map, forEach
    • Classes (we will cover this in more detail in the page objects section)

That said, you don’t need to know how to create a Node.js server, build a website or use the latest JavaScript framework.

Where can I freshen up?

While I won’t be covering it in this book, here’s a few free resources you might find helpful if you’d like to refresh your knowledge: