Testing the User Interface
I often get questions about testing “the user interface” or “the front end.” This comes up in all our technical Agile classes (Behavior Driven Development, Essential Test-Driven Development, Certified Scrum Developer), or most frequently during coaching.
There are a few approaches here, depending on what is meant by front-end/UI testing.
For example, take the following two JavaScript statements:
x = 5 + 3;
y = "5" + 3;
The results? Well, x is 8, and y is “53.” Fun stuff. That may make perfect sense looking at the code above, but what if you were faced with:
x = i + j;
y = k + m;
See where I’m going with this? Do i, j, k, and m currently reference an integer type or a string type? It depends on what came before. Which means, without tests, I have to know all the routes through all the code that leads to those statements. In my head. Forever.
Fast tests are often the only fast way to discover a simple mistake, before it gets forgotten amongst all the other code and business of our lives.
One can write unit-tests (aka “specs”), or developer-facing scenarios, using Jasmine or Mocha. Those two look a lot alike, and the differences are subtle, so look carefully at both to choose the one that’s right for your environment. For example, I’ve heard that Mocha is more appropriate for Node.js teams.
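Here is a minimal sketch of the kind of fast check I mean. It uses Node’s built-in assert module rather than Jasmine or Mocha, so it runs with nothing installed; in either framework the same assertions would sit inside describe/it blocks. The function name addQuantities is hypothetical, invented for this illustration.

```javascript
const { strictEqual, throws } = require("node:assert/strict");

// Hypothetical function: adds two quantities, refusing anything that is not
// a finite number so that "5" + 3 can never silently become "53".
function addQuantities(a, b) {
  if (!Number.isFinite(a) || !Number.isFinite(b)) {
    throw new TypeError("addQuantities expects two finite numbers");
  }
  return a + b;
}

// Fast, in-process checks: no browser, no network, no database.
strictEqual(addQuantities(5, 3), 8);
throws(() => addQuantities("5", 3), TypeError);
```

With checks like these in place, I no longer have to carry every route to that addition around in my head; the test fails the moment a string sneaks in.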
End-to-end through the UI
Are you wanting to test a web app end-to-end via the UI?
First of all, this is usually a bad idea in the long run. It’s typically done as a quick and easy path from manual testing to “test automation,” but most tools in this space are record/playback tools. Therein lies much of the trouble:
- Recorded test scenarios are brittle: Move a button, change CSS, or use a different browser, a different OS, or a different who-really-knows-what-was-different-this-time, and the whole test fails.
- The tests repeatedly exercise many of the same behaviors (e.g., login), and they cover a big batch of behaviors (authentication, navigation, business rules, calculations). When they fail, they often don’t tell you what broke. They certainly don’t pinpoint the problem like discrete scenarios or unit-tests.
- The tests tend to be slow. They cover too much, they cross the network, they’re accessing all the dependencies including a database or two (hopefully all on a test environment).
It’s fine to have a handful of these within your comprehensive regression suite, but if each of your tests takes more than a second to run, you’re not going to get the speedy feedback you need to make the automation as good as it can be.
Just last month, a client of ours reported that they had reduced a 3-4 week test cycle for their web applications down to…
Not via the recording and playback of manual test cases, to be sure.
Okay, so what did they do?
The brittleness of the test suite can be mitigated by using Cucumber with the Selenium browser API, and taking care to give your UI elements good, intention-revealing names. The client did some of this, but that wasn’t all.
They wrote more concise, targeted scenarios. They separated out different concerns. For example, they test login and authentication in one set of scenarios. In all other scenarios requiring login, they simply declare themselves to be an authenticated user. (I know, some of you are saying “But how…without logging in?!” Wait for it!)
They tested navigation as a separate rule from calculations or business logic.
And then (and here’s where these steps start to reveal true speed benefits), they altered their step-definitions (the “glue code”) for those tests to test as much of this behavior as close to where the behaviors live as possible (e.g., in-process if possible). Even authentication can be tested in-process, unless you’re wholly reliant on a third-party external framework.
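As a rough sketch of what that glue code might look like: the helpers below are plain functions so the example runs without Cucumber installed, and in a real suite their bodies would live inside Given/When step definitions. AuthService, the helper names, and the “world” objects are all hypothetical stand-ins, not the client’s actual code.

```javascript
const { strictEqual } = require("node:assert/strict");

// Hypothetical in-process authentication service (stands in for your own code).
class AuthService {
  constructor(users) {
    this.users = users; // Map of username -> password (plain text only for this sketch)
  }
  authenticate(username, password) {
    return this.users.has(username) && this.users.get(username) === password;
  }
}

// Hypothetical step body for: "Given I am an authenticated user".
// No login page, no browser: the scenario simply declares the session.
function givenIAmAnAuthenticatedUser(world, username) {
  world.session = { user: username, authenticated: true };
}

// Hypothetical step body for: "When I log in with <username> and <password>".
// This one exercises the real authentication rule, in-process.
function whenILogInWith(world, auth, username, password) {
  world.session = {
    user: username,
    authenticated: auth.authenticate(username, password),
  };
}

// Login rules are tested once, in their own scenarios.
const auth = new AuthService(new Map([["pat", "s3cret"]]));
const loginWorld = {};
whenILogInWith(loginWorld, auth, "pat", "wrong-password");
strictEqual(loginWorld.session.authenticated, false);

// Every other scenario just declares an authenticated user and moves on.
const otherWorld = {};
givenIAmAnAuthenticatedUser(otherWorld, "pat");
strictEqual(otherWorld.session.authenticated, true);
```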
They also used mocking frameworks and other forms of test-doubles within their Cucumber scenarios. Test-doubles aren’t just limited to in-process tests. You could create a Given step that directly populates a test-copy of the database, you invoke the behavior in the When step, then assert against changes to the database in your Then step.
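The Given/When/Then shape described above can be sketched as follows. To keep it self-contained I’ve assumed an in-memory test double (InMemoryOrderStore) in place of a test-copy of a real database, and a made-up behavior (cancelOrder); both names are hypothetical.

```javascript
const { strictEqual, deepStrictEqual } = require("node:assert/strict");

// Test double: an in-memory stand-in for an orders table (hypothetical).
class InMemoryOrderStore {
  constructor() {
    this.rows = new Map();
  }
  insert(order) {
    this.rows.set(order.id, { ...order });
  }
  find(id) {
    const row = this.rows.get(id);
    return row ? { ...row } : undefined;
  }
  update(id, changes) {
    this.rows.set(id, { ...this.rows.get(id), ...changes });
  }
}

// Behavior under test (hypothetical): only an open order can be cancelled.
function cancelOrder(store, id) {
  const order = store.find(id);
  if (!order || order.status !== "open") return false;
  store.update(id, { status: "cancelled" });
  return true;
}

// Given an open order exists in the store
const store = new InMemoryOrderStore();
store.insert({ id: 42, status: "open", totalCents: 1999 });

// When the order is cancelled
const cancelled = cancelOrder(store, 42);

// Then the stored order reflects the change
strictEqual(cancelled, true);
deepStrictEqual(store.find(42), { id: 42, status: "cancelled", totalCents: 1999 });
```

Swapping the in-memory store for a thin wrapper around a real test database would leave the three steps unchanged; only the Given and Then plumbing would differ.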
Of course, if you limit one set of tests by skipping the UI or replacing cumbersome dependencies with test-doubles, you also have to remain very aware of what real behaviors are not tested by that feature file, and you may need to write separate scenarios around those rules. For example, what if there are stored-procedures doing important work in the database? You have to test those, too!
Be creative. One of the great super-powers of Cucumber, as a tool, and Behavior Driven Development (BDD) as a practice, is that they encourage us to decouple our test-first thinking-process from the technologies we use to implement the behaviors. Dale Emery, a friend of mine and Agile coach extraordinaire, said we could imagine that we have the greatest UI imaginable: We just think of what we want the application to do, and it does it. Write your tests like the technology doesn’t matter, and the technology won’t limit you.
If we’re talking about testing code that gets added to an older flavor of MS Windows widgets, for example, I always recommend delegating our way out of poorly-architected vendor frameworks (e.g., ASP.NET prior to ASP.NET MVC), then testing our code (and not Microsoft’s) as we otherwise would have: unit-tests, or perhaps business-facing tests that access the behaviors through an API or an application facade.
We rely on the compiler to check those lines of pure and simple delegation. If you’re not using TDD or BDD, you may first need to rely on some other way to do “Characterization Testing” (behavioral coverage by retrofitting tests onto existing untested code), using something like WinRunner (um…that should not be construed as a recommendation for WinRunner).
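To illustrate what “delegating our way out” can look like, here is a small sketch. It is written in JavaScript for consistency with the earlier snippets, so there is no compiler to check the delegation line; the point is only that the handler is kept too thin to hide a bug. DiscountCalculator, onApplyDiscountClicked, and the form object are hypothetical names, not part of any vendor framework.

```javascript
const { strictEqual, throws } = require("node:assert/strict");

// Our code: plain logic with no dependency on any UI framework.
class DiscountCalculator {
  // Works in whole cents to avoid floating-point currency surprises.
  totalAfterDiscount(subtotalCents, percentOff) {
    if (percentOff < 0 || percentOff > 100) {
      throw new RangeError("percentOff must be between 0 and 100");
    }
    return Math.round((subtotalCents * (100 - percentOff)) / 100);
  }
}

// Framework-facing code: a stand-in for a vendor event handler. It only
// reads inputs, delegates, and writes the result back to the widget.
function onApplyDiscountClicked(form, calculator) {
  form.totalCents = calculator.totalAfterDiscount(form.subtotalCents, form.percentOff);
}

// The tests target our behavior directly, not the widget.
const calculator = new DiscountCalculator();
strictEqual(calculator.totalAfterDiscount(20000, 15), 17000);
throws(() => calculator.totalAfterDiscount(20000, 150), RangeError);

// The handler can still be exercised with a plain object standing in for the form.
const form = { subtotalCents: 20000, percentOff: 15, totalCents: null };
onApplyDiscountClicked(form, calculator);
strictEqual(form.totalCents, 17000);
```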
With all these forms of testing, we need to identify “What am I wanting to test?” and “How can I test that as close as possible to where it lives?”
The second question helps speed up the test suites and also makes them more likely to pinpoint a problem, instead of reporting back a vague “something broke!”
A simple example I use to clarify what someone means by “UI testing”: Are we trying to assert that (a) “There is a red STOP button in the upper right-hand corner,” or are we trying to assert that (b) “When the red STOP button is pressed, it becomes a green GO button”?
The first case (a) is an interface design detail, and we may want to check a fixed value or range. You could easily assume that having build-time automated tests for stuff like this would help, but I haven’t seen a big ROI. In fact, it quickly becomes a maintenance nightmare of duplicated values between tests and the HTML template.
Static look-and-feel (i.e., features of the UI that are not dynamically changing) is best tested by looking at, and using, the application; that is, through exploratory testing or UAT (or both). Real UX problems found this way often become requests for a change, and will likely need to be introduced as new user stories through the product backlog, prioritized by the product advocate (Scrum’s Product Owner, XP’s Onsite Customer).
The second case (b), in contrast, is a behavior, and most of our defects happen when we accidentally break an existing behavior while adding a new behavior in close proximity (e.g., when a yellow SLOW DOWN state is added to the “same” button). So my rule of thumb is this: Don’t automate the tests of data (page location, formatting, configuration); and do test all system behaviors that your team builds (code that does things–changes things–wherever it lives).
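Case (b) might be tested like this. SignalButton is a hypothetical model of the button’s behavior, kept free of any DOM or widget code; I’ve also assumed that pressing GO returns the button to STOP, which the example statement doesn’t actually specify.

```javascript
const { strictEqual } = require("node:assert/strict");

// Hypothetical model of the button's behavior, with no UI framework involved.
class SignalButton {
  constructor() {
    this.state = { label: "STOP", color: "red" };
  }
  press() {
    // Assumption for this sketch: the button toggles between the two states.
    this.state =
      this.state.label === "STOP"
        ? { label: "GO", color: "green" }
        : { label: "STOP", color: "red" };
  }
}

// When the red STOP button is pressed, it becomes a green GO button.
const button = new SignalButton();
button.press();
strictEqual(button.state.label, "GO");
strictEqual(button.state.color, "green");
```

Notice what the test does not check: where the button sits on the page. If someone later adds a yellow SLOW DOWN state, this is the test that tells them whether the original STOP-to-GO behavior still holds.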
What behavior lives in the UI? Decouple your feature behaviors from the vendor frameworks as clearly and as simply as possible. Test your behaviors, in isolation, and as close to where the code resides as possible.