The Quest for Better Tests

Over the past twenty years, I’ve written my fair share of unit tests, mostly just covering the happy path and sending in some bogus inputs to test the edges. Typically following a fat-model-thin-controller method (often recommended by me), I failed to understand the point of integration tests. I tried TDD at the beginning of several greenfield projects, but I was never successful in making it sustainable. Similarly, with Selenium, it worked at first but quickly proved to be too brittle to keep up with rapidly changing UIs. (In retrospect, bad CSS architecture on those projects probably deserved the blame more than Selenium per se.)

Despite my somewhat lackluster attitude toward testing, my employers and customers knew me as a big advocate for test automation, one who always insisted that we never release anything to QA without at least a layer of unit tests. Oftentimes I was overruled by more senior leadership. As expected, from time to time we all got burned by bugs that would have been easily caught by more comprehensive tests. We swore we’d write better tests next time. But for a million reasons—“speed to market” being the peskiest of them—testing never became a consistent priority.

In February of 2016, I joined Lab Zero. The very first observation I made after starting on a project—a financial services application three years in the making—was the sheer volume of test code. Nearly everywhere I looked, I found at least a 10:1 ratio of lines of test code to lines of “real” code. Shortly after starting on my first story, it became readily apparent that at least a 10:1 ratio of developer effort was required to continue this pattern. We joked about developers who reported their status during daily standup by saying, “I’m done with the code and just need to write some tests,” because we knew that was a euphemism for being less than 10% done with the story!

It didn’t take long for me to realize how much catching up I needed to do. In fact, the project leader told me it would take me “a year” to learn how to test properly. At first he sounded condescending, but I came to realize that he was just being realistic. Testing is hard; testing effectively is even harder.

Ten months into my Test Quest, here are some important lessons I’ve picked up about automated testing.

Note: I used Ruby, RSpec and Cucumber to create my code samples, but the lessons learned will likely apply to other ecosystems.

The myth of 100% code coverage

Sure, code coverage is an important metric, but it only tells part of the story. Test coverage is not the same as good test coverage. It’s remarkably easy to write tests that test nothing at all, that test the wrong things or that test the right things—but in ways that never fail.

Consider the following example, wherein the remove_employee method has a glaring error, one that will easily be caught by a unit test. Or will it?
class Company
  def initialize
    @employees ||= []
  end

  def add_employee(person)
    @employees << person
    @employees.size
  end

  def remove_employee(person)
    @employees.size - 1 # danger: incorrect implementation!
  end
end

RSpec.describe Company, :type => :model do
  subject { Company.new }

  describe 'managing employees' do
    let(:person) { double('person') }

    it 'removes an employee' do
      employee_count = subject.add_employee(person)
      expect(subject.remove_employee(person)).to eq(employee_count - 1)
    end
  end
end
Because the test for removing employees naively compares only the outputs of the add and remove methods, it passes with flying colors even though the remove_employee method’s internals are totally wrong.

And this is why it’s a good idea to…

Test internals instead of just inputs and outputs

In most—if not all—programming languages, there are many more ways to produce “outputs” than just the return values of method calls.

C/C++ developers can optionally pass primitives to functions by reference (e.g. int &param1), morphing those inputs into potential outputs. More modern languages restrict everything to pass-by-value, but most of the time what’s being passed “by value” is actually a reference to an instance of an object. As a result, it’s possible—and quite commonplace—to mutate the object instance itself in the context of a method, providing another sneaky way for methods to have unexpected “outputs.”
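To make that concrete, here’s a minimal Ruby sketch (the method name and data are invented for the example): a method whose return value looks like its only output, but which also mutates the array the caller passed in.

```ruby
# Hypothetical example: the return value looks like the only output,
# but the method also mutates its argument in place.
def titleize!(names)
  names.map! { |n| n.capitalize } # mutates the caller's array
  names.size                      # the "official" output
end

names = ["ada", "grace"]
titleize!(names) # => 2
names            # => ["Ada", "Grace"] -- a sneaky second "output"
```

A test that asserted only on the return value would never notice if the mutation went wrong.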

Testing internals can be challenging—but it doesn’t have to be.

Design and write testable code

A previous version of me believed that only a very limited set of circumstances should trump writing elegant code. I recently relaxed this constraint, adopting the belief that it’s okay to over-decompose code (and make other code design compromises) in order to serve the goal of writing code that’s more testable.

For example, I might replace a simple, elegant expression like:

shape.color == :blue

with a method that wraps it:

def is_blue?(shape)
  shape.color == :blue
end
In the past, code like this would make my eyes bleed. However, it’s really easy now to stub out is_blue? so that it returns a mock object or performs some other test-only behavior.

This is a contrived example, but if figuring out whether a shape is blue required a database read or a call to an underlying service object, then over-decomposition like this is a small price to pay to make the code testable.
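As a sketch of the payoff (the class names here are hypothetical), putting the check behind its own method means a test can substitute a fake that skips the expensive lookup entirely:

```ruby
Shape = Struct.new(:color)

class ShapeChecker
  # In production this check might involve a database read or a
  # service call; wrapping it in a method keeps it replaceable.
  def is_blue?(shape)
    shape.color == :blue
  end
end

# Test-only fake: same interface, canned answer, no I/O.
class AlwaysBlueChecker < ShapeChecker
  def is_blue?(_shape)
    true
  end
end

ShapeChecker.new.is_blue?(Shape.new(:red))      # => false
AlwaysBlueChecker.new.is_blue?(Shape.new(:red)) # => true
```

In RSpec, the same effect usually comes from stubbing, e.g. allow(checker).to receive(:is_blue?).and_return(true), rather than a hand-rolled fake.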

Test incrementally

I’ve found TDD (specifically a test-first methodology) to be overly prescriptive, usually leading to diminishing returns as the project gets more complex. If it helps clarify the specs and define edges more easily, then by all means, write tests first! However, I’ve found more productivity (and less head-scratching) comes from writing tests not necessarily first, but in short iterative bursts.

Every time I finish an “idea” in code (for lack of a better term), I switch over and edit the test, usually already open in a split-screen view next to the code. If the “idea” is too complex, I take a step back and flesh out more tests to help me clarify what I’m trying to accomplish in the code.

In the past I’ve also worked in a pairing setup where I wrote the code and switched back-and-forth with another developer writing tests. Though I haven’t done this recently, it’s another technique that’s worked well for me.

DRY code, wet lets

Don’t Repeat Yourself (DRY) is a great rule of thumb for writing code, but it can be disastrous when memoizing test data, e.g. through calls to RSpec’s let or let!

With the exception of some truly global concepts (e.g. user_id), all test data should be initialized in close proximity to (read: immediately before) the tests that use it and should not be reused between unrelated tests.

Thinking I was helping, I tried to DRY up some lets, only to realize soon after that I had no idea what test data was getting passed to which tests. Even if it feels cumbersome to initialize the same data over and over before each test, it’s the right thing to do.

Re-use Cucumbers with Scenario Outlines

Unlike lets, some parts of the test ecosystem are actually designed for reuse. One example: Scenario Outlines. I recommend using these whenever possible.

With Cucumber, Scenario Outlines represent the “functions” in an otherwise functionless DSL. In addition to the obvious reduction in code bulk, thinking about how I can turn several tests into one test “template” helps me write more thoughtful, self-documenting tests.
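For illustration, a hypothetical Scenario Outline might look like this (the feature and step wording are invented for the example):

```gherkin
Scenario Outline: medical benefit eligibility
  Given a person aged <age>
  When benefits are calculated
  Then medical benefits are <result>

  Examples:
    | age | result  |
    | 64  | denied  |
    | 65  | granted |
```

Each row in the Examples table runs the same steps with different values, collapsing several near-identical scenarios into one self-documenting template.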

Vary only what needs to be varied

It’s tempting to cut corners (and make tests run more efficiently) by favoring randomizing test data over creating different tests for different values. Often this practice is harmless, especially if the specific values—as long as they’re in range, e.g. a person’s age—are inconsequential. (If specific values matter, e.g. people 65 and over get medical benefits, they should of course get their own explicit tests.)

Randomizing test data can also be a trap. For example, a test for a get_birth_year method might start to “flicker” or “flap,” meaning that it passes and fails non-deterministically between test runs—all because of the decision to randomize ages.

To protect against this, it helps to treat each test as a controlled experiment, i.e. by keeping the scientific method in mind. Try to control everything that can be controlled and vary only the specific inputs getting tested. Of course, there are things we can’t control, like the system clock, the speed of the network and the availability and behavior of upstream systems. But whenever things can be controlled, control them.
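As a sketch (get_birth_year and its signature are assumptions for the example), one way to control the clock is to inject it, so the expected value is deterministic instead of depending on when the suite happens to run:

```ruby
require "date"

# Hypothetical helper: accepts the "current" date as a parameter,
# defaulting to the real clock in production code.
def get_birth_year(age, today: Date.today)
  today.year - age # naive on purpose: ignores whether the birthday has passed
end

# Controlled: the test pins the clock, so the result never flickers.
get_birth_year(40, today: Date.new(2016, 6, 1)) # => 1976
```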

Write meaningful, descriptive test names

Acknowledging that I just recommended thinking like a scientist, I’m now going to suggest putting on a writer’s hat. When naming test cases and writing Cucumber steps (which read like prose already), it’s super-important to be descriptive, concise and accurate.

In a place full of smart people like Lab Zero (#humblebrag), developers are not necessarily the only people looking at tests. Recently I had an agile product owner ask me how a certain feature handled different types of inputs. To answer the question, I walked him through my RSpec tests, reading each test name aloud and describing the expectations.

Writing coaches always say “show, don’t tell.” There is simply no better way to show—and prove—that a feature works than reading through the tests, which serve as the closest link between the specs and the code.

Putting the Science in “Computer Science”

One of my professors in college said that any discipline with the word “science” in its name is actually not a science. This is especially true for computer science, which some schools classify as a fine art (making it possible to get a BA in CS). Writing code is certainly a form of communication, at least with peers and future developers. Of course, they are not the customers. And the best way to “communicate” with customers is to provide something for them that works as designed.

How do we ensure that? With well-written tests.

Tests really put the science in computer science. Think of them as a series of carefully controlled experiments. The hypothesis is that the code implements the spec.

Without tests, there’s really no way to know if it does or not.

* * *

Originally published on Lab Zero’s blog.

Five First Impressions of LabZero

I recently joined Lab Zero as a software developer. My friend Brien Wankel, one of their Principal Engineers, had been encouraging me to interview here for more than a year. I hesitated because, to put it bluntly: What’s so special about another boutique software development agency? There are hundreds—if not thousands—of them in the Bay Area. Plus, I was still trying to strike gold playing the startup equity game and I had already run my own boutique software development agency for a decade.

At long last I took the plunge, and I’m really glad I did. Ten weeks in, these are my first impressions of Lab Zero.

1. “We pay for every hour worked, no exceptions.” —The CEO

Lab Zero’s culture in three words: “Life, then Work.” Everyone here, myself included, is a W-2 hourly employee. To prevent people from worrying about using PTO when they’re sick (which eats into vacation time), we’ve done away with the concept altogether. We get paid for every hour we work—and we don’t get paid when we’re not working. That also has the side benefit of discouraging people from coming to work when they’re contagious. As a substitute for PTO, we accrue personal/family sick time, bereavement and jury duty time.

Employment here includes all the usual benefits, but without the attached expectation of working 60-80 hours/week (or more) and getting paid for 40. I surf every Wednesday morning (if the weather conditions cooperate) and I volunteer at my daughter’s school in the afternoon. I might only bill for 4-5 hours on a Wednesday. I might put in a few more hours after dinner—or not.

I haven’t put this to the test yet, but I may need to scale back my hours at Lab Zero by 50% or more to run tech for another political campaign or to get more involved in the farm-to-table movement or maybe to start a side business—or not.

2. “We follow software best practices.” —Everybody

So we put life first and work second. But does that mean that we don’t care about what we do? Hells no!

Lab Zero embraces a documented set of methodologies that make great software development possible, if not pleasurable. We have 100% or near-100% test coverage on all our projects; we write unit tests, functional tests, automated UI tests—to the tune of roughly ten lines of test code for every one line of “real” code. We practice continuous integration; we have a stringent pull-request review process and we reject pull requests for even the slightest blemish, e.g. a typo in a commit message.

This culture of doing things right at all costs may sound too onerous to be practical, but what I learned after a couple weeks here is that the effort we put into rigorous testing pays us back in spades, measured by the very small number of issues that slip through the cracks, eventually needing to be caught by QA or found in production. Plus, as long as I can keep the test suites passing, I can refactor without fear that I’m going to break something.

And if I do break something incidentally, it usually just means I need to write a better test, which in turn will help overall quality in a virtuous cycle.

3. “We do Agile really, really well.” —Our Customers

Agile prides itself on being agile, per se. (How deliciously meta is that?) Take what you want, leave the rest. As a result, there are infinitely many ways to do agile well—and an equally indeterminate number of ways to do it badly.

Last week, I heard a senior executive at one of our customer sites tell us (in front of a room of twenty people) that we were the gold standard for agile projects at their organization. Enough said.

4. “We care about having a beautiful, functional workspace.” (And it shows.)

We have top-shelf coffee, great snacks and drinks, a loaded kegerator, automatic standup/sitdown desks (each with four presets), Apple Cinema Displays, an office sound system, massive TVs, stylin’ chairs and Fluid Stance boards. If you need anything, within reason, it just shows up at the door.

In addition to that, Nicole Andrijauskas just finished painting an amazing mural spanning the entire south wall of the office.

We have catered lunch-and-learn sessions every other Friday. On the alternating Fridays, we descend in a hungry mob to a local restaurant (like Barbacco, this past Friday) and Lab Zero picks up the tab. In addition to Fridays and the regular bevy of snacks and beverages, there are also bagel Wednesdays, eclairs one day, coffee cake another, etc.

As much as I love our office, I also love my half-time Wednesdays working from home (and/or the beach). Which is totally fine, of course. There’s even been a leftover bagel or two waiting for me on Thursday mornings.

5. “Diversity is woven into the very fabric of our culture.” —Me

The notion of full-time employment does not preclude hiring people who rawk at things besides their profession, but employers don’t explicitly benefit from it either.

At Lab Zero, where life comes first—and turnover is near nil—we’ve built an eclectic mix of developers, designers, writers, agile product owners and bizdev folks who double as parents, recovering chemists, musicians, surfers, teachers, artists, marathoners, photographers, LGBTQ folks, future real estate moguls and one of the world’s leading experts on tiki.

There’s no better testament to Lab Zero’s people than this: I could do my job almost exclusively at home. I could also bill an extra two hours instead of commuting to downtown SF from the North Bay. But I actually want to come to the office.

Ten weeks in. Zero regrets. Can you say this about your job? If not, maybe you should join us for lunch.