The Quest for Better Tests

Over the past twenty years, I’ve written my fair share of unit tests, mostly just covering the happy path and sending in some bogus inputs to test the edges. Typically following a fat-model-thin-controller method (often recommended by me), I failed to understand the point of integration tests. I tried TDD at the beginning of several greenfield projects, but I was never successful in making it sustainable. Similarly, with Selenium, it worked at first but quickly proved to be too brittle to keep up with rapidly changing UIs. (In retrospect, bad CSS architecture on those projects probably deserved the blame more than Selenium per se.)

Despite my somewhat lackluster attitude toward testing, my employers and customers knew me as a big advocate for test automation, someone who always insisted that we never release anything to QA without at least a layer of unit tests. Oftentimes I was overruled by more senior leadership. As expected, from time to time, we all got burned by bugs that would have been easily caught by more comprehensive tests. We swore we’d write better tests next time. But for a million reasons (“speed to market” being the peskiest of them), testing never became a consistent priority.

In February of 2016, I joined Lab Zero. The very first observation I made after starting on a project—a financial services application three years in the making—was the sheer volume of test code. Nearly everywhere I looked, I found at least a 10:1 ratio of lines of test code to lines of “real” code. Shortly after starting on my first story, it became readily apparent that at least a 10:1 ratio of developer effort was required to continue this pattern. We joked about developers who reported their status during daily standup by saying, “I’m done with the code and just need to write some tests,” because we knew that was a euphemism for being less than 10% done with the story!

It didn’t take long to realize how much catching up I needed to do. In fact, the project leader told me it would take me “a year” to learn how to test properly. After first thinking that he sounded condescending, I came to realize that he was just being realistic. Testing is hard; testing effectively is even harder.

Ten months into my Test Quest, here are some important lessons I’ve picked up about automated testing.

Note: I used Ruby, RSpec and Cucumber to create my code samples, but the lessons learned will likely apply to other ecosystems.

The myth of 100% code coverage

Sure, code coverage is an important metric, but it only tells part of the story. Test coverage is not the same as good test coverage. It’s remarkably easy to write tests that test nothing at all, that test the wrong things, or that test the right things but in ways that never fail.

Consider the following example, wherein the remove_employee method has a glaring error, one that will easily be caught by a unit test. Or will it?
require 'set'

class Company
  def initialize
    @employees = Set.new
  end

  # Adds a person and returns the new head count.
  def add_employee(person)
    @employees << person
    @employees.size
  end

  # Returns the expected head count but never touches the set.
  def remove_employee(person)
    @employees.size - 1 # danger: incorrect implementation!
  end
end

RSpec.describe Company, :type => :model do
  subject { Company.new }

  describe 'managing employees' do
    let(:person) { double('person') }

    it 'removes an employee' do
      employee_count = subject.add_employee(person)
      expect(subject.remove_employee(person)).to eq(employee_count - 1)
    end
  end
end

Because the test for removing employees naively compares only the outputs of the add and remove methods, it passes with flying colors even though the remove_employee method internals are totally wrong.

And this is why it’s a good idea to…

Test internals instead of just inputs and outputs

In most—if not all—programming languages, there are many more ways to produce “outputs” than just the return values of method calls.

C/C++ developers can optionally pass primitives to functions by reference (e.g. int &param1), morphing those inputs into potential outputs. More modern languages restrict everything to pass-by-value, but most of the time what’s being passed “by value” is actually a reference to an instance of an object. As a result, it’s possible—and quite commonplace—to mutate the object instance itself in the context of a method, providing another sneaky way for methods to have unexpected “outputs.”
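To make that concrete, here is a minimal sketch built on the Company example above. The employee? query method is my own hypothetical addition (it is not part of the original class), and the checks are written in plain Ruby rather than RSpec so the snippet runs on its own.

```ruby
require 'set'

# Same shape as the Company class above, plus a hypothetical
# employee? query so a test can observe the internal state.
class Company
  def initialize
    @employees = Set.new
  end

  def add_employee(person)
    @employees << person
    @employees.size
  end

  def remove_employee(person)
    @employees.size - 1 # still the incorrect implementation
  end

  def employee?(person)
    @employees.include?(person)
  end
end

company = Company.new
count = company.add_employee(:alice)

# The return value looks right...
return_value_ok = company.remove_employee(:alice) == count - 1

# ...but the state shows that nobody was actually removed.
still_employed = company.employee?(:alice)

puts "return value ok: #{return_value_ok}, still employed: #{still_employed}"
```

A test that asserts on still_employed being false would fail here, which is exactly the failure the original spec never produced.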

Unfortunately, testing internals can be challenging, but it doesn’t have to be.

Design and write testable code

A previous version of me believed that only a very limited set of circumstances should trump writing elegant code. I recently relaxed this constraint, adopting the belief that it’s okay to over-decompose code (and make other code design compromises) in order to serve the goal of writing code that’s more testable.

For example, I might replace a simple, elegant inline check with a method that wraps it, e.g.:

shape.color == :blue

vs.

def is_blue?(shape)
  shape.color == :blue
end

In the past, code like this would make my eyes bleed. However, it’s really easy now to stub out is_blue? so that it returns a mock object or performs some other test-only behavior.

This is a contrived example, but if figuring out if a shape is blue required a database read or a call to an underlying service object, then over-decomposition like this is a small price to pay to make the code testable.
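Here is a rough sketch of what that buys you. ShapeReport, its label method and the color_service collaborator are all hypothetical names I made up for illustration; the stub uses plain Ruby’s define_singleton_method so the snippet runs without any test framework.

```ruby
# Hypothetical class: is_blue? is the only method that talks to the
# (possibly slow) color service, so it is the only thing to stub.
class ShapeReport
  def initialize(color_service)
    @color_service = color_service
  end

  def is_blue?(shape)
    @color_service.color_of(shape) == :blue
  end

  def label(shape)
    is_blue?(shape) ? 'blue shape' : 'other shape'
  end
end

# No real service is needed: the wrapper is replaced on this one instance.
report = ShapeReport.new(nil)
report.define_singleton_method(:is_blue?) { |_shape| true }

label = report.label(:any_shape)
puts label
```

In an RSpec suite the same stub would typically be written as allow(report).to receive(:is_blue?).and_return(true).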

Test incrementally

I’ve found TDD (specifically a test-first methodology) to be overly prescriptive, usually leading to diminishing returns as the project gets more complex. If it helps clarify the specs and define edges more easily, then by all means, write tests first! However, I’ve found more productivity (and less head-scratching) comes from writing tests not necessarily first, but in short iterative bursts.

Every time I finish an “idea” in code (for lack of a better term), I switch over and edit the test, usually already open in a split-screen view next to the code. If the “idea” is too complex, I take a step back and flesh out more tests to help me clarify what I’m trying to accomplish in the code.

In the past I’ve also worked in a pairing setup where I wrote the code and switched back-and-forth with another developer writing tests. Though I haven’t done this recently, it’s another technique that’s worked well for me.

DRY code, wet lets

Don’t Repeat Yourself (DRY) is a great rule of thumb for writing code, but it can be disastrous when memoizing test data, e.g. through calls to RSpec’s let or let! helpers.

With the exception of some truly global concepts (e.g. user_id), all test data should be initialized in close proximity to (read: immediately before) the tests that use it and should not be reused between unrelated tests.

Thinking I was helping, I tried to DRY up some lets, soon after realizing that I had no idea what test data was getting passed to what tests. Even if it feels cumbersome to repeatedly initialize the same data over and over before each test, it’s the right thing to do.

Re-use Cucumbers with Scenario Outlines

Unlike lets, some parts of the test ecosystem are actually designed for reuse. One example: Scenario Outlines. I recommend using these whenever possible.

With Cucumber, Scenario Outlines represent the “functions” in an otherwise functionless DSL. In addition to the obvious reduction in code bulk, thinking about how I can turn several tests into one test “template” helps me write more thoughtful, self-documenting tests.

Vary only what needs to be varied

It’s tempting to cut corners (and make tests run more efficiently) by favoring randomizing test data over creating different tests for different values. Often this practice is harmless, especially if the specific values—as long as they’re in range, e.g. a person’s age—are inconsequential. (If specific values matter, e.g. people 65 and over get medical benefits, they should of course get their own explicit tests.)

Randomizing test data can also be a trap. For example, a test for a get_birth_year method might start to “flicker” or “flap,” meaning that it passes and fails non-deterministically between test runs—all because of the decision to randomize ages.

To protect against this, it helps to treat each test as a controlled experiment, i.e. by keeping the scientific method in mind. Try to control everything that can be controlled and vary only the specific inputs getting tested. Of course, there are things we can’t control, like the system clock, the speed of the network and the availability and behavior of upstream systems. But whenever things can be controlled, control them.
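As a small sketch of the “controlled experiment” idea, take the benefits rule mentioned above. The medical_benefits? method and the BENEFITS_AGE constant are hypothetical stand-ins, not code from a real project. Instead of drawing a random age and hoping it lands near the boundary, the inputs are pinned to the values on either side of it.

```ruby
# Hypothetical rule: people 65 and over get medical benefits.
BENEFITS_AGE = 65

def medical_benefits?(age)
  age >= BENEFITS_AGE
end

# Uncontrolled: a random age only occasionally exercises the boundary,
# so an off-by-one bug could pass on most runs and fail on a few.
#   age = rand(18..90)

# Controlled: each input is chosen on purpose, with its expected outcome.
cases = { 64 => false, 65 => true, 66 => true }

results = cases.map do |age, expected|
  [age, medical_benefits?(age) == expected]
end.to_h

puts results.inspect
```

Each entry in cases reads like one experiment: a single input that varies, and one expected result.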

Write meaningful, descriptive test names

Acknowledging the fact that I just recommended thinking like a scientist, I’m now going to suggest putting on a writer hat. When naming test cases and writing Cucumber steps (which read like prose already), it’s super-important to be descriptive, concise and accurate.

In a place full of smart people like Lab Zero (#humblebrag), developers are not necessarily the only people looking at tests. Recently I had an agile product owner ask me how a certain feature handled different types of inputs. To answer the question, I walked him through my rspecs, reading each test name aloud and describing the expectations.

Writing coaches always say “show, don’t tell.” There is simply no better way to show—and prove—that a feature works than reading through the tests, which serve as the closest link between the specs and the code.

Putting the Science in “Computer Science”

One of my professors in college said that any discipline that has the word “science” in it is actually not a science. This is especially true for computer science, which some schools classify as a fine art (making it possible to get a BA in CS). Writing code is certainly a form of communication, at least to peers and future developers. Of course, they are not the customers. And the best way to “communicate” with customers is to provide something for them that works as designed.

How do we ensure that? With well-written tests.

Tests really put the science in computer science. Think of them as a series of carefully controlled experiments. The hypothesis is that the code implements the spec.

Without tests, there’s really no way to know if it does or not.

* * *

Originally published on Lab Zero’s blog.

Why We Shouldn’t Compare Vault 7 to Snowden’s Leaks

For seven years I worked as a government contractor developing software for CIA. Although I was not briefed into as many compartments as a systems administrator like Snowden, I held a TS/SCI clearance and had the same ability to access classified information as any “govie,” just with a different color badge.

Also unlike Snowden, I didn’t knowingly compromise any classified material. That being said, what Snowden did is ultimately good for civil liberties in this country. Moreover, the courage and bravery of his actions make him a true patriot, an American hero and the mother of all whistleblowers.

This is simply not the case for the anonymous leaker(s) behind Vault 7.

The reason for this lies not in the specific methods of cyberwarfare that were leaked today, but rather in who was targeted and by whom. In other words, CIA using cyber attacks against foreign nations is very different from NSA violating American citizens’ 4th Amendment rights with wholesale data collection from wireless carriers.

Spying on Americans is simply not in CIA’s charter. We have plenty of ways to fuck with Americans: NSA, FBI, DOJ, IRS, state and local police, meter maids and a million other authorities. But unless you’re communicating with ISIS, CIA couldn’t care less about what’s happening in your living room.

What CIA does care about is gathering intelligence around the world to keep Americans safe at home and abroad. Of course there are boundaries. Sometimes those boundaries get crossed. Cyber attacks, however, do not violate the Geneva Conventions or any other rules of engagement. It’s 2017, ffs. If our country weren’t exploiting hostile nations’ computer networks and systems, I would be disappointed in us. If Alan Turing hadn’t “hacked” the Enigma code during WWII, this post would probably be written in German.

There are two big arguments against this, two reasons why people are saying this release of information is good for America and her freedoms.

The first argument is that CIA did us a disservice by not sharing these exploits with the private sector, thereby leaving the doors open for bad guys.

That is true, but only in part. Hackers would need to independently find these same vulnerabilities and find ways to exploit them. It’s not like they’re gonna call CIA’s helpdesk for virus installation instructions. Furthermore, we in the open source community have a long history of whitehat hacking, the process of finding and reporting vulnerabilities back to vendors to make the digital world more safe and secure.

The second (and related) argument is that viruses and other malware could fall into the wrong hands. This is also true, just like it’s true for assault weapons, hard drugs and prostitution. They’re all illegal af, yet the bad guys still have ways to get them. This doesn’t mean we should stop cyber espionage, any more than it means we should stop making military assault rifles. Like with all our spying activities—and with spying activities in general—we should just do a better job covering them up, in much the same way we protect the real identities of (human) assets in the field.

In sharp contrast with what Snowden did, this release will have a net negative impact on our intelligence-gathering capabilities, weakening our ability to engage with potentially dangerous foreign powers.

Perhaps the worst part of this disclosure is that it further undermines CIA and erodes confidence in the intelligence community, already under fire from the so-called Trump Administration. It also comes, conveniently, just after Trump claimed he was inappropriately wiretapped.

Technically, this leak has no bearing upon wiretapping, but it’s safe to assume that Trump will take this as an opportunity to further belittle CIA and the intelligence claims about Russian interference in the election.

We will probably never know, but I strongly suspect a Russian source provided some if not all of these leaked materials. Let’s not forget: even though Snowden lives in exile in Russia, he’s as American as apple pie.

Good on You, Good Eggs

Ordering is a piece of cake using Good Eggs’ responsive web site or iOS app

Even the most saintly among us have experienced schadenfreude, the act of taking pleasure in someone else’s misfortune. More often than not, however, I find myself seeking a way to empathize with someone’s achievements.

Unfortunately, the American English lexicon falls short in this capacity. We’re left only with the phrase “Good for you,” which is as likely to carry authenticity as it is sarcasm, envy or ridicule.

To properly express myself under these circumstances, I must turn to British English and their lovely idiom “Good on you,” which leaves little room for misinterpretation.

This foray into the subtleties of English dialects might seem silly and off-topic, but I assure you it’s the only way I can possibly reflect my feelings about this matter, namely: There is quite literally nothing that isn’t good about Good Eggs, the online grocer that has returned to my daughter’s elementary school for a second joint fundraiser.

As they did in the fall, Good Eggs plans to offer, for a limited time, 10% of gross sales back to participating Bay Area schools. At Hidden Valley in Marin County’s quaint town of San Anselmo, those funds go directly to the school garden. To participate, just sign up and use the code HIDDENVALLEY at checkout. As an added bonus, Good Eggs will also apply a credit of $15 at the outset—and another $15 for customers who place orders before March 15th.

Good on you, Good Eggs. And good on all of us who participate in this amazing program that benefits local farmers/producers and local schools while putting great food on the table with unparalleled convenience.

Good Eggs offers same-day grocery delivery (for orders placed by 1pm) or next-day delivery (for orders placed by midnight). They have a web site and an iOS app that make ordering a breeze. Their extensive catalog of products makes it possible for them to be the sole source of groceries for even the most discerning families of foodies.

A Good Egg carefully inspects some dino kale before packing

I recently had the pleasure of touring the Good Eggs facility in San Francisco. While soothing music played through the warehouse PA, I marveled at the discipline applied to each food product from the four different temperature zones as it gets hand-inspected before packing. They reject any item with even the slightest imperfection and relegate it to the Good Eggs kitchen, where master chefs repurpose it into lunch for fellow staff members. This virtuous cycle results in food waste numbers of about 4%, besting most grocery stores by a factor of ten, according to my host.

Their packaging department demonstrates a comparable concern for Mother Earth by using compostable, reusable and recyclable packaging wherever possible. Customers can leave their packing materials at their door; when the next delivery comes around, they’ll get retrieved and repurposed.

Master chefs at work in the Good Eggs kitchen

As I was treated to a revitalizing turmeric, ginger and almond milk “tea” from the Good Eggs kitchen, I learned how they intend to enter the market for school lunches and pre-packaged meals with minimal preparation and that they plan to start selling alcohol in the near future.

Good Eggs offers pricing similar to a high-end grocer like Whole Foods with free delivery for orders over $60. They also carry speciality items like Tartine bread and Bi-Rite ice cream, for which they charge a premium.

Small price to pay for not having to queue up for two hours for a loaf of bread or a scoop of ice cream.

There I go with my British English again.