The Transformation Priority Premise

I just read the most brilliant new idea that I’ve seen since I learned TDD and Refactoring ten years ago. Naturally, it comes from the brain of Uncle Bob Martin. It’s called “The Transformation Priority Premise.” I’ll describe it briefly here. But, if you want to really understand it, you should go read Uncle Bob’s blog yourself.

Ok… So, what is the Transformation Priority Premise?

If you are familiar with Test Driven Development, you know the phrase “red, green, refactor.” That phrase encapsulates the process of writing tests before you write the code:

  • Red = Write a failing test and watch it fail.
  • Green = Write just enough code to make the test pass.
  • Refactor = Remove any duplication in the code without changing its behavior (all tests should still pass).

And, if you’re familiar with Martin Fowler’s book “Refactoring,” you know that a refactoring is a simple change to the structure of the code that is guaranteed not to change the behavior of the code. These are things like:

  • Introduce Variable – Replaces a constant with a variable.
  • Extract Method – Extracts one or more lines of code from one method into another (new) method.
  • Rename – Renames a variable, method or class.

These changes are so easy to implement that we can actually build tools that do them for us. Visual Studio comes with a handful of built-in refactorings. ReSharper adds significantly to that list.

What Uncle Bob is proposing is that there are similarly simple “Transformations” that change the behavior of a code base. Furthermore, he states that there is an order or priority in which we should apply these transformations – from simple to complex. He proposes the following (admittedly incomplete and probably mis-ordered) list:

  • [] –> nil
  • nil –> constant
  • constant –> constant+
  • constant –> scalar
  • statement –> statements
  • unconditional –> if
  • scalar –> array
  • array –> container
  • statements –> recursion
  • if –> while
  • expression –> function
  • variable –> assignment

Don’t worry if that doesn’t make any sense. I really just blogged it so I’d have a quick reference. But, what’s cool about these transformations is that they can actually help us pick which tests we should write in what order. For example, if we decide to write a test that requires us to add an if statement to our code, that test requires the “unconditional –> if” transformation. Knowing that, we can try to think of other tests that rely solely on transformations higher up the list.
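
To make that concrete, here’s a sketch of my own (not one of Uncle Bob’s examples): test-driving a trivial IsPositive method, applying the cheapest transformation each time. Everything below is hypothetical, invented purely for illustration:

[Test]
public void ZeroIsNotPositive()
{
    // First test. The cheapest way to pass it is nil –> constant:
    //     public static bool IsPositive(int n) { return false; }
    Assert.That(IsPositive(0) == false);
}

[Test]
public void OneIsPositive()
{
    // Second test. The hard-coded constant no longer suffices. Rather than
    // jump down the list to unconditional –> if, we can stay higher up –
    // arguably constant –> scalar – and compute the result from the argument:
    Assert.That(IsPositive(1) == true);
}

public static bool IsPositive(int n)
{
    return n > 0;
}

Each new test demands only the least powerful transformation that will make it pass.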

The ultimate reward is that by writing our tests in transformation priority order, we should never reach a point where one test forces us to rewrite an entire method. Furthermore, because these transformations are all relatively simple in and of themselves, smart tool vendors may be able to build support for them directly into tools like ReSharper.

I’ve probably just confused the heck out of you with all this. Suffice it to say that Uncle Bob’s posts are excellent. They include source code that he uses to show you how not using the priority order causes pain, and how using the priority order resolves that pain.

What are you waiting for? Go read those posts!

(Oh, and for those of you unfamiliar with Uncle Bob’s work, he references something called a Kata several times. The term comes from martial arts. It refers to a movement that is practiced over and over and over until it becomes second nature. Think “wax on, wax off”. Uncle Bob is a big fan of solving the same problems over and over and over to gain new insights into the potential solutions. And, he’s published several Katas ranging from calculating a bowling score to identifying prime factors of an integer. They’re worth your time, too.)

Zero Defects™ – Part 2

As I suspected, my notion of Zero Defects was mildly controversial. I’ll try to clear up some of the questions in this post. But, rather than making another cohesive argument, as I tried to do the first time, I’ll respond to some of the responses I received on the first post. (NOTE: Comments did not come with the posts when I manually imported them to Posterous.)

  • Defects vs. stories is just a semantic layer; they both boil down to work that needs to be done.

Semantics are important. Yes, they’re both just work items. But, a “defect” carries with it a negative connotation, whereas a “user story” does not.

I believe strongly that using the words “defect” or “bug” sets the inappropriate expectation with business partners that all defects/bugs can and will be removed from the software prior to promotion to production. And, I’ve seen firsthand how using the term “defect” led to teams killing themselves trying to fix, before the end of a Sprint, “bugs” that were never called out in test cases. This is a recipe for failure and burnout.

I much prefer to set the expectation that anything that wasn’t explicitly called out by a user story and/or test case in the current Sprint is new work that needs to be prioritized with all other work. The best way to do this is to label that work the same way all other new work is labeled – as a new user story.

  • Defects tell me the QAs are doing their job.

To me, the statement above is akin to saying “Lines of Code tell me my programmers are doing their job.” Defect counts are meaningless. Point me at any modern website or program, and I can probably identify 100 defects in an hour. Most of them will be garbage. But, if I file a defect on them, someone (or more likely, some team) will have to triage those things, beat down the stupid ones and prioritize the rest – probably outside their normal work item prioritization process.

  • My experience is our business partners are not concerned with defects, unless they are not caught and are promoted into production.

My experience differs. I’ve worked with multiple POs, both at my current employer and elsewhere, who tracked defect counts and berated the team over them. In fact, my experience is that this is more common than not.

  • The only business partner of ours looking at Rally is our PO.

Yeah, that seems to be common here. But, there’s nothing preventing you from calling a meeting at the end of each Sprint where you demo the new functionality to a broader community of stakeholders. In fact, Scrum recommends this practice. It’s called a Sprint Review Meeting.

  • The requirements in agile are much less defined than in waterfall.

I disagree 100%. There should be zero theoretical difference in the quality of requirements between agile and waterfall. Agile just prefers to discover and document requirements iteratively, one at a time, whereas waterfall tries to define all requirements up front. In practice, this means that agile methods discover and document requirements closer to the time of implementation. They also allow for new requirements to be discovered by looking at how the software actually works. Waterfall doesn’t provide for that iterative feedback, meaning that the requirements get stale.

  • Defects provide us with a good method to track these issues to resolution. If the issue is new scope... then a user story is a much better alternative.

The problem I see in tracking defects separately from user stories is prioritization. How does the developer know what to work on next? Sure, Rally (and other tools) allow you to see all work in one bucket. And, that helps. But, in my experience, teams end up with silly rules like “defects found during the current sprint will be resolved in the current sprint.” That seems like an innocent enough rule, but what happens when you find more bugs than you can fix?

  • My interpretation of reading Alan's blog was that those items that are actual defects (do not meet defined requirements)... should be a new user story.

Ah, therein lies the misperception. In my original post, I meant to state that every requirement MUST be expressed as test cases on a user story, and that all test cases MUST be passing before I show the work to my business partner(s). Therefore, by definition, the code has met all of the “defined requirements” and can only be deficient if there remain undefined requirements, which should be captured as new test cases on a new user story.

  • Bottom line, I do not think defects are bad...

I disagree. I’ve experienced enough problems from the semantic difference between defect and user story that I no longer want to use the term defect ever again. (It’s not the IT folks that generally misinterpret the word “defect.”)

But, that's just my opinion. What's yours?

Zero Defects™

Last week, a coworker told me that defects do not make developers look bad. Basically, he asserted the following:

  1. Defects are not used to judge developer competence; and,
  2. Defects are simply artifacts for tracking work that needs to be done.

I agree with these points. However, experience teaches me that defects can and do make teams look bad. I’ve seen defects become a point of contention between a development team and their business partner(s). And, once a rift forms between the two, it is very difficult to heal.

So, how can we simultaneously encourage defect creation (to track work that needs to be done) without creating a rift between the development team and their business partners? My preferred approach is to create Zero Defects. Here’s how:

  1. Manage your work with a backlog of user stories.
  2. Create test cases for each user story before you begin development. Review the test cases with your business partner as well as QA to make sure you didn’t miss any. (This can happen iteratively. There’s no need to create test cases for stories you’re not going to work on right away.)
  3. As you develop a story, automate the test cases to prove that the software does what the tests say it should. When all the tests pass, the code is complete. (If possible, the developer who writes the code should not be the developer/tester who automates the tests. And, it’s very helpful to automate the tests before the code exists.) A sketch of one such automated test appears just after this list.
  4. When you finish a story (and the associated tests), review both the software and the automated tests with your business partner to make sure everything is as they wanted. If your business partner wants to make changes at that point – either new requests, or things we might typically call defects – write them up as new user stories and prioritize them in the backlog.

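To illustrate step 3, here’s a minimal sketch of one automated test case, borrowing the User.Login example that shows up later in these posts (the API is assumed, not real):

[Test]
public void LoginFailsForUnknownUser()
{
    // Test case agreed upon with the business partner:
    // an unknown username must not be able to log in.
    bool loginSucceeded = User.Login("unknown", "password");

    Assert.That(loginSucceeded == false);
}
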
That’s all there is to it. You, the developer/tester, get to take credit in the current Sprint/iteration for the work you completed. And, your business partner gets to manage a single backlog with all the outstanding work in it. Plus, no one has to fight over what’s a defect and what’s a user story. It’s a win/win+!

Now, I realize that this might be considered controversial. So, I’ll explain my thought process in a future post. Feel free to tell me how this’ll never work in the comments!

Lean Software Development

In my previous life, as a developer on the Microsoft Visual Studio Team System Process Team, I helped design workflows for user stories and other types of work items. These workflows were to be integrated into the process as a status field on the work item. At one point, we identified a workflow that boiled down to something like this:

  • New
  • Grooming
  • Groomed
  • Developing
  • Developed
  • Testing
  • Tested
  • Deploying
  • Deployed
  • Accepting
  • Accepted

At the time, we felt that it was important to make a distinction between items that were actively being worked and those that were waiting for the next stage of the process. In the end, we simplified the status field by dropping all the active verbs (those ending in “ing”), in favor of assuming that all work in the current sprint was active.

Part of me likes that decision – it certainly simplified the workflow. But, part of me, to this day, bemoans that decision because it limits a team’s ability to identify waste within the process that needs to be eliminated.

Imagine a process for building widgets. In this process, each widget starts life as raw materials. These materials get flobbled, marned, and volmed before becoming actual widgets. As materials move through the process, sometimes there are too many to marn all of them at once. So, those “in-process” materials sit and wait for their turn at the marner. Similar delays may also happen at the flobbling and volming workstations. And, of course, at any time, one of the machines could break, causing downstream process points to run out of work.

This is a typical manufacturing process. And, it’s just like the software process described above.

In Lean manufacturing (a process improvement method invented by Toyota), the time a piece of work has to wait for a machine is considered waste. (It is a waste of money to buy more materials than the process can handle.) Likewise, the time a machine has to sit idle waiting for the next piece of work is also considered waste. (Idle infrastructure is a waste of the investment incurred to obtain and operate said infrastructure.) Lean manufacturing seeks to eliminate all of that waste. Out of this movement have come innovations like just-in-time inventory.

Likewise, in Lean software development, time spent in the inactive states (those verbs ending in “ed”) is considered waste to be eliminated. In other words, why groom 20 stories when the developers can only work on five at a time? The Product Owner may never prioritize some of the other 15 groomed stories high enough for the team to work on them, rendering moot the effort spent grooming those abandoned stories. Likewise, why continue developing code when there are broken tests? Until the tests are fixed, nothing can move forward in the process.
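
Ironically, it’s those inactive “ed” states that make the waste measurable. Here’s a hypothetical sketch (the types and statuses are invented for illustration):

public class StatusChange
{
    public string Status;   // e.g. "Groomed", "Developing", "Developed"
    public DateTime At;     // when the work item entered this status
}

// Total time a work item spent parked in a given inactive status –
// pure queue time, and a prime candidate for elimination.
public static TimeSpan TimeSpentWaiting(IList<StatusChange> history, string inactiveStatus)
{
    TimeSpan waiting = TimeSpan.Zero;
    for (int i = 0; i < history.Count - 1; i++)
    {
        if (history[i].Status == inactiveStatus)
            waiting += history[i + 1].At - history[i].At;
    }
    return waiting;
}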

Scrum lends itself to Lean software development through its iterative, incremental approach to building software. So, the team only needs to groom enough work for the next sprint, for example, reducing the potential for time to be spent grooming stories that will end up being abandoned.

The metaphor breaks down a bit, though, when you run into a particularly urgent but poorly understood request from the business. It may take so much time to groom that development stalls, waiting for well-understood requirements. This is wasteful in that the developers and testers (the infrastructure of software development) are idled. Extreme Programming’s approach to this situation would be to punt the poorly understood requirement until such time as it can be adequately articulated. Scrum is less, um, confrontational, but no less dependent upon adequately articulated requirements.

As with everything else on this blog, I just needed to get this out of my head. I hope you found it interesting and/or useful. But, if not, at least my brain can now move on to something else.

The Matrix Reloaded

I wrote this post quite a long time ago – right on the heels of my original test matrix posts. Why I never posted it is beyond me. I’m posting it now to get it out of my “drafts.”

---

A few posts back, I discussed The Marick Test Matrix and my minor modifications to the matrix. In those posts, I described how to classify different types of testing into the four quadrants of the matrix. It turns out that you can also use the same matrix to classify testing tools, like this:

[Image: the test matrix, with testing tools classified into its four quadrants]

Let’s look at each quadrant in more detail, starting on the right-hand side:

Business/Defects

This quadrant represents those types of tests that identify defects in business (or non-technical) terms. In other words, you don’t need to be a programmer to figure out that there is a defect.

Typically, these tests are not automated. So, there are no automation tools to discuss here.

Technology/Defects

This quadrant represents those types of tests that identify defects in technical terms. In other words, you probably need to be a programmer to figure out that there is a defect. Therefore, one would expect the tools in this quadrant to be highly technical and to require specialized skills. In fact, there are people who specialize in this work. They are typically called testers; but, their knowledge of programming is often greater than that of the average developer.

The dominant tool in this space is Mercury LoadRunner. Microsoft also has tools in this space, including the Visual Studio Team Test Load Agent and the Microsoft Web Application Stress tool (MS WAS).
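
I won’t attempt LoadRunner syntax here, but a toy C# probe conveys the flavor of this quadrant – the test is technical, and so is the failure. The URL, the concurrency level, and the two-second threshold are all made up:

Parallel.For(0, 50, i =>
{
    var timer = Stopwatch.StartNew();
    using (var response = WebRequest.Create("http://myapp/login.aspx").GetResponse())
    {
        timer.Stop();
    }

    // A business partner won’t spot this defect by clicking around; it
    // only shows up under concurrency, measured in milliseconds.
    if (timer.ElapsedMilliseconds > 2000)
        Console.WriteLine("Login page took {0} ms under load", timer.ElapsedMilliseconds);
});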

Business/Requirements

This quadrant represents those types of tests that define requirements in business terms. As such, you would expect the tools in this category to be (relatively) non-technical. Business Analysts and end users should be able to use these tools to create automated tests without knowledge of computer programming. In fact, these tests should be written by those team members with the most business expertise.

FIT, FitNesse, STiQ, WebTest and Selenium are all examples of tools that allow tests to be expressed in business terms. All of these tools are well suited to use by Business Analysts.
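
For a taste of what “business terms” means, here’s a hypothetical FIT-style example, sketched from memory against the .NET port of FIT (so treat the details as illustrative). A Business Analyst writes a decision table something like:

| LoginRules |
| username | password | login succeeds? |
| unknown  | password | false           |
| alan     | secret   | true            |

and a developer backs it with a small fixture class (User.Login is the same assumed API used elsewhere in these posts):

public class LoginRules : fit.ColumnFixture
{
    public string username;     // bound to the "username" column
    public string password;     // bound to the "password" column

    public bool loginSucceeds() // bound to the "login succeeds?" column
    {
        return User.Login(username, password);
    }
}

The table is the test; the Business Analyst never has to see the C#.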

Technology/Requirements

The testing that takes place in this quadrant defines requirements in technical terms. Therefore, you would expect to see lots of low-level, code-based tools here. These tools are generally used by computer programmers (e.g. developers and testers).

JUnit and NUnit are the big dogs in this space. Other tools include MSTest, WatiN, MBUnit, xUnit, RSpec (for Ruby), and NUnitASP.

Votebox

As much as I love Rally for managing work, I think I may have found something better. Votebox is where Dropbox users can request features and vote for those they’d most like to see. What I like about it is how simple it is:

[Image: screenshot of Votebox showing the three most popular Dropbox feature requests]

Here you see the three most popular feature requests on their site. Each feature request contains a title, a short description, a category (not shown) and some metadata (in red). Users may vote for and comment on feature requests (in blue). And, Dropbox can update the status of a specific request (in green).

This is the simplest, most elegant site I’ve ever seen for managing a backlog of work. It simultaneously a) houses the backlog, b) tracks feedback, c) gauges interest in competing priorities, d) communicates progress, and e) manages expectations. It’s like a suggestion box that worked out really hard, but never took any steroids. It has all the muscle it needs, without all the bulging veins and other side effects (er, features) of larger sites like Rally.

The only thing I’d be hesitant to do with a tool like this is turn over product decisions to the crowd. What makes Votebox work for Dropbox is that Dropbox has stayed true to their original product vision – a simple, 100% reliable way to back up files to the Internet and sync them across multiple computers. Feature requests outside that vision may be popular, but they would dilute the brand, causing more harm than good to the product.

Rather, I see Votebox as a tool to help talented Product Owners with strong visions for their products interact with their audiences.

Coding Standards

One of the original practices of Extreme Programming was coding standards. Back in 2000, when I read my first XP book, I tensed up a bit at the thought of having to make my code look like everyone else’s. But, over the years, I’ve come to see the wisdom of it: once all the code follows the same standard, it becomes much easier to read.

In fact, nowadays, I often reformat a block of code to meet a coding standard before I try to figure out what it’s doing. (But, only when I have strong unit tests!) Long story short, I haven’t had a good knock-down-drag-out argument about curly braces in nearly ten years! (Anyone want to have one for old time’s sake?)

With that in mind, I needed to look up Microsoft’s latest guidelines for .NET class libraries today. Specifically, I knew they said something about using abbreviations and acronyms when naming things. But, I couldn’t remember exactly what the recommendation was. So, I looked it up. Turns out, Microsoft has done a rather nice job of structuring their design guidance on the web.
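
For the curious, the acronym guidance boils down to something like this (from memory – verify against the link below):

public class XmlDocument { }   // acronyms of three or more letters: capitalize only the first letter
public class IOException { }   // two-letter acronyms: capitalize both letters
public void Parse(string htmlText) { }   // in camelCase identifiers, a leading acronym goes all lowercase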

It’s not something you’ll need to refer to every day. But, it’s good to know it’s out there…

http://msdn.microsoft.com/en-us/library/czefa0ke(v=VS.71).aspx

An Agile Approach to Mission Control

NASA’s Cassini spacecraft has been studying Saturn, its rings, and its moons for nearly six years, despite the original mission being slated to expire after only four years. But, due to better than expected performance and lower than expected fuel consumption, NASA has extended the mission for an additional seven years, to 2017.

In order to negotiate the flight path for the next seven years, the engineers in charge of planning the orbital maneuvers consulted with the five science teams affiliated with the project. Each team is assigned to study different things: Saturn, Titan (Saturn’s largest moon), the rings, the icy satellites, and the magnetosphere. Each team presented their wish list for the places they’d like to see over the next seven years, and the engineers got busy running the numbers:

The first time [the engineers] met with the discipline teams, they offered three possible tours. The next time, they offered two, and, in January 2009, the scientists picked one of them. Last July, after six months of tweaking by [the engineers], the final “reference trajectory” was delivered. It now includes 56 passes over Titan, 155 orbits of Saturn in different inclinations, 12 flybys of Enceladus, 5 flybys of other large moons — and final destruction.*

In essence, this team of engineers had to balance the wishes of the five research teams with the remaining fuel and gravity boosts available. The approach they took was to present alternatives and iterate on them until they found the best solution, given the requirements and the constraints:

“It’s not like any problem set you get in college, because you have so many factors pulling in different directions,” Mr. Seal said. “The best way to measure it is to look at how much better the next iteration is than the previous one” until “you’re only making slight improvements.” Then you stop.*

I can’t think of a better way to describe the iterative development process espoused by most agile software development methodologies, including Scrum and XP. You know when to stop when the remaining improvements are no longer worth the investment.
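
In code, the stopping rule Mr. Seal describes might look something like this toy sketch, where InitialPlan, Improve, Evaluate, and the threshold are all stand-ins:

var current = InitialPlan();           // hypothetical starting trajectory
double score = Evaluate(current);      // hypothetical scoring function
const double Threshold = 0.01;         // what counts as a “slight” improvement

while (true)
{
    var candidate = Improve(current);  // propose the next iteration
    double newScore = Evaluate(candidate);

    if (newScore - score < Threshold)  // only making slight improvements...
        break;                         // ...then you stop

    current = candidate;
    score = newScore;
}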

I wonder how many of our customers would like to see three options, then two, then one…

* http://www.nytimes.com/2010/04/20/science/space/20cassini.html

On Clarity and Abstraction in Functional Tests

Consider the following tests:

[Test]
public void LoginFailsForUnknownUser1()
{
    string username = "unknown";
    string password = "password";

    bool loginSucceeded = User.Login(username, password);

    Assert.That(loginSucceeded == false);
}

[Test]
public void LoginFailsWithUnknownUser2()
{
    using (var browser = new IE(url))
    {
        browser.TextField(Find.ById(new Regex("UserName"))).Value = "unknown";
        browser.TextField(Find.ById(new Regex("Password"))).Value = "password";

        browser.Button(Find.ById(new Regex("LoginButton"))).Click();
        bool loginSucceeded = browser.Url.Split('?')[0].EndsWith("index.aspx");

        Assert.That(loginSucceeded == false);
    }
}

Note the similarities:

  • Both methods test the same underlying functional code.
  • Both tests are written in NUnit.
  • Both tests use the Arrange / Act / Assert structure.

Note the differences:

  • The first is a unit test for a method on a class.
  • The second is a functional test that tests an interaction with a web page.
  • The first is clear. The second is, um, not.

Abstracting away the browser interaction

So, what’s the problem? Aren’t all browser tests going to have to use code to automate the browser?

Well, yes. But, why must that code be so in our face? How might we express the true intention of the test without clouding it in all the arcane incantations required to automate the browser?

WatiN Page Classes

The folks behind WatiN answered that question with something called a Page class. Basically, you hide all the browser.TextField(…) goo inside a class that represents a single page on the web site. Rewriting the second test using the Page class concept results in this code:

[Test]
public void LoginFailsWithUnknownUser3()
{
    using (var browser = new IE(url))
    {
        browser.Page<LoginPage>().UserName.Value = "unknown";
        browser.Page<LoginPage>().Password.Value = "password";

        browser.Page<LoginPage>().LoginButton.Click();
        bool loginSucceeded = browser.Page<IndexPage>().IsCurrentPage;

        Assert.That(loginSucceeded == false);
    }
}

public class LoginPage : Page
{
    public TextField UserName
    {
        get { return Document.TextField(Find.ById(new Regex("UserName"))); }
    }
    public TextField Password
    {
        get { return Document.TextField(Find.ById(new Regex("Password"))); }
    }
    public Button LoginButton
    {
        get { return Document.Button(Find.ById(new Regex("LoginButton"))); }
    }
}

Better? Yes. Now, most of the WatiN magic is tucked away in the LoginPage class. And, you can begin to make out the intention of the test. It’s there at the right-hand side of the statements.

But, to me, the Page Class approach falls short. This test still reads more like its primary goal is to automate the browser, not to automate the underlying system. Plus, the reader of this test needs to understand generics in order to fully grasp what the test is doing.

Static Page Classes

An alternative approach I’ve used in the past is to create my own static classes to represent the pages in my web site. It looks like this:

[Test]
public void LoginFailsWithUnknownUser4()
{
    using (var browser = new IE(url))
    {
        LoginPage.UserName(browser).Value = "unknown";
        LoginPage.Password(browser).Value = "password";

        LoginPage.LoginButton(browser).Click();
        bool loginSucceeded = IndexPage.IsCurrentPage(browser);

        Assert.That(loginSucceeded == false);
    }
}

public static class LoginPage
{
    public static TextField UserName(Browser browser)
    {
        return browser.TextField(Find.ById(new Regex("UserName")));
    }
    public static TextField Password(Browser browser)
    {
        return browser.TextField(Find.ById(new Regex("Password")));
    }
    public static Button LoginButton(Browser browser)
    {
        return browser.Button(Find.ById(new Regex("LoginButton")));
    }
}
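
For completeness, here’s what the IndexPage helper used above might look like. The original tests imply it but don’t show it, so this is my reconstruction, based on the URL check in the very first browser test:

public static class IndexPage
{
    public static bool IsCurrentPage(Browser browser)
    {
        return browser.Url.Split('?')[0].EndsWith("index.aspx");
    }
}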

This is the closest I have come to revealing the intention behind the functional test without clouding it in all the arcane incantations necessary to animate a web browser. Yes, there are still references to the browser. But, at least now the intention behind the test can be inferred by reading each line from left to right. Furthermore, most of the references to the browser are now parenthetical, which our eyes are accustomed to skipping.

What do you think?

I’d like to know what you think. Are your functional tests as clear as they could be? If so, how’d you do it? If not, do you think this approach might be helpful? Drop me a line!