Roles on Agile Teams

Elisabeth Hendrickson, in Testing is a Whole Team Activity:

Testing is an activity. Testers happen to be really good at it. We need testers on Agile teams. But, if we want real agility, we need to see that completing testing as part of the sprint is the responsibility of the whole team, not just the testers.

To which my brain immediately responded:

Programming is an activity. Programmers happen to be really good at it. We need programmers on Agile teams. But, if we want real agility, we need to see that completing programming as part of the sprint is the responsibility of the whole team, not just the programmers.

Thing is, you could put any traditional software development role (e.g. “planner”, “builder”, etc.) into that paragraph and it would work. Try it. It really works.

My approach to agile testing...

I’ve talked about agile testing before, here, here and here. But, a recent thread on the Alt.Net Seattle Google Group got me thinking about it again. Here’s the response I sent to the thread:

Testing is a huge domain. If you’re familiar with Marick’s testing quadrant, you know that there are four basic areas that testing covers:

  • Business Facing tests in Support of Programming (Business Requirements testing – Does the code do what it should?)
  • Business Facing tests to Critique the Product (Business Defect testing – Does the code do something it shouldn’t? Are there missing requirements?)
  • Technology Facing tests in Support of Programming (Technical Requirement testing – Does this method do what the developer intended?)
  • Technology Facing tests to Critique the Product (Technical defect testing – Are there leaks? Can it handle a load? Is it fast enough?)

Typically, testers focus on the business facing tests. And, people with specialized technical skills focus on the technology facing tests. (Developers on the support programming side; Performance testers on the critique product side.)

None of these tests can be run before the software is written. But, the tests in support of programming can be written before the code. And, metrics for perf/load/stress can be defined before the code is written. I recommend doing all of that (unless perf/load/stress isn’t important to you). Obviously, exploratory testing is something that has to wait for the code to be written.

If I were designing an agile team from scratch, I would propose the following approach:

During planning:

  • Track requirements as user stories.
  • Document acceptance criteria with each story, including perf/load/stress criteria (on the back of the 3x5 card, in Rally or TFS, etc.)

During an iteration:

  • One pair works on one story at a time.
  • Acceptance tests are automated first, based on acceptance criteria (see the sketch after this list).
  • Code is written using TDD.
  • Story is not functionally complete until all acceptance tests are passing (for the right reasons – no hard-coded answers left).
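
To make the “acceptance tests are automated first” step concrete, here is a minimal sketch of what such a test might look like before any production code exists. The Account class and its members are hypothetical names invented for illustration; the test is derived directly from a story’s acceptance criteria:

[Test]
public void WithdrawingMoreThanTheBalanceExtendsALoanForTheDifference()
{
    // Acceptance criterion from the story card (hypothetical example):
    // "If you withdraw more money than you have in your account, the
    // system should automatically extend you a loan for the difference."
    var account = new Account(100);

    account.Withdraw(150);

    Assert.AreEqual(0, account.Balance);
    Assert.AreEqual(50, account.LoanBalance);
}

Written this way, the test fails until the story is implemented, and it becomes part of the regression suite once it passes.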

After story is functionally complete:

  • The original pair reuses the existing acceptance tests as the basis for perf/load/stress tests to determine whether those criteria are met.
  • Tweak the code as necessary to meet the perf/load/stress acceptance criteria.
  • Story is not perf/load/stress complete until all perf/load/stress acceptance tests are passing.

Exploratory testing should happen outside the constraints of a single story:

  • Limiting it to a single story would put blinders on the testers and could negatively impact the effort. But, it is important that it happen.
  • Perhaps the team sets aside time during the day or iteration for banging on the software.

Once all acceptance tests are passing:

  • Ship it!

Variations:

  1. Have the entire team bang out the acceptance tests at the beginning of the iteration.  I’ve seen this done. It works. But, quite often, tests get written for stories that end up getting cut from the iteration due to time constraints. That is excess inventory sitting on the production floor until those stories make it into another iteration. In other words, doing this encourages the accumulation of waste.
  2. If you’re concerned about a single pair working a story from beginning to end, mix it up. Give pairs one day to work on something, or 4 hours, or two – whatever works for you. Then switch things up – preferably by keeping one person on the story and bringing in a new partner. Then, the next time you switch, the person who has been on the story longest rotates off.
  3. Even though exploratory testing should not be constrained by a single story, it really is important to do it before shipping the software. Microsoft calls this a bug bash. They give away prizes for the most bugs, and the hardest to find bugs. But, they don’t do it until very late in their process. It would be most agile to do it continuously.

How do you do agile testing?

Mocks and fakes and stubs, oh my!

Yesterday, I started writing an article about mocks and fakes and stubs, but ended up writing an article about Inversion of Control (IoC) / Dependency Injection. Today, I’ll take up the topic of how to isolate your code from that of other objects in your unit tests. But, first, what are mocks, fakes and stubs?

In his article Mocks Aren’t Stubs, Martin Fowler defines mocks, fakes, stubs and dummy objects as follows:

  • Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
  • Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).
  • Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed in for the test. Stubs may also record information about calls, such as an email gateway stub that remembers the messages it 'sent', or maybe only how many messages it 'sent'.
  • Mocks are objects pre-programmed with expectations which form a specification of the calls they are expected to receive.

All of these objects are useful when unit testing your code. But, each type of test object has its own strengths and weaknesses. Let’s look at how to use each type of test object to isolate the Add method from the Logger object that it uses.

Here, again, is the code for the IocCalculator class (modified slightly, to actually throw overflow exceptions):

public class IocCalculator
{
    public IocCalculator(ILogger logger)
    {
        Logger = logger;
    }

    private ILogger Logger { get; set; }

    public int Add(params int[] args)
    {
        Logger.Log("Add");

        int sum = 0;
        try
        {
            foreach (int arg in args)
                sum = checked(sum + arg);
        }
        catch (OverflowException)
        {
            Logger.Log("OverflowException: Add");
            throw; // rethrow without resetting the stack trace
        }

        return sum;
    }
}
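
(As an aside, the ILogger interface comes from the previous IoC/Dependency Injection post; as used here, it is presumably nothing more than:)

public interface ILogger
{
    void Log(string message);
}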

Dummy Objects

First, let’s try testing this code with a dummy object:

public class DummyLogger : ILogger
{
    public void Log(string message)
    {
    }
}
[Test]
[ExpectedException(typeof(OverflowException))]
public void AddThrowsExceptionOnOverflow()
{
    var calculator = new IocCalculator(new DummyLogger());
    calculator.Add(int.MaxValue, 1);
}

Take a look at the dummy object, above. It doesn’t do anything. It just swallows the calls to the logger from the Add method. Is this useful? Actually, yes. It allowed us to test that the exception was thrown correctly. Is it a good fit for this test? Yes. Can it do everything a mock, fake or stub can do? No.

Mock Objects

For example: What if we wanted to ensure that the Add method actually called the logger? In that case, a mock object might be more useful:

[Test]
public void AddLogsExceptionOnOverflow()
{
    var mocks = new Mockery();
    var mockLogger = mocks.NewMock<ILogger>();
    Expect.Exactly(2).On(mockLogger).Method("Log");

    var calculator = new IocCalculator(mockLogger);

    Assert.Throws(typeof(OverflowException), () => calculator.Add(int.MaxValue, 1));
    mocks.VerifyAllExpectationsHaveBeenMet();
}

This test replaces the dummy logger object with a mock logger object that the test itself creates (using the NMock framework). The first three lines set up the mock object by instantiating the mock object framework, instantiating the mock logger object from the ILogger interface, and telling NMock to expect two calls to the “Log” method on the mock logger object. Behind the scenes, mockLogger.Log counts the number of times it gets called. The final line of the test then compares the number of times we expected to call mockLogger.Log with the actual number of times it was called.

(Note: The Assert statement above uses a lambda expression to define an anonymous method, which is passed to Assert.Throws as a delegate. This syntax was introduced in C# 3.0. If you find it a bit obtuse, you’re not alone. Perhaps it’d make a good blog post. Anyone want to write it?)
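
(For the curious: the two statements below are equivalent. The first is the C# 3.0 lambda syntax used in the test above; the second is the older C# 2.0 anonymous delegate syntax. Both produce a delegate that NUnit invokes while watching for the expected exception.)

// C# 3.0 lambda syntax
Assert.Throws(typeof(OverflowException), () => calculator.Add(int.MaxValue, 1));

// C# 2.0 anonymous delegate syntax
Assert.Throws(typeof(OverflowException), delegate { calculator.Add(int.MaxValue, 1); });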

One final note on mocks: Many in the Test Driven Development community are enamored with mock objects. Some developers use mocks to the exclusion of all other types of test objects. Personally, I prefer to do state-based testing rather than behavior-based testing. In other words, I want to set up my object under test, call the method under test, and assert that the state of the object changed in some observable way. I don’t want my test to know about the underlying implementation details of the method under test. Mock objects, in my opinion, have a way of getting cozy with implementation details that makes me uncomfortable.

Fake Objects

But, what if you want to test that the Add method logs the correct messages? In that case, you may need to implement a fake logger and validate its contents, like this:

public class FakeLogger : ILogger
{
    public readonly StringBuilder Contents = new StringBuilder();

    public void Log(string message)
    {
        Contents.AppendLine(message);
    }
}
[Test]
public void AddLogsCorrectExceptionOnOverflow()
{
    var fakeLogger = new FakeLogger();
    var calculator = new IocCalculator(fakeLogger);

    Assert.Throws(typeof(OverflowException), () => calculator.Add(int.MaxValue, 1));
    Assert.AreEqual("Add\nOverflowException: Add\n", fakeLogger.Contents.ToString());
}

Note that the fake logger is actually a real logger. It actually logs the messages it receives in a string (using a StringBuilder object). Essentially, this implementation is an “in memory” logger, similar to what Fowler described as an “in memory database.” In fact, if this weren’t an example, I would probably have named the class InMemoryLogger or StringLogger, rather than FakeLogger. That’s more in line with what the code actually does.

So, is the fake logger useful? Absolutely. In fact, this is the approach I would actually take, since the dummy and mock loggers cannot check the text that was logged.

Stub Objects

But, what about stub objects? Well, it turns out that I chose a poor example for illustrating stub objects. As you’ll recall from above, stubs return hard-coded values to trigger specific code paths within the code under test. My logger example doesn’t need this kind of functionality. But, if another method were to call the Add method, it might be handy to use a stub to hard-code the response.

So, let’s test some code that needs to call the Calculator object. First, here’s the code:

public class Decider
{
    public bool Decide(params int[] args)
    {
        return ((new IocCalculator(new DummyLogger()).Add(args) % 2) == 1);
    }
}

Hmmm… Well, we can’t isolate this code from the IocCalculator code, yet. Let’s refactor:

public class Decider
{
    private readonly ICalculator _calculator;

    public Decider(ICalculator calculator)
    {
        _calculator = calculator;
    }

    public bool Decide(params int[] args)
    {
        return ((_calculator.Add(args) % 2) == 1);
    }
}
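
(Note that for this refactoring to compile, IocCalculator must implement an ICalculator interface. It isn’t shown in the original post, but given the Add signature above, it would presumably be just:)

public interface ICalculator
{
    int Add(params int[] args);
}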

Now, we can pass in a couple of different stub objects to test the Decider.Decide method:

public class EvenCalculator : ICalculator
{
    public int Add(params int[] args)
    {
        return 2;
    }
}

public class OddCalculator : ICalculator
{
    public int Add(params int[] args)
    {
        return 1;
    }
}

[TestFixture]
public class DeciderFixture
{
    [Test]
    public void DecideReturnsFalseWhenEven()
    {
        var decider = new Decider(new EvenCalculator());
        Assert.False(decider.Decide());
    }

    [Test]
    public void DecideReturnsTrueWhenOdd()
    {
        var decider = new Decider(new OddCalculator());
        Assert.True(decider.Decide());
    }
}

Conclusion

So, that (finally) concludes my look at mocks, fakes, stubs and dummies. I hope you found it understandable. And, I hope you find it helpful the next time you need to test a piece of code in isolation. In fact, this same Inversion of Control / Dependency Injection approach can be leveraged in many different ways. My focus in these articles was to demonstrate using the approach in unit testing. But, it can also be applied to functional/system testing.

Say you have a website that displays data returned from a web service. Say you want to hide the Address2 line if it is empty. You could look for test data from the web service that meets your needs. Or, you could program your tests to run against a stub service that returns just the values you need. Not only will you be guaranteed that the test data won’t change, but the test will run much faster, to boot. Sure, an end-to-end test that calls the actual web service will be necessary before going live. But, why pay that price every time you run your test suite?
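
As a rough sketch (the service interface and type names below are invented for illustration), the stub could be as simple as a class that always returns an address with an empty Address2 line:

public class Address
{
    public string Address1 { get; set; }
    public string Address2 { get; set; }
    public string City { get; set; }
}

public interface ICustomerService
{
    Address GetAddress(int customerId);
}

public class StubCustomerService : ICustomerService
{
    public Address GetAddress(int customerId)
    {
        // Canned answer: Address2 is always empty, so the test can
        // verify that the page hides the Address2 line.
        return new Address
        {
            Address1 = "123 Main St",
            Address2 = string.Empty,
            City = "Seattle"
        };
    }
}

Inject the stub into the page (or its presenter) the same way ILogger was injected into IocCalculator, and the functional test never has to leave the machine it runs on.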

Furthermore, IoC can also be used in production code to enable things like plug-ins or composite user interfaces. In this scenario, an installed plug-in is generally just a piece of code that implements a specific interface – one that allows a third party to develop software that integrates with your code at runtime via configuration.
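
Here is a bare-bones sketch of that plug-in idea (the names are invented for illustration): the host application defines the interface, and a configuration setting supplies the name of the concrete type to load at runtime.

// Contract defined by the host application; third parties implement it.
public interface IPlugin
{
    void Execute();
}

public static class PluginLoader
{
    // typeName comes from configuration, e.g.
    // "ThirdParty.ReportingPlugin, ThirdParty.Reporting"
    public static IPlugin Load(string typeName)
    {
        Type pluginType = Type.GetType(typeName, true);
        return (IPlugin)Activator.CreateInstance(pluginType);
    }
}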

As always, feel free to post corrections and questions in the comments.

Zero Defects™ – Part 2

As I suspected, my notion of Zero Defects was mildly controversial. I’ll try to clear up some of the questions in this post. But, rather than making another cohesive argument, as I tried to do the first time, I’ll respond to some of the responses I received on the first post. (NOTE: Comments did not come with the posts when I manually imported them to Posterous.)

  • Defects vs. stories is just a semantic layer; they both boil down to work that needs to be done.

Semantics are important. Yes, they’re both just work items. But, a “defect” carries with it a negative connotation, whereas a “user story” does not.

I believe strongly that using the words “defect” or “bug” sets an inappropriate expectation with business partners that all defects/bugs can and will be removed from the software prior to promotion to production. And, I’ve seen firsthand how using the term “defect” led to teams killing themselves trying to fix “bugs” that were never called out in test cases by the end of a Sprint. This is a recipe for failure and burnout.

I much prefer to set the expectation that anything that wasn’t explicitly called out by a user story and/or test case in the current Sprint is new work that needs to be prioritized with all other work. The best way to do this is to label that work the same way all other new work is labeled – as a new user story.

  • Defects tell me the QAs are doing their job.

To me, the statement above is akin to saying “Lines of Code tell me my programmers are doing their job.” Defect counts are meaningless. Point me at any modern website or program, and I can probably identify 100 defects in an hour. Most of them will be garbage. But, if I file a defect on them, someone (or more likely, some team) will have to triage those things, beat down the stupid ones and prioritize the rest – probably outside their normal work item prioritization process.

  • My experience is our business partners are not concerned with defects, unless they are not caught and are promoted into production.

My experience differs. I’ve worked with multiple POs, both at my current employer and elsewhere, who tracked defect counts and berated the team over them. In fact, my experience is that this is more common than not.

  • The only business partner of ours looking at Rally is our PO.

Yeah, that seems to be common here. But, there’s nothing preventing you from calling a meeting at the end of each Sprint where you demo the new functionality to a broader community of stakeholders. In fact, Scrum recommends this practice. It’s called a Sprint Review Meeting.

  • The requirements in agile are much less defined than in waterfall.

I disagree 100%. There should be zero theoretical difference in the quality of requirements between agile and waterfall. Agile just prefers to discover and document requirements iteratively, one at a time; whereas waterfall tries to define all requirements up front. In practice, this means that agile methods discover and document requirements closer to the time of implementation. They also allow for new requirements to be discovered by looking at how the software actually works. Waterfall doesn’t provide for that iterative feedback, meaning that the requirements get stale.

  • Defects provide us with a good method to track these issues to resolution. If the issue is new scope... then a user story is a much better alternative.

The problem I see in tracking defects separate from user stories is prioritization. How does the developer know what to work on next? Sure, Rally (and other tools) allow you to see all work in one bucket. And, that helps. But, in my experience, teams end up with silly rules like “defects found during the current sprint will be resolved in the current sprint.” That seems like an innocent enough rule, but what happens when you find more bugs than you can fix?

  • My interpretation of reading Alan's blog was that those items that are actual defects (do not meet defined requirements)... should be a new user story.

Ah, therein lies the misperception. In my original post, I meant to state that every requirement MUST be expressed as a test case on a user story, and that all test cases MUST be passing before I show the work to my business partner(s). Therefore, by definition, the code has met all of the “defined requirements” and can only be deficient if there remain undefined requirements, which should be defined as new test cases on a new user story.

  • Bottom line, i do not think defects are bad...

I disagree. I’ve experienced enough problems from the semantic difference between defect and user story that I no longer want to use the term defect ever again. (It’s not the IT folks that generally misinterpret the word “defect.”)

But, that's just my opinion. What's yours?

Zero Defects™

Last week, in a conversation with a coworker, he stated that defects do not make developers look bad. Basically, he asserted the following:

  1. Defects are not used to judge developer competence; and,
  2. Defects are simply artifacts for tracking work that needs to be done.

I agree with these points. However, experience teaches me that defects can and do make teams look bad. I’ve seen defects become a point of contention between a development team and their business partner(s). And, once a rift forms between the two, it is very difficult to heal.

So, how can we simultaneously encourage defect creation (to track work that needs to be done) without creating a rift between the development team and their business partners? My preferred approach is to create Zero Defects. Here’s how:

  1. Manage your work with a backlog of user stories.
  2. Create test cases for each user story before you begin development. Review the test cases with your business partner as well as QA to make sure you didn’t miss any. (This can happen iteratively. There’s no need to create test cases for stories you’re not going to work on right away.)
  3. As you develop a story, automate the test cases to prove that the software does what the tests say it should. When all the tests pass, the code is complete. (If possible, the developer who writes the code should not be the developer/tester who automates the tests. And, it’s very helpful to automate the tests before the code exists.)
  4. When you finish a story (and the associated tests), review both the software and the automated tests with your business partner to make sure everything is as they wanted. If your business partner wants to make changes at that point – either new requests, or things we might typically call defects – write them up as new user stories and prioritize them in the backlog.

That’s all there is to it. You, the developer/tester, get to take credit in the current Sprint/iteration for the work you completed. And, your business partner gets to manage a single backlog with all the outstanding work in it. Plus, no one has to fight over what’s a defect and what’s a user story. It’s a win/win+!

Now, I realize that this might be considered controversial. So, I’ll explain my thought process in a future post. Feel free to tell me how this’ll never work in the comments!

The Matrix Reloaded

I wrote this post quite a long time ago – right on the heels of my original test matrix posts. Why I never posted it is beyond me. I’m posting it now to get it out of my “drafts.”

---

A few posts back, I discussed The Marick Test Matrix and my minor modifications to the matrix. In those posts, I described how to classify different types of testing into the four quadrants of the matrix. It turns out that you can also use the same matrix to classify testing tools, like this:

[Image: the test matrix with testing tools plotted in each quadrant]

Let’s look at each quadrant in more detail, starting on the right-hand side:

Business/Defects

This quadrant represents those types of tests that identify defects in business (or non-technical) terms. In other words, you don’t need to be a programmer to figure out that there is a defect.

Typically, these tests are not automated. So, there are no automation tools to discuss here.

Technology/Defects

This quadrant represents those types of tests that identify defects in technical terms. In other words, you probably need to be a programmer to figure out that there is a defect. Therefore, one would expect the tools in this quadrant to be highly technical and to require specialized skills. In fact, there are people who specialize in this work. They are typically called testers; but, their knowledge of programming is often greater than that of the average developer.

The dominant tool in this space is Mercury LoadRunner. Microsoft also has tools in this space, including the Visual Studio Team Test Load Agent and the Microsoft Web Application Stress tool (MS WAS).

Business/Requirements

This quadrant represents those types of tests that define requirements in business terms. As such, you would expect the tools in this category to be (relatively) non-technical. Business Analysts and end users should be able to use these tools to create automated tests without knowledge of computer programming. In fact, these tests should be written by those team members with the most business expertise.

FIT, FitNesse, STiQ, WebTest and Selenium are all examples of tools that allow tests to be expressed in business terms. All of these tools are well suited to use by Business Analysts.

Technology/Requirements

The testing that takes place in this quadrant defines requirements in technical terms. Therefore, you would expect to see lots of low-level, code-based tools here. These tools are generally used by computer programmers (e.g. developers and testers).

JUnit and NUnit are the big dogs in this space. Other tools include MSTest, WatiN, MBUnit, xUnit, RSpec (for Ruby), and NUnitASP.

On Clarity and Abstraction in Functional Tests

Consider the following tests:

[Test]
public void LoginFailsForUnknownUser1()
{
    string username = "unknown";
    string password = "password";

    bool loginSucceeded = User.Login(username, password);

    Assert.That(loginSucceeded == false);
}
[Test]
public void LoginFailsWithUnknownUser2()
{
    using (var browser = new IE(url))
    {
        browser.TextField(Find.ById(new Regex("UserName"))).Value = "unknown";
        browser.TextField(Find.ById(new Regex("Password"))).Value = "password";

        browser.Button(Find.ById(new Regex("LoginButton"))).Click();
        bool loginSucceeded = browser.Url.Split('?')[0].EndsWith("index.aspx");

        Assert.That(loginSucceeded == false);
    }
}

Note the similarities:

  • Both methods test the same underlying functional code;
  • Both tests are written in NUnit; and,
  • Both tests use the Arrange / Act / Assert structure.

Note the differences:

  • The first is a unit test for a method on a class.
  • The second is a functional test that tests an interaction with a web page.
  • The first is clear. The second is, um, not.

Abstracting away the browser interaction

So, what’s the problem? Aren’t all browser tests going to have to use code to automate the browser?

Well, yes. But, why must that code be so in our face? How might we express the true intention of the test without clouding it in all the arcane incantations required to automate the browser?

WatiN Page Classes

The folks behind WatiN answered that question with something called a Page class. Basically, you hide all the browser.TextField(…) goo inside a class that represents a single page on the web site. Rewriting the second test using the Page class concept results in this code:

[Test]
public void LoginFailsWithUnknownUser3()
{
    using (var browser = new IE(url))
    {
        browser.Page<LoginPage>().UserName.Value = "unknown";
        browser.Page<LoginPage>().Password.Value = "password";

        browser.Page<LoginPage>().LoginButton.Click();
        bool loginSucceeded = browser.Page<IndexPage>().IsCurrentPage;

        Assert.That(loginSucceeded == false);
    }
}
public class LoginPage : Page
{
    public TextField UserName
    {
        get { return Document.TextField(Find.ById(new Regex("UserName"))); }
    }
    public TextField Password
    {
        get { return Document.TextField(Find.ById(new Regex("Password"))); }
    }
    public Button LoginButton
    {
        get { return Document.Button(Find.ById(new Regex("LoginButton"))); }
    }
}

Better? Yes. Now, most of the WatiN magic is tucked away in the LoginPage class. And, you can begin to make out the intention of the test. It’s there at the right-hand side of the statements.

But, to me, the Page Class approach falls short. This test still reads as though its primary goal is to automate the browser, not to exercise the underlying system. Plus, the reader of this test needs to understand generics in order to fully grasp what the test is doing.

Static Page Classes

An alternative approach I’ve used in the past is to create my own static classes to represent the pages in my web site. It looks like this:

[Test]
public void LoginFailsWithUnknownUser4()
{
    using (var browser = new IE(url))
    {
        LoginPage.UserName(browser).Value = "unknown";
        LoginPage.Password(browser).Value = "password";

        LoginPage.LoginButton(browser).Click();
        bool loginSucceeded = IndexPage.IsCurrentPage(browser);

        Assert.That(loginSucceeded == false);
    }
}
public static class LoginPage
{
    public static TextField UserName(Browser browser)
    {
        return browser.TextField(Find.ById(new Regex("UserName")));
    }
    public static TextField Password(Browser browser)
    {
        return browser.TextField(Find.ById(new Regex("Password")));
    }
    public static Button LoginButton(Browser browser)
    {
        return browser.Button(Find.ById(new Regex("LoginButton")));
    }
}
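
(The IndexPage helper used in the test isn’t shown here; presumably it just wraps the same URL check used in the raw WatiN version of the test, something like:)

public static class IndexPage
{
    public static bool IsCurrentPage(Browser browser)
    {
        return browser.Url.Split('?')[0].EndsWith("index.aspx");
    }
}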

This is the closest I have come to revealing the intention behind the functional test without clouding it in all the arcane incantations necessary to animate a web browser. Yes, there are still references to the browser. But, at least now the intention behind the test can be inferred by reading each line from left to right. Furthermore, most of the references to the browser are now parenthetical, which our eyes are accustomed to skipping.

What do you think?

I’d like to know what you think. Are your functional tests as clear as they could be? If so, how’d you do it? If not, do you think this approach might be helpful? Drop me a line!

My Marick Test Matrix

I introduced the Marick Test Matrix in the previous post. As much as I like it and have come to rely on it, one thing has always bugged me about the matrix: the terms Brian used to describe his horizontal axis seem, IMHO, obtuse. Instead, I prefer these simpler terms:

  • Support Programming = Define Requirements
  • Critique Product = Identify Defects

Given those changes, here’s my updated matrix:

I prefer these axes because now I can refer to each quadrant by a simple name:

  • Business/Requirements
  • Business/Defects
  • Technology/Requirements
  • Technology/Defects

The Marick Test Matrix

One of the leading voices in agile testing is a guy named Brian Marick. Brian is an independent consultant and an author. I find his blog to be an invaluable resource.

The Marick Test Matrix

Back in 2003, Brian published an influential series of articles on agile testing. He was attempting to point the way forward for agile testers. But, in the process, he came up with an elegant method of cataloguing testing methods that has become known as the “Marick Test Matrix.”

I’d like to introduce the matrix here in the hopes of fostering a discussion about what we test and how we test it.

Brian’s work categorized tests by asking two questions:

  1. Is the test business facing or technology facing?
  2. Does the test support programming or critique a product?

When you combine the two questions (or axes), you get a grid (or matrix), like this one:

But, what the heck do these things mean? That’s what the remainder of this post is about…

Business Facing Tests

A business facing test is one that is expressed in terms that are well understood by a business expert. For example:

  • If you withdraw more money than you have in your account, the system should automatically extend you a loan for the difference. (Notice that the italicized words are business terms.)

Business facing tests are best authored by people who understand the business (e.g. product owner, business analyst, etc.).

Technology Facing Tests

A technology facing test is one that is expressed in terms that are well understood by a technology expert. For example:

  • Different browsers implement Javascript differently, so we test whether our product works with the most important ones. (Notice that the italicized words are technology terms.)

Technology facing tests are best authored by people who understand the technology (e.g. developer, tester, etc.).

Tests that Support Programming

A test that supports programming is one that defines what the software should do. For example:

  • Clicking on the Account Details link should take the user to the Account Details screen.
  • Calling the Add method with 2 and 2 should return 4.

These tests may be written before the software exists. These tests are often automated and executed after a change is made to the software to ensure that the software still works as it should (i.e. regression). Once one of these tests passes, it should never be allowed to fail again.

Tests that Critique a Product

A test that critiques a product is one that tries to identify problems in completed software. In other words, this is the class of tests where the tester is actively trying to break the software in order to find bugs. For example:

  • When I logged on as Joe, I saw Tom’s data.
  • When I clicked the blue button after clicking the red button 400 times, the system threw an error.
  • When I configured the load testing tool to send 1,000 simultaneous users to the site, average response times increased to over 10 seconds.

In general, these tests are not automated – until a problem is identified, at which point a test that clearly reproduces the problem can be added to the tests that support programming.

So what?

Let’s take a look at where some standard types of testing might go on the matrix:

Unit Tests

Unit tests are used by developers to ensure that the code they are writing does what they expect it to. In essence, these tests form a specification for a single unit of code. By that definition, unit tests are tests that “support programming.” These tests are (or should be) very close to the code under test, which makes them “technology facing.” So, unit tests belong in the lower left quadrant of the matrix. The benefit of automating these tests is very high.

Functional Tests

Functional tests are used by development teams to ensure that the software they are writing does what they expect it to do. In essence, these tests form a specification for an entire system. That means that these are tests that “support programming.” But, functional tests are (or should be) written in a way that business users understand, making them “business facing.” So, functional tests belong in the upper left quadrant of the matrix. The benefit of automating these tests is high.

Exploratory Tests

Exploratory testing is the practice of trying to identify problems in an application. (Microsoft refers to this practice as a “Bug Bash” where many people are invited to use the software and prizes are given out to the person who identifies the most/worst bugs.) By definition, this makes these tests that “critique a product.” Exploratory tests are considered “business facing” due to the fact that the testers are using the software the same way a real end-user might. So, exploratory tests belong in the upper right quadrant of the matrix. There is no benefit to automating exploratory tests – until a defect is identified. At that point, a new functional test can be added to ensure that the defect is resolved.

Performance Tests

Performance tests are used to determine the speed of an application within a specific set of parameters. Specialized tools are used to perform this testing. As such, these tests require a good deal of technical knowledge and are therefore “technology facing.” Performance tests require working software, and are therefore tests that “critique a product.” So, performance tests belong in the lower right quadrant of the matrix.

Putting it all together

My blog editor now tells me that I’m approaching 1000 words. So, to prove the axiom, here’s the best summary I can think of:

That’s enough for one day. In a subsequent post, I’ll dive into why it’s important to cover all four quadrants.

For more information, check out Brian Marick’s original Agile Testing posts, or Google Marick Test Matrix.