Agile 2006: xUnit Test Patterns and Smells

xUnit Test Patterns and Smells
Gerard Meszaros and Greg Cook
Tuesday afternoon

Book: “xUnit Test Patterns” is currently in second-draft review; hopefully coming out this fall.

xunitpatterns.com

This session is going to be hands-on with a fair number of exercises.

Terminology

  • Test vs SUT vs DOC
    • Test, which verifies the
    • System Under Test, which may use
    • Depended-on Component(s)
  • Unit vs Component vs Customer Testing
    • Unit testing: single class
    • Component: aggregate of classes
    • Customer: entire application
  • Black Box vs White Box Testing
    • Black box: know what it should do. xUnit tests are usually black-box tests of very small boxes.
    • White box: know how it is built inside

What does it take to be successful?

  • Programming experience
  • + xUnit experience
  • + Testing experience (What test should I write?)
  • != Robust Automated Tests

It’s crucial to make tests simple and easy to write.

Expect to have at least as much test code as production code!

  • Can we afford to have that much test code?
  • Can we afford not to?
  • Challenge: How to prevent doubling the cost of software maintenance?

Testware must be easier to maintain than the production code. The effort to maintain should be less than the effort saved by having tests.

Goals of automated tests:

  • Before code is written
    • Tests as Specification
  • After code is written
    • Documentation
    • Safety net
    • Defect localization (minimize debugging)
  • Minimizing cost of running tests
    • Fully automated
    • Repeatable
    • Robust: should work today, tomorrow, next month, should work in leap years, etc. Not brittle. Shouldn’t have to revisit tests unless the code they test has changed.

What’s a “Test Smell”?

  • Set of symptoms of an underlying problem in test code
  • Smells must pass the “sniff test”
    • Should be obvious that it’s there (not necessarily obvious why — trust your gut)
    • Should “grab you by the nose”
    • Symptom that may lead you to the root cause
  • Common kinds of smells:
    • Code smells — visible problems in code
    • Behavior smells — test behaves badly
    • Project smells — project-level, visible to Project Manager
  • Code Smells can cause Behavior Smells can cause Project Smells

Patterns: Recurring solutions to recurring problems

  • Criterion: Must have been invented by three independent sources
  • Patterns exist whether or not they’ve been written up in “pattern form”

Examples of test smells:

  • Hard to understand
  • Coding errors that result in missed bugs or erratic tests
  • Difficult or impossible to write
    • No test API
    • Cannot control initial state
    • Cannot observe final state
  • Sniff test: Problem is visible (in your face)
  • Conditional test logic (if statements in tests)
  • Hard to code
  • Obscure
  • Duplication
  • Obtuse Assertion, e.g. AssertTrue(False) instead of Fail().
  • Hard-wired test data. Can lead to fragile tests.
  • Conditional test logic. Ridiculous example: if (condition) { … } else Assert.Fail(); Why not just do Assert(condition)? (See the sketch after this list.)
  • Most unit tests are single-digit lines of code.
  • Smell: Obscure Test
    • Verbose
    • Eager (several tests in one method)
    • General fixture (setting up stuff you don’t need for every test)
    • Obtuse assertion
    • Hard-coded test data
    • Indirect testing when there’s a simpler way
    • Mystery Guest
  • Conditional Test Logic
  • Test Code Duplication
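
A minimal sketch of the conditional-logic and obtuse-assertion smells called out above, and the straight-line alternative. This is JUnit-style Java; the Validator class is made up just so the example is self-contained:

    import static org.junit.Assert.assertTrue;
    import org.junit.Test;

    public class ConditionalLogicSmellTest {

        // Smell: conditional test logic plus an obtuse assertion.
        @Test
        public void isValid_withConditional_smelly() {
            boolean valid = new Validator().isValid("42");
            if (valid) {
                // fall through: nothing to check
            } else {
                assertTrue(false);   // obtuse; fail("...") would at least say why
            }
        }

        // Same check as one straight-line assertion.
        @Test
        public void isValid_withoutConditional() {
            assertTrue("42 should be valid", new Validator().isValid("42"));
        }

        // Hypothetical SUT, only here so the sketch is self-contained.
        static class Validator {
            boolean isValid(String input) {
                return input != null && input.matches("\\d+");
            }
        }
    }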

Patterns used so far:

  • Expected Objects: compare entire objects, rather than each individual property
  • Guard Assertions: if test code won’t work unless certain conditions hold, assert those conditions first instead of branching around them
  • Custom Asserts (combined with an Expected Object in the sketch after this list)
    • Improve readability
    • Simplify troubleshooting
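
A sketch of an Expected Object compared via a Custom Assertion (JUnit-style Java; the Customer class and its fields are invented for the example):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class CustomAssertionTest {

        @Test
        public void changeOfAddressKeepsNameAndPhone() {
            Customer actual = new Customer("Jane", "555-1212", "12 Elm St");
            actual.changeAddress("99 Oak Ave");

            // Expected Object: build the whole object we expect to see...
            Customer expected = new Customer("Jane", "555-1212", "99 Oak Ave");

            // ...and compare with one intention-revealing Custom Assertion.
            assertCustomerEquals(expected, actual);
        }

        // Custom Assertion: field-by-field comparison, written once and reused;
        // the messages tell you exactly which field differed.
        static void assertCustomerEquals(Customer expected, Customer actual) {
            assertEquals("name", expected.name, actual.name);
            assertEquals("phone", expected.phone, actual.phone);
            assertEquals("address", expected.address, actual.address);
        }

        // Hypothetical domain class, only here so the sketch is self-contained.
        static class Customer {
            final String name;
            final String phone;
            String address;
            Customer(String name, String phone, String address) {
                this.name = name;
                this.phone = phone;
                this.address = address;
            }
            void changeAddress(String newAddress) {
                this.address = newAddress;
            }
        }
    }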

Get conditional logic out of the test method, where it can’t itself be tested, and into something reusable (such as a custom assertion).

Finally blocks and delete/free/etc. inside a test method: Is our test really about testing the destructors? (C++Unit, at one point, checked for leaks, so this might matter.) Housekeeping code doesn’t add value to the test: we don’t know whether it works, it creates maintenance, and it couples the test to the details of how its objects are cleaned up. Housekeeping code is a test smell.

Naive solution: Move housekeeping code to teardown. This does mean you have to promote the test’s local variables to fields. Don’t do this just out of habit.

Automated Fixture Teardown: AddTestObject() and DeleteAllTestObjects(), which frees everything (or deletes test data from the database, or whatever) even if there are exceptions.
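
A rough sketch of what that might look like in JUnit-style Java (the AddTestObject/DeleteAllTestObjects names come from the session; the persistence types are invented):

    import java.util.ArrayList;
    import java.util.List;
    import org.junit.After;
    import org.junit.Test;

    public class AutomatedTeardownTest {

        // Every object a test creates gets registered here.
        private final List<PersistentObject> testObjects = new ArrayList<PersistentObject>();

        // Creation goes through a helper that registers the object for cleanup.
        private <T extends PersistentObject> T addTestObject(T object) {
            testObjects.add(object);
            return object;
        }

        // Runs after every test, whether it passed, failed, or threw an exception.
        @After
        public void deleteAllTestObjects() {
            for (PersistentObject each : testObjects) {
                try {
                    each.delete();   // remove the row, temp file, etc.
                } catch (Exception ignored) {
                    // keep going: one failed delete must not block the rest
                }
            }
            testObjects.clear();
        }

        @Test
        public void customerCanBeRenamed() {
            CustomerRecord customer = addTestObject(new CustomerRecord("Jane"));
            customer.rename("Janet");
            // assertions would go here; cleanup is no longer the test's problem
        }

        // Hypothetical persistence types, only here so the sketch is self-contained.
        interface PersistentObject { void delete(); }

        static class CustomerRecord implements PersistentObject {
            private String name;
            CustomerRecord(String name) { this.name = name; }
            void rename(String newName) { this.name = newName; }
            public void delete() { /* would delete the row here */ }
        }
    }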

Transaction Rollback Teardown: Another way to do cleanup. Start a database transaction in SetUp, roll it back in TearDown. But make sure the system under test doesn’t commit. (And only do this if you’re already using a database. Don’t use database in your test just to use this pattern!)
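
A minimal JDBC sketch of the idea (connection URL and table are placeholders; it only works if the code under test uses the same connection and never commits):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;

    public class TransactionRollbackTeardownTest {

        private Connection connection;

        @Before
        public void beginTransaction() throws SQLException {
            connection = DriverManager.getConnection("jdbc:somedb://localhost/testdb");
            connection.setAutoCommit(false);   // the test's changes stay uncommitted
        }

        @After
        public void rollbackTransaction() throws SQLException {
            connection.rollback();   // undo everything the test did, pass or fail
            connection.close();
        }

        @Test
        public void insertedRowIsVisibleInsideTheTransaction() throws SQLException {
            connection.createStatement()
                      .executeUpdate("INSERT INTO customers (name) VALUES ('Jane')");
            // assertions against the same connection go here; the row disappears
            // again when rollbackTransaction() runs
        }
    }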

  • Complex Undo Logic
    • Complex fixture teardown code
    • More likely to leave test environment corrupted, leading to Erratic Tests
  • Patterns used: Inline Teardown, Implicit Teardown (hand-coded), Automated Teardown, Transaction Rollback Teardown

Hard-coded test data / Obscure test. Creating objects that are only there because you have to pass them to other objects. Can also cause unrepeatable tests, especially if you’re inserting a customer into the database to run your test.

If the values don’t matter to the test, you can just generate them. Call GetUniqueString(), etc. This tells the person reading the test that it doesn’t matter.

But if it’s irrelevant, why even have it there? Don’t create an Address and pass values to it, just make a CreateAnonymousAddress(), and then a CreateAnonymousCustomer(), etc.

If it’s not important to the test, it’s important that it not be in the test.
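
One way this might look in JUnit-style Java (method and class names are adapted from the ones mentioned above; the domain classes are invented):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class AnonymousCreationMethodTest {

        private static int counter = 0;

        // Generated Value: tells the reader the exact value doesn't matter.
        static String getUniqueString(String prefix) {
            return prefix + "-" + (++counter);
        }

        // Anonymous Creation Method: hides every detail the test doesn't care about.
        static Customer createAnonymousCustomer() {
            Address anyAddress = new Address(getUniqueString("street"), getUniqueString("city"));
            return new Customer(getUniqueString("name"), anyAddress);
        }

        @Test
        public void newCustomerStartsWithNoOrders() {
            // No irrelevant names or addresses cluttering the test.
            Customer customer = createAnonymousCustomer();
            assertEquals(0, customer.orderCount());
        }

        // Hypothetical domain classes, only here so the sketch is self-contained.
        static class Address {
            Address(String street, String city) { }
        }

        static class Customer {
            Customer(String name, Address address) { }
            int orderCount() { return 0; }
        }
    }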

Smells:

  • Obscure test because of irrelevant information.
  • Patterns: Generated values, creation method (anonymous and parameterized), testcase class per feature, custom assertions

Suggestion: Call a method “AssertFoo” if it just asserts, “VerifyFoo” if it does some setup and then asserts.

Hard to Test Code

  • Too closely coupled to other software
  • No interface provided to set state, observe state
  • Only asynchronous interfaces provided. E.g., GUI.
  • Root cause is lack of design for testability
  • Temporary workaround: Test Hook, e.g., if (IsTesting) { … } else { … } (see the sketch after this list)
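
A sketch of what a Test Hook looks like (all names invented). It is listed only as a temporary workaround; it is exactly the “Test Logic in Production” smell described below:

    public class ExchangeRateProvider {

        // Crude switch set by the test suite; real code might read a config flag.
        public static boolean isTesting = false;

        public double getRate(String from, String to) {
            if (isTesting) {
                return 1.25;                        // canned value for tests
            }
            return callRemoteRateService(from, to); // the slow, real dependency
        }

        private double callRemoteRateService(String from, String to) {
            // stand-in for a real web-service call
            throw new UnsupportedOperationException("not reachable in this sketch");
        }
    }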

Test Double Patterns (different things we replace real code with when we’re testing)

  • Kinds of Test Doubles
    • Test Stubs return test-specific values
    • Mock Objects also verify method calls and arguments
    • Fake Objects provide (apparently) the same services in a “lighter” way, e.g., an in-memory database for speed
  • Need to be “installed” (see the stub-plus-injection sketch after this list)
    • Dependency Injection
    • Dependency Lookup
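
A sketch combining a Test Stub with constructor-based Dependency Injection (JUnit-style Java; all names are invented for the example):

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class TestStubWithDependencyInjectionTest {

        // The depended-on component (DOC), behind an interface.
        interface RateProvider {
            double getRate(String from, String to);
        }

        // SUT: gets its DOC injected through the constructor.
        static class PriceCalculator {
            private final RateProvider rates;
            PriceCalculator(RateProvider rates) { this.rates = rates; }
            double priceIn(String currency, double amountUsd) {
                return amountUsd * rates.getRate("USD", currency);
            }
        }

        // Test Stub: returns a hard-coded, test-specific value instead of
        // calling a slow or unpredictable real service.
        static class StubRateProvider implements RateProvider {
            public double getRate(String from, String to) { return 2.0; }
        }

        @Test
        public void priceUsesTheInjectedRate() {
            PriceCalculator calculator = new PriceCalculator(new StubRateProvider());
            assertEquals(20.0, calculator.priceIn("CAD", 10.0), 0.001);
        }
    }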

Testability Patterns

  • Humble Object
    • Objects closely coupled to the environment should not do very much
    • Should delegate real work to a context-independent testable object
    • Example: Humble Dialog. Don’t test the dialog itself, instead have it immediately create a helper that has the logic.
  • Dependency Injection: client passes depended-on objects
  • Dependency Lookup: code asks another object for its dependencies. Service Locator, Object Factory, Component Registry.
  • Test-Specific Subclass: descend from the real object and override the stuff you want to fake (see the sketch after this list)
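
A sketch of a Test-Specific Subclass that overrides just the hard-to-control dependency, here the system clock (JUnit-style Java; names invented):

    import static org.junit.Assert.assertEquals;
    import java.util.Calendar;
    import org.junit.Test;

    public class TestSpecificSubclassTest {

        // Production class whose output depends on the current time.
        static class TimeDisplay {
            public String currentHourMessage() {
                return "Hour is " + getTime().get(Calendar.HOUR_OF_DAY);
            }
            // Seam: a small, overridable method around the awkward dependency.
            protected Calendar getTime() {
                return Calendar.getInstance();
            }
        }

        // Test-Specific Subclass: override only the seam, keep everything else real.
        static class TimeDisplayWithFixedClock extends TimeDisplay {
            @Override
            protected Calendar getTime() {
                Calendar fixed = Calendar.getInstance();
                fixed.set(Calendar.HOUR_OF_DAY, 23);
                return fixed;
            }
        }

        @Test
        public void messageUsesTheOverriddenClock() {
            assertEquals("Hour is 23", new TimeDisplayWithFixedClock().currentHourMessage());
        }
    }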

Test Logic in Production

  • if (Testing) { … }
  • Test code gets compiled in because production code depends on it

Behavior Smells

  • We don’t have to look for them. They come knocking. Usually at the most inopportune time.
  • Tests fail when they should pass (or pass when they should fail)
  • Problem is with how tests are coded, not a problem with the code under test
  • Occasionally compile-time, usually test-time

Slow Tests

  • Tests must run “fast enough”.
  • Impact
    • Impacts productivity.
    • Lost quality due to running tests less frequently.
  • Causes
    • Using slow components (e.g., database)
    • Asynchronous
    • Building a general fixture (that sets up too much stuff)

Brainstormed ways to avoid slow tests

  • Share the fixture between tests
  • Avoid DB / external service access
  • Fake Objects
  • Eliminate redundant tests (fewer tests)
  • Run slow tests less often
  • Make individual tests run faster
  • Faster hardware
  • Multi-threaded execution
  • Smaller datasets (fixtures or inputs)
  • Faster production code
  • Run fewer tests (subset)
  • Test logic more directly
  • Faster test execution (high-performance language)
  • Run less production code in each test (more focused)

Shared Test Fixture

  • Not recommended
  • The theory: Improves test run times by reducing setup overhead
  • Forces a standard test environment that applies to all tests
    • which probably forces the fixture to be bigger
  • Bad smell alert: Erratic tests

Smell: Erratic Tests

  • Interacting tests (depend on side effects of earlier tests)
  • Unrepeatable Tests (a test changes state that another test depends on, so the suite behaves differently the next time you run it)
  • Test Run War (if e.g. everyone uses the same database: random test failures when multiple people run tests at once)
  • Non-Deterministic Test (passes, then run it ten minutes later and it fails)
  • Resource Optimism (all pass on my PC, but fail on build machine)

Persistent Fresh Fixture

  • Rebuild fixture for each test and tear it down
    • At end of this test
    • At start of next test that uses it (just in time)
  • Build a different fixture for each test, e.g., a different key value (see the sketch after this list)
  • Give each developer their own database sandbox
  • Don’t change shared fixture
    • Immutable Shared Fixture
    • What constitutes a “change” to a fixture? Which objects need to be immutable? If you add a flight, is that a change to the airport?
  • Build a new Shared Fixture each time you run the test suite
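
A minimal sketch of a fresh fixture per test, each with its own key value, so tests cannot interact through leftover data (JUnit-style Java; the repository class is invented):

    import static org.junit.Assert.assertFalse;
    import static org.junit.Assert.assertTrue;
    import java.util.HashMap;
    import java.util.Map;
    import org.junit.Before;
    import org.junit.Test;

    public class FreshFixturePerTestTest {

        private static long nextKey = System.currentTimeMillis();

        private FlightRepository repository;
        private long flightKey;

        // Rebuilt before every test: its own repository, its own key.
        @Before
        public void setUpFreshFixture() {
            repository = new FlightRepository();
            flightKey = ++nextKey;
            repository.add(flightKey, "YYC", "YVR");
        }

        @Test
        public void addedFlightCanBeFound() {
            assertTrue(repository.contains(flightKey));
        }

        @Test
        public void cancelledFlightIsGone() {
            repository.cancel(flightKey);
            assertFalse(repository.contains(flightKey));
        }

        // Hypothetical in-memory repository, only here so the sketch is self-contained.
        static class FlightRepository {
            private final Map<Long, String> flights = new HashMap<Long, String>();
            void add(long key, String from, String to) { flights.put(key, from + "-" + to); }
            void cancel(long key) { flights.remove(key); }
            boolean contains(long key) { return flights.containsKey(key); }
        }
    }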

Fragile Tests

  • Stops working after a while
  • Interface Sensitivity
    • Every time you change the code, tests won’t compile or start failing
    • Need to modify lots of tests to get things green again
    • Greatly increases cost of maintaining the system
  • Behavior sensitivity
    • The code’s behavior changes in a way that shouldn’t affect the test’s outcome, but the test fails anyway
    • Caused by depending on too much of the code’s behavior
  • Centralize object creation, so when you change the constructor signature, you only have to fix it once
  • Data Sensitivity (aka Fragile Fixture)
    • If your test depends on what data is in the database, then someone else can change it and your tests will fail
  • Context Sensitivity
    • Something changes outside the code (date/time, contents of another app)
    • If current date is this, then expected result is this; else …
  • Use stable interfaces
  • Bypass GUI
  • Encapsulate API from yourself (creation methods, etc.)
  • Minimal fresh fixture
    • Custom design for each test
  • Test stubs

Assertion Roulette

  • Failure shows up in output, and you can’t tell which assertion failed
  • When you can’t reproduce in your IDE, you have no idea what’s going wrong
  • Solution: Add assertion messages (see the sketch below)
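
A short sketch of assertion messages (JUnit-style Java; the Invoice class is invented). When a run fails only on the build machine, the message tells you which assertion fired:

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class AssertionMessageTest {

        @Test
        public void invoiceTotalsAndCounts() {
            Invoice invoice = new Invoice();
            invoice.addLineItem("widget", 2, 5.00);

            // Without messages, "expected:<1> but was:<0>" could be either assertion.
            assertEquals("line item count", 1, invoice.lineItemCount());
            assertEquals("grand total", 10.00, invoice.grandTotal(), 0.001);
        }

        // Hypothetical domain class, only here so the sketch is self-contained.
        static class Invoice {
            private int count = 0;
            private double total = 0.0;
            void addLineItem(String name, int quantity, double unitPrice) {
                count++;
                total += quantity * unitPrice;
            }
            int lineItemCount() { return count; }
            double grandTotal() { return total; }
        }
    }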
