Agile 2006: Working Effectively with Legacy Code

Working Effectively with Legacy Code
Michael Feathers

  • Traditional definition of “Legacy Code”: if you need to find your compiler on Ebay
  • Feathers’ definition: Legacy code = code without tests
    • How you test around your code influences what you can do with that code
    • If you have no tests, you pay a price for that
    • Hard to modify, hard to refactor

Unit testing

  • Allows rapid change of existing code
  • Meaningful regression
  • Always been around, but we’ve heard a lot more about it lately because of Agile
  • Enables refactoring (safety net)

Testability

  • “I thought I was a good designer until I tried to take arbitrary classes I’d written and place them under test”
  • Easy to talk about cohesion and coupling, but unit testing forces you to deal with them
    • Cohesion = are things that should be together together?
    • Coupling = are things that should be able to vary independently separate?

How testable is your code?

  • Most real-world (and non-unit-tested) code has very tangled dependencies: “glued together”
  • We usually don’t know this until we try to test pieces in isolation

Kinds of Glue

  • Singletons aka Global Variables
    • If there can only be one of an object, you’d better hope that one is good for your tests too
    • Problem: it’s something your unit can touch that’s outside that unit, and that can affect that unit. Hard to test a single unit of code.
  • Internal instantiation
    • When a class creates an instance of a hard-coded class, you’d better hope that class runs well in tests
    • What if that class modifies global state?
    • Opens a socket? (The Internet isn’t part of a unit.)
    • Watch out for setup and speed
  • Concrete Dependency
    • Depending on a concrete class instead of an interface (see Dependency Inversion Principle, Dependency Injection)
    • When a class uses a concrete class, you’d better hope that class lets you know what’s happening to it.
  • How to fix these kinds of things?
    • Design for tests
    • TDD pushes this

A Set of Unit Testing Rules

  • A test is not a unit test if:
    • It talks to the database
    • It communicates across the network
    • It touches the file system
    • It can’t run at the same time as any of your other unit tests
    • You have to do special things to your environment (like editing configuration files) to run it
    • These things can all happen in integration tests, etc., but keep them out of unit tests as a general rule

Breaking Dependencies

  • Seams: Places where you can plug in different functionality
    • Object seam / virtual function dispatch: use virtual methods
    • Macro expansion seam: can use #define to redefine a function
    • Template instantiation seam
    • Link seam: can link to a different library
    • Some can be varied on a test-by-test basis, some can’t be varied without recompiling

Uninstantiable class #1

  • Constructor depends on a constructor that depends on a constructor that depends on a constructor that depends on a constructor that depends on a…
  • Foo constructor takes a Config, whose constructor takes a Reader, whose constructor takes a FileProvider…

Uninstantiable class #2

  • Constructors that do something nasty that shouldn’t happen in a test
  • Example: Foo constructor creates a database connection
    • Code smell: This violates Single-Responsibility Principle

Uncallable method

  • Parameters you can’t instantiate or things you shouldn’t do in a test

Breaking dependencies

  • Manifest dependencies
    • Dependency which is present at the interface of a class. You can see some parameter of some method being problematic.
    • Extract interface / extract implementer (make the class actually implement the interface)
    • Adapt parameter (add a wrapper)
    • Choice between these two may depend on whether you have the source, and can change the class to implement a new interface
  • Hidden dependencies
    • Things hidden inside the class that you don’t see at the interface
    • Introduce static setter
    • Parameterize constructor / method
    • Extract and override getter
    • Introduce instance delegator

Working with Care

  • “Oh crap, I could be breaking that!”
  • Mitigating practices
    • Lean on the compiler (aka Ask the compiler)
      • Introduce deliberate errors, so you can find places you need to make changes
      • Example: Change one B reference to IB, then compile, and fix everything that breaks: change types, add methods to the interface, etc.
      • Problematic when you have a descendant class that shadows the method from the base class (and even worse in Java, where overrides are problems too)
    • Preserve signatures
      • Copy/paste parameter lists verbatim. Minimal editing.
      • Prevents errors
      • Makes it easy to change calls to X into calls to a delegating method
      • Don’t do too much at once
    • Pair programming

Parameterize Constructor

  • Inject dependencies through a constructor
  • Consider using a default constructor to avoid affecting non-test clients

Encapsulation

  • We’re breaking encapsulation. This object that used to read its own configuration file now takes the config as a parameter. Callers now need to care about how this class is implemented.
  • We’re going to break encapsulation even more as we break dependencies
  • Why is encapsulation important?
    • Encapsulation helps you understand your code
    • So do tests, so it’s not that bad a tradeoff

Parameterize Method

  • Same as Parameterize Constructor: if a method has a hidden dependency on a class because it instantiates it, make a new method with an argument
  • Call it from the other method
  • Steps:
    • Create a new method with the internally created object as an argument
    • Copy all the contents into the new one, deleting the creation code
    • Delete the contents of the old one, replacing with a call to the new one

Extract and Override Getter

  • If a class creates an object in its constructor and doesn’t use it, you can extract a getter and override it in a testing subclass
  • Make the getter lazy-initialize the field and return it
  • Change everyone to use the getter instead of the field
  • Override the getter in the testing subclass, and do something different
  • Parameterize Constructor is usually preferable

Extract and Override Call

  • When we have a bad dependency in a method and it is represented as a call
  • Can extract the call to a method and override it in a testing subclass
  • E.g., replace Banking.RecordTransaction(…) with delegating method RecordTransaction(…), with same name and same arguments (Preserve Signatures)

Expose Static Method

  • If the class is hard to instantiate, you may be able to extract some of the logic into a static method
  • Is making the method static a bad thing?
    • No. The method will still be accessible on instances. Depending on the language, clients of the class might not even know the difference.

Introduce Instance Delegator

  • Static methods are hard to fake because you don’t have a polymorphic call seam (unless you’re using class methods in Delphi)
  • Add an instance method that calls the static method
  • Have clients call the instance method instead

Development Speed

  • Mars Spirit Rover
  • Signal takes 7 minutes to get to Mars, 7 minutes to get back
  • In languages like C++, build times can be like this
  • But if you break dependencies so you can unit-test each class in isolation, you can build those unit tests quickly and remove the whole problem

Supercede Instance Variable

  • Supply a setter, instead of passing to the constructor
  • Good way to get past the construction issue
  • Seems horrible from an encapsulation point of view, but becoming more popular among the greenhorns who use Dependency Injection
  • Supply defaults
  • “SupercedeFoo” naming convention, to suggest that it’s not an ordinary set-me-anytime setter

Proactively Making Design Testable

  • TDD

“The Deep Synergy”

  • Whatever you do to make your code testable invariably makes it better
    • Forces decoupled construction
    • Forces fine-grained factoring (which leads to better cohesion)
    • Global state is painful
    • The list goes on
    • A test is a microcosm

Avoid “iceberg classes”

  • Number/weight of private methods overcomes the public methods. Most of its mass is underwater.
  • Usually (always?) has single-responsibility violations
  • You can /always/ extract a class.
  • Again, encapsulation is good, but it is not an unqualified good.

Rules for designing an API

  • Leave your users an “out”
    • Provide them with interfaces and stubs if you can
  • Make sure you provide sufficient seams
  • Be very conscious of your use of:
    • Non-virtuals
    • Static methods (esp. creational methods)
    • Mutable static data
    • Classes that users must inherit
  • Don’t just write tests for your code; also write tests for code that uses your code

Leave a Reply

Your email address will not be published. Required fields are marked *