TechEd 2008 notes: Evolving Frameworks

This session was aimed at people who write frameworks: low-level code used by thousands of people. When you’re writing a low-level framework, you have to be very cautious about how you change APIs, lest you break code in the field. If nobody outside your department consumes your code, and you compile all your code every time you build — probably the most common case — then most of this stuff is only of academic interest to you. But it’s interesting nonetheless.

This is my last session-notes post about TechEd 2008. I’ll probably post more about the con later — I think it’d be interesting, for example, to contrast the philosophies different presenters had about unit-testing best practices — but it’ll probably be a few days or weeks before I get back to that; writing 22 blog posts that are up to my own editorial standards is a lot of work, and I need a break!

Once again, those of you reading via DelphiFeeds are only getting the posts about general programming topics. If you want to also see the ones about features not likely to be directly relevant to Delphi anytime soon (e.g., lambda expressions, expression trees, LINQ, .NET-style databinding, add-ins, F#, the Provider pattern), you can look at my entire list of TechEd 2008 posts.

Evolving Frameworks
Krzysztof Cwalina
Program Manager, .NET Framework team
Microsoft

Team has dual charter:

  • Basic APIs (used to be on BCL team, now on higher-level application model, cross-cutting features)
  • Architectural and design quality of the whole framework
  • Framework produced by many (over 1,000) people. Goal to make it look like it was designed by one person. Consistency guidelines.
  • More recently looking into evolving APIs and improving the evolution process.

Frameworks deteriorate over time

  • OO design community has already done much research into how to cope with changing requirements
  • It’s even worse with APIs
  • Still many forces require changes over time
    • Requirements change
    • Ecosystem changes: new tools, language changes
    • People change

No silver bullet. But there are some techniques to design APIs that will be easier to evolve, and some tricks that allow modifications that used to be breaking.

Slow down framework deterioration

  • With thoughtful architecture
  • With proper API design (micro-design guidelines)
  • With framework evolution idioms

Libraries, Abstractions, Primitives

  • Three different kinds of types in frameworks

Library types

  • Definition: types that are not passed between components. Instantiate, use, then maybe keep a reference or maybe let the GC collect it.
  • Examples: EventLog, Debug.
  • Easy to evolve: leave old in, add new one.
  • Cost to consumers, of introducing duplication, is nonzero. Shouldn’t be done lightly, but is doable.

Primitive types

  • Definition: types that are passed between components and have very restricted extensibility (i.e., no subtype can override any members).
  • Examples: Int32, String, Uri
  • Hard to evolve
  • Little need to evolve. Usually very simple. Not much policy went into designing them.

Abstractions

  • Definition: types that are passed between components and support extensibility (i.e., interfaces or classes with members that can be overridden)
  • Examples: Stream, IComponent
  • Lots of policy; contracts usually quite strict
  • Hard to evolve
  • Unfortunately, there’s quite a bit of pressure to evolve abstractions
  • Extremely difficult to design abstractions out of the blue
    • The most successful abstractions in the .NET Framework are those that have been around for many years
    • “What should a stream do?” is pretty well established.
    • Interface with too few members won’t be useful. Interface with too many members will be hard to implement.

Evolving libraries

  • Can write a new class and tell people to start using it. Problematic if there isn’t a good migration path.
  • Architecture
    • Dependency management
  • Design
  • Toolbox
    • Type forwarders — lets you move a type from one assembly to another without breaking binary compatibility
    • EditorBrowsableAttribute
    • ObsoleteAttribute
  • Some people say a library should be at least 10 times better before you should consider replacing the old one.

Dependency management

  • Mostly applicable to APIs with more than one feature area, esp. if they evolve at a different pace or are used for different scenarios.

Framework Layering

  • Within each layer, have “components” (groups of classes) that each evolve together
  • Manage dependencies between the components
  • Lower layers shouldn’t depend on higher layers

Basics of dependency management

  • API dependency: A depends on B if a type in B shows up in the publicly accessible (public or protected) API surface of a type in A. Might be a parameter type, a base type, even an attribute.
  • Implementation dependency: type in A uses a type in B in its implementation.
  • Circular dependency (including indirectly)
  • Dependency going to a lower layer: OK
  • Dependency going to a higher layer: Not allowed
  • Dependency within a layer: discussed by architects to see if it makes sense

Design principles

  • Focus on concrete customer scenarios
    • Much easier to add to something simple
    • Does this minimal component meet your needs?
  • Keep technology areas in separate namespaces
    • Mainly applies to libraries
    • Single namespace should be self-contained set of APIs that evolve on the same time schedule and in the same way
  • Be careful with adopting higher level APIs (usually libraries) for lower layers
    • E.g., Design a high-level API, then realize you can make it general, so you try to move it to a lower layer.
    • This rarely works when it’s not thought through from the beginning.
    • Don’t do it just because you can.
  • Don’t assume that your library is timeless
    • XML DOM should not be in System.Xml namespace

Toolbox: Type forwarders

[assembly:TypeForwardedTo(typeof(SomeType))]
  • Lets you move a type to a different assembly without breaking already-compiled code
  • Put in assembly where the type used to be
  • Forces a compile-time dependency on the assembly the type has been moved to
    • Can only be used to move a type down?

Toolbox: ObsoleteAttribute

[Obsolete(...)]
public void SomeMethod() {...}
  • Take the API out of developers’ minds. Present simplified view over time of “This is the framework”.
  • Caution: many people think Obsolete is non-breaking, but that’s not entirely true because of “Treat warnings as errors”.
    • “Yes,” you may say, “but that’s only when you recompile.” True, but some application models, like ASP.NET, recompile on the fly.

Toolbox: EditorBrowsableAttribute

[EditorBrowsable(EditorBrowsableState.Never)]
  • Hides from Intellisense, but you can still use it without warnings.
  • Often this is good enough.
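
A minimal sketch of the two attributes side by side. ReportWriter and its members are hypothetical names made up for illustration; only ObsoleteAttribute and EditorBrowsableAttribute come from the framework.

```csharp
using System;
using System.ComponentModel;

// Hypothetical library type, used only to illustrate the two attributes.
public class ReportWriter
{
    // Still callable, but each call site gets compiler warning CS0618,
    // which becomes a build break under "Treat warnings as errors".
    [Obsolete("Use WriteReport(string) instead.")]
    public string Write()
    {
        return WriteReport("default");
    }

    // Typically hidden from IntelliSense for code that references the
    // compiled assembly; callers compile without any warning.
    [EditorBrowsable(EditorBrowsableState.Never)]
    public string WriteLegacy()
    {
        return WriteReport("legacy");
    }

    // The replacement API that new code is steered toward.
    public string WriteReport(string title)
    {
        return "Report: " + title;
    }
}
```

Both old members keep working for already-compiled callers; the difference is only in how loudly the compiler and the editor steer people away from them.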

Evolving primitives

  • Minimize policy (keep them simple)
    • Int32 should be no more than 32 bits on the stack
  • Provide libraries to operate on primitives
    • Consider using extension methods to get usability

Extension methods and policy

// higher level assembly (not mscorlib)
namespace System.Net {
    public static class StringExtensions {
        public static Uri ToUri(this string s) {...}
    }
}
  • Policy-heavy implementation in a library that’s isolated from the primitive
  • High usability because it’s an extension method
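
A filled-in sketch of the same idea. The namespace Contoso.Net.Extensions and the "absolute URIs only" policy are assumptions made for this example, not something shown in the session; the body of the real ToUri was elided on the slide.

```csharp
using System;

// Hypothetical namespace standing in for a higher-level assembly
// than the one that defines string.
namespace Contoso.Net.Extensions
{
    public static class StringExtensions
    {
        // The policy lives here, not in the primitive.
        // Assumed policy for this sketch: only absolute URIs are accepted.
        public static Uri ToUri(this string s)
        {
            if (s == null) throw new ArgumentNullException("s");
            return new Uri(s, UriKind.Absolute);
        }
    }
}
```

With a using directive for that namespace in scope, callers can write "http://example.com/a".ToUri() as if the method were declared on string, while string itself stays policy-free.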

Evolving abstractions

  • HARD!
  • Plan to spend ~10x as long designing abstractions as you do designing primitives or libraries
  • Right level of policy
  • Right set of APIs

Interfaces vs. abstract classes

  • Classes are better than interfaces from an evolution point of view
  • Can’t add members to interfaces, but can add them to classes
  • That’s why it’s Stream instead of IStream
  • Were later able to add timeouts to streams, and it was much easier to add than it would have been with an IStream.
  • Imagine that it had been IStream from the beginning, and later they’d decided to add timeouts.
    • Adding members to an existing framework interface is never allowed.
    • When adding timeout, would have had to make a new descendant interface ITimeoutEnabledStream.
    • Wouldn’t need CanTimeout.
    • Problem is, base types proliferate (e.g. Stream property on a StreamReader). So casts would proliferate as well. And your “is it the right type” is effectively your CanTimeout query.
    • Less usability, since new member doesn’t show up in Intellisense.
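
A rough sketch of why the abstract class is easier to evolve. DataStream and its subclasses are hypothetical; they imitate the CanTimeout pattern described above rather than reproduce the framework's actual Stream source.

```csharp
using System;

// "Version 1" shipped with only Read. "Version 2" adds the timeout members
// as virtual members with safe defaults, so existing subclasses still work.
public abstract class DataStream
{
    public abstract int Read(byte[] buffer, int offset, int count);

    // Added in version 2: the default describes a stream that cannot time out.
    public virtual bool CanTimeout
    {
        get { return false; }
    }

    // Added in version 2: the default refuses, so callers check CanTimeout first.
    public virtual int ReadTimeout
    {
        get { throw new InvalidOperationException("Timeouts are not supported."); }
        set { throw new InvalidOperationException("Timeouts are not supported."); }
    }
}

// Written against version 1; needs no changes when version 2 ships.
public class MemoryDataStream : DataStream
{
    public override int Read(byte[] buffer, int offset, int count) { return 0; }
}

// Written against version 2; opts in to the new members.
public class NetworkDataStream : DataStream
{
    private int readTimeout = 5000;

    public override int Read(byte[] buffer, int offset, int count) { return 0; }

    public override bool CanTimeout
    {
        get { return true; }
    }

    public override int ReadTimeout
    {
        get { return readTimeout; }
        set { readTimeout = value; }
    }
}
```

Had DataStream been an interface, adding CanTimeout and ReadTimeout would have broken MemoryDataStream and every other existing implementer.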

Summary

  • Primitives, abstractions, libraries
  • Dependency management
  • Controlling policy
  • API malleability
    • Classes over interfaces, type forwarders, etc.

Q&A:

Q: Have there been times you did an abstract class and later wished it had been an interface?
A: Really not yet; he’s still waiting to hear from a team who’s done a class and later wishes they hadn’t. There are some situations where you do need interfaces (e.g. multiple inheritance). Sometimes it’s still a judgement call.

Q: Guidance on when to use extension methods?
A: Working on some guidelines for the next version of the Framework Design Guidelines book. There are some proposed guidelines at LINQ Framework design guidelines (scroll down to section 2, then look for the list of “Avoid” and “Consider” bullet points); if those stand the test of time, they’ll eventually become official guidelines.

Q: When would you split a namespace into a separate assembly?
A: When you design assemblies and namespaces, they should be two separate design decisions. Feature areas have a high correlation with namespaces. Assemblies are for packaging, servicing, deployment, performance. Make the decisions separately.

Q: Why not fix design flaws when moving from 1.1 to 2.0?
A: As current policy, they don’t remove APIs. (Not promising that it will never happen.) They think they can evolve the framework in a relatively healthy way. They’re even brainstorming ways to add more things like type forwarders, e.g. moving static methods from one type to another (but no, it’s not in a schedule). Didn’t have some of these mechanisms when they were writing 2.0.

Q: How does the CLR team resolve conflicts when reviewing a design? Consensus? Vote?
A: Many processes at MS revolve around “OARP” groups. One for compatibility, one for side-by-side, etc. Groups of four roles: owner, participants, reviewers, approver (escalation point). Try to concentrate on owner and participants, to reach a conclusion by consensus. When that fails, go to the reviewers, then the approver. Approver rarely has to make the decision; more likely to educate than override.

Q: Long overloaded parameter lists vs. parameter objects?
A: They’ve done overloads in that case. Ideally, each shorter one just loses one parameter from a longer one (be consistent about ordering, etc.) Best if the leading parameters are similar, for Intellisense usability reasons. They do use parameter objects in a few cases, but mostly in cases where you don’t want to, or cannot, have overloads; e.g., an event. Also don’t want an interface with lots of overloads.

TechEd 2008 notes: How LINQ Works

How LINQ Works: A Deep Dive into the Implementation of LINQ
Alex Turner
C# Compiler Program Manager

This is a 400-level (advanced) talk about the implementation of LINQ.

  • What’s the compiler doing behind the scenes? Layers it translates down to
  • Differences between the translation in the object world vs. remote store

Example of LINQ syntax: GetLondoners()

var query = from c in LoadCustomers()
            where c.City == "London"
            select c;
  • They didn’t want to bake any knowledge of how to do queries into the compiler; instead they use libraries, so you could even use your own implementation of Where() if you really wanted to

Where() as it would look with .NET 1.x delegates:

bool LondonFilter(Customer c)
{
    return c.City == "London";
}
...
var query = LoadCustomers().Where(LondonFilter);
  • You don’t really want to make a new method for each filter
  • Solved in .NET 2.0 with anonymous delegates, but they were too wordy to encourage use of functional libraries
  • Rewritten with C# 3.0 lambdas:
var query = LoadCustomers().Where(c => c.City == "London");

Proving what it’s actually compiled to:

  • Use Reflector
  • In Reflector Options, set Optimization to “.NET 1.0”, so it doesn’t try to re-create the LINQ syntax for us
    • Interestingly, it does still show extension-method syntax and anonymous-type instantiations. Have to turn optimizations off entirely to see those, but then you’ll go crazy trying to read the code.
  • Anonymous delegates make:
    • A cache field with a wacky name and a [CompilerGenerated] attribute
    • A method with a wacky name and a [CompilerGenerated] attribute
    • Generated names have characters that aren’t valid in C# identifiers, but that are valid for CLR. Guarantees its generated names don’t clash with anything we could possibly write.
  • Implementing Where: you don’t really want to build a whole state machine. Use iterators instead:
static class MyEnumerable
{
    public static IEnumerable<TSource> Where<TSource>(
        this IEnumerable<TSource> source, Func<TSource, bool> filter)
    {
        foreach (var item in source)
            if (filter(item))
                yield return item;
    }
}
  • I didn’t realize .NET 2 iterators were lazy-initialized. Cool.
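
One way to see the laziness for yourself, as a small sketch. LazyDemo, Calls and IsEven are made-up names; Where here is a plain static method (not an extension method) so it cannot be confused with the one in System.Linq.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Counts how many times the filter has actually run.
    public static int Calls;

    // Same shape as the iterator above, written as an ordinary static method.
    public static IEnumerable<T> Where<T>(IEnumerable<T> source, Func<T, bool> filter)
    {
        foreach (var item in source)
            if (filter(item))
                yield return item;
    }

    // A filter with a visible side effect, so we can tell when it executes.
    public static bool IsEven(int n)
    {
        Calls++;
        return n % 2 == 0;
    }
}
```

Calling LazyDemo.Where(new[] { 1, 2, 3, 4 }, LazyDemo.IsEven) leaves Calls at 0, because none of the iterator body runs until something enumerates the result. Copying the result into a List<int> runs the filter four times and yields 2 and 4.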

Side note: You can set a breakpoint inside an anonymous delegate, or on a lambda expression, even if it’s formatted on the same line of source code as the outer call. Put the cursor inside the lambda and press F9; I don’t think you can click on the gutter to set a breakpoint on anything other than the beginning of the line.

Side note: When you step into the c.City == "London" in the LINQ where clause, the call stack shows it as “Main.Anonymous method”.

var query = from c in LoadCustomers()
            where c.City == "London"
            select new { c.ContactName, c.Phone };
  • Anonymous type:
    • Generated with another C#-impossible name, and it’s generic.
    • Immutable.
    • Default implementations for Equals, GetHashCode, ToString.
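
A short sketch of what those generated members buy you. AnonymousTypeDemo is a hypothetical wrapper; the property names echo the query above.

```csharp
using System;

public static class AnonymousTypeDemo
{
    public static bool SameValuesAreEqual()
    {
        var a = new { ContactName = "Ann", Phone = "555-0100" };
        var b = new { ContactName = "Ann", Phone = "555-0100" };

        // Same property names, types and order within one assembly,
        // so a and b share a single compiler-generated type.
        bool sameType = a.GetType() == b.GetType();

        // The generated Equals and GetHashCode work on property values,
        // not on object identity.
        bool sameValue = a.Equals(b) && a.GetHashCode() == b.GetHashCode();

        return sameType && sameValue;
    }
}
```

Because the properties have no setters, a line like a.Phone = "x" would not compile, which is the immutability noted above.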

LINQ to SQL: We don’t want to do any of this anonymous-delegate generation. Instead, want to translate the intent of the query into T-SQL, so the set logic runs on the server.

Side note: Generated NorthwindDataContext has a Log property. Set it to Console.Out and you’ll get info about the query that was generated for us.

Func<Customer, bool> myDelegate = (c => c.City == "London");
Expression<Func<Customer, bool>> myExpr = (c => c.City == "London");
  • The first is just a delegate.
  • The second is a parse tree.
    • C# samples have an Expression Tree Visualizer that you can download.
  • This runs a different Where method. That’s because here we’ve got an IQueryable<T>, rather than just an IEnumerable<T>.
  • Where() takes an Expression<Func<TSource, bool>> predicate. So the compiler generates an expression tree.
  • Where() just returns source.Provider.CreateQuery(...new expression...), where the new expression is a method call to itself, with parameters that, when evaluated, become the parameters it was called with. (Is your head spinning yet?) It basically just builds the expression-tree version of the call to itself, which is later parsed by LINQ to SQL and turned into an SQL query.
  • LINQ to Objects: code that directly implements your intent
  • LINQ to SQL: data that represents your intent
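
The code-versus-data distinction can be poked at directly. This is only a sketch: ExpressionDemo and its members are invented for the example, and Customer is cut down to one property.

```csharp
using System;
using System.Linq.Expressions;

public class Customer
{
    public string City { get; set; }
}

public static class ExpressionDemo
{
    // Compiled straight to IL: all you can do is invoke it.
    public static readonly Func<Customer, bool> AsDelegate =
        c => c.City == "London";

    // Same lambda text, but the compiler emits a data structure describing it.
    public static readonly Expression<Func<Customer, bool>> AsTree =
        c => c.City == "London";

    // The tree can be inspected as data; the body here is an equality node.
    public static ExpressionType BodyKind()
    {
        return AsTree.Body.NodeType;
    }

    // It can also be turned back into code at run time.
    public static bool RunTree(Customer customer)
    {
        return AsTree.Compile()(customer);
    }
}
```

A provider such as LINQ to SQL takes the first route: it walks the tree and translates it, instead of ever compiling and calling it.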

The difference is all in the extension methods.

TechEd 2008 notes: How not to write a unit test

How not to write a unit test
Roy Osherove
Typemock
Blog: ISerializable.com

All the things no one ever told you about unit testing.

Will have two parts: presentation-like (what really works), and interactive (questions, prioritized).

Early questions:

  • Data access
  • Legacy code (what not to do)
  • Duplication between unit tests and functional tests
  • Testing non-.NET code, e.g. ASP.NET
  • Testing other languages, e.g. F#, IronRuby
  • Unit tests and refactoring
  • Testing UI
  • How do you mock the world?
  • How important are tools? Mocking tools, refactoring, etc. Can you write unit tests with just what VS provides?
  • Did you bring your guitar? — No. Wanted as much time for information as possible.
  • Where did you get your T-shirt? — Was being given away at another conference.

A unit test is a test of a small functional piece of code

  • If a method returns a boolean, you probably want at least two tests

Unit testing makes your developer lives easier

  • Easier to find bugs.
    • That’s the common line. But not necessarily — e.g. if your test has bugs, or if you’re testing the wrong things
    • If you can’t trust your tests to find bugs (and especially if you don’t know you can’t trust them), then the opposite may be true — you may be confident you don’t have bugs when you do.
    • If you don’t trust them, then you won’t run them, they’ll get stale, and your investment in writing them was wasted.
  • Easier to maintain.
    • But 1,000 tests = 1,000 tests to maintain
    • Change a constructor on a class with 50 tests — if you didn’t remove enough duplication in the tests, it will take longer than you think to maintain the tests
    • We will look at ways to make tests more maintainable
  • Easier to understand
    • Unit tests are (micro-level) use cases for a class. If they’re understandable and readable, you can use them as behavior documentation.
    • Most devs give really bad names to tests. That’s not on purpose.
    • Tests need to be understandable for this to be true.
  • Easier to develop
    • When even one of the above is not true, this one isn’t true.

Make tests trustworthy

  • Or people won’t run them
  • Or people will still debug for confidence

Test the right thing

  • Some people who are starting with test-driven development will write something like:
[Test]
public void Sum()
{
    int result = calculator.Sum(1, 2);
    Assert.AreEqual(4, result, "bad sum");
}
  • Maybe not the best way to start with a failing test
  • People don’t understand why you want to make the test fail
  • Test needs to test that something in the real world is true: should reflect the required reality
  • Good test fails when it should, passes when it should. Should pass without changing it later. The only way to make the test pass should be changing production code.
  • If you do TDD, do test reviews.
    • Test review won’t show you the fail-first. But you can ask, “So can you show me the test failing?”

Removing/Changing Tests

  • Don’t remove the test as soon as it starts passing
    • If it’s a requirement today, chances are it’ll still be a requirement tomorrow
    • Duplicate tests are OK to remove
    • Can refactor a test: better name, more maintainability
  • When can a test fail?
    • Production bug — right reason (don’t touch test)
    • Test bug (fix test, do something to production code to make the corrected test fail, watch it fail, fix production code and watch it pass)
      • Happens a lot with tests other people wrote (or with tests you don’t remember writing)
    • Semantics of using the class have changed (fix/refactor)
      • E.g., adding an Initialize method that you have to call before you use the class
      • Why did they make that change without refactoring the tests?
        • Make a shared method on the test class that instantiates and Initializes
    • Feature conflict
      • You wrote a new test that’s now passing, but the change made an old test fail
      • Go to the customer and say, “Which of these requirements do you want to keep?”
      • Remove whichever one is now obsolete

Assuring code coverage

  • Maybe unorthodox, but Roy doesn’t like to use code-coverage tools
    • 100% code coverage doesn’t mean anything. Finds the exceptions, but doesn’t prove the logic.
  • Better: change production code and see what happens
  • Make params into consts
  • Remove “if” checks — or make into consts (if (true)). Will a test fail? If not, you don’t have good coverage.
  • Do just enough of these kinds of tweaks to make sure the test is okay.
  • Test reviews are still valuable if you do pair programming, just maybe less often. Good to bring in someone else who didn’t write the code, with an objective eye.
  • Quick test review of yesterday’s code at end of stand-up meeting?

Avoid test logic

  • No ifs, switches or cases
    • Yes, there are always exceptions, but it should be very rare
    • Probably only in testing infrastructure
    • Most of the time, there are better ways to test it
    • Sometimes people write conditionals when they should be writing two tests
    • Don’t repeat the algorithm you’re testing in the test. That’s overspecifying the test.
  • Only create, configure, act and assert
  • No random numbers, no threads
  • Test logic == test bugs
  • Fail first also assures your test is correct

Make it easy to run

  • Integration vs. Unit tests
  • Configuration vs. ClickOnce
  • Laziness is key
  • Should be able to check out, run all the unit tests with one click, and have them pass.
  • Might need to do configuration for the integration tests, so separate them out.
  • Never check in with failing tests. If you do, you’re telling people it’s okay to have a failing test.
  • Don’t write a lot of tests to begin with, and have them all failing until you finish everything. If you do that, you can’t check in (see previous point). Write one test at a time, make it pass, check in, repeat.

Creating maintainable tests

  • Avoid testing private/protected members.
    • This makes your test less brittle. You’re more committed to public APIs than private APIs.
    • Testing only publics makes you think about the design and usability of a feature.
    • Publics are probably feature interactions, rather than helpers.
    • Testing privates is overspecification. You’re tying yourself to a specific implementation, so it’s brittle, and makes it hard to change the algorithm later.
    • Sometimes there’s no choice; be pragmatic.
  • Re-use test code (Create, Manipulate, Assert) — most powerful thing you can do to make tests more maintainable
  • Enforce test isolation
  • Avoid multiple asserts

Re-use test code

  • Most common types:
    • make_XX
      • MakeDefaultAnalyzer()
      • May have others: one already initialized, with specific parameters, etc.
    • init_XX
      • Once you’ve already created it, initialize it into a specific state
    • verify_XX
      • May invoke a method, then do an assert on the result. Pulling out common code.
  • Suggestion: by default, the word new should not appear in your test methods.
    • As soon as you have two or more tests that create the same object, you should refactor the new out into a make method.

Suggestion: don’t call the method directly from Assert.AreEqual(...). Introduce a temp variable instead. (This relates back to the 3A test pattern.)

Aside: Test structure

  • One possibility: Each project, e.g. Demo.Logan, has tests Demo.Logan.Tests. Benefit: they’re next to each other in Solution Explorer.
  • Test files: one file per tested class?
    • That’s a good way to do it. Aim for a convention like MyClassTests so it’s easy to find.
    • If you need multiple test classes for one tested class, make multiple classes.
    • Consider nested classes: making a MyClassTests, and putting nested classes, one per feature. Make the nested classes be the TestFixtures.
      • Be careful of readability, though.
      • Roy said his preference would be to keep one test class per source file, rather than using nested classes to put them all in one file.
      • Decide for yourself whether you’d prefer one class per file, or all tests for one class in one place.

Enforce test isolation

  • No dependency between tests!
  • If you run into an unintended dependency between tests, prepare for a long day or two to track it down
  • Don’t run a test from another test!
  • You should be able to run one test alone…
  • …or all of your tests together…
  • …in any order.
  • Otherwise, leads to nasty “what was that?” bugs
  • Almost like finding a multithreading problem
  • The technical solution (once you find the problem) is easy. That’s why God created SetUp and TearDown. Roll back any state that your test changed.

Avoid multiple asserts

  • Like having multiple tests
  • But the first assert that fails kills the others. With separate tests, if one test fails, the others still run; asserts aren’t built that way.
  • Exception: testing one big logical thing. Might have three or four asserts on the same object. That’s possible, and doesn’t necessarily hurt.
  • Consider replacing multiple asserts with comparing two objects (and also overriding ToString so you can see how they’re different when the test fails).
    • My experience: this doesn’t work well when the objects get really big and complicated. Could work well for small objects, though.
  • Hard to name
  • You won’t get the big picture (just some symptoms)

Don’t over-specify

  • Interaction testing is risky
  • Stubs -> rule. Mocks -> exception.
  • Mocks can make you over-specify.
  • “Should I be able to use a stub instead of a mock?”
    • A stub is something you never assert against.
  • There’s only one time when you have to use mocks: when A calls a void method on B, and you have no way to later observe what B was asked to do. Then you have to mock B and verify that it was called with the right parameter.
    • If the other class does return a value, then you can test what your class did with that result. You’re testing your class, after all, not that other object — that’s why you’re faking it.
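
To make the stub/mock split concrete, here is a hand-rolled sketch with no mocking library. Every type in it (IRateProvider, IAuditLog, Converter, FixedRateStub, AuditLogMock) is invented for the example.

```csharp
using System;

// Returns a value: a stub is enough, because the test can check
// what the class under test did with that value.
public interface IRateProvider
{
    decimal GetRate(string currency);
}

// Void method with nothing to observe afterwards: this is the mock case.
public interface IAuditLog
{
    void Record(string message);
}

// Hypothetical class under test.
public class Converter
{
    private readonly IRateProvider rates;
    private readonly IAuditLog log;

    public Converter(IRateProvider rates, IAuditLog log)
    {
        this.rates = rates;
        this.log = log;
    }

    public decimal Convert(decimal amount, string currency)
    {
        decimal result = amount * rates.GetRate(currency);
        log.Record("Converted to " + currency);
        return result;
    }
}

// Stub: feeds a canned value in. The test never asserts against it.
public class FixedRateStub : IRateProvider
{
    public decimal GetRate(string currency) { return 2m; }
}

// Mock: remembers the call so the test can assert against it.
public class AuditLogMock : IAuditLog
{
    public string LastMessage;

    public void Record(string message) { LastMessage = message; }
}
```

A test of the conversion asserts on the return value of Convert and ignores FixedRateStub entirely. Only a test about the audit trail asserts on AuditLogMock.LastMessage.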

Readability

  • If you do another test that tests basically the same thing, but with different parameters, he suggests appending “2” to the end of the test name. But that’s assuming you already have a really good naming convention for the base test name! (Remember the serial killer.)
  • Bad: Assert.AreEqual(1003, calc.Parse("-1"));
  • Better:
int parseResult = calc.Parse(NEGATIVE_ILLEGAL_NUMBER);
Assert.AreEqual(NEGATIVE_PARSE_RETURN_CODE, parseResult);
  • If you can send any kind of number, and the specific value you pass doesn’t matter, either use a constant, or use the simplest input that could possibly work (e.g. 1).

Separate Assert from Action

  • Previous example
  • Assert call is less cluttered

TechEd 2008 notes: Best Practices with the Microsoft Visual C# 3.0 Language Features

Still catching up on posting my notes from TechEd last week. I probably would’ve gotten this up last night if I hadn’t been in the basement most of the evening for tornado warnings.

Best Practices with the Microsoft Visual C# 3.0 Language Features
Mads Torgersen
Program Manager for the C# Language
Microsoft

He’s the guy who figures out what features go in the next version of the language, to keep us on our toes.

Goals of this talk

  • Show new features
  • Important do’s and don’ts
  • Introduce LINQ

Despite the name of the talk, more time will be given to C# 3 features than to best practices. Best practices are in there, but they’re not the star of the show. If you’re going to be annoyed by that, start being annoyed now, rather than waiting until the end.

C# 3 in a Nutshell

  • Imperative => Declarative
    • Before: modify state in little bits
    • Leads to a lot of detail in describing how you want things done
    • New: say what you want, rather than how you want it done
    • MS has freedom to give us performance and flexibility

  • How => What
  • Make queries first-class

(Incomplete) list of new features

  • Auto properties
  • Implicitly typed locals
  • Object and collection initializers
  • Extension methods
  • Lambda
  • Queries
  • Anonymous types
  • Expression trees
  • …a couple not shown in this talk

Automatically Implemented Properties

  • Just sucking up to programmers’ laziness; nothing deep
class Customer
{
    public string CustomerID { get; set; }
    public string ContactName { get; set; }
}
  • Simplify common scenario
  • You can see that they’re trivial
  • Limitations
    • No body -> no breakpoints
    • No field -> no default value
  • There can be serialization issues if you change an automatic property to a real property, since the autogenerated field has a magic name that’s stored in your serialized data

Lure of Brevity: Best practices for auto properties

  • Only use this for things that really are simple get/set properties
  • Hold on to your…
    • Get-only and set-only properties
    • Validation logic
  • Private accessors (get; private set;) are usually not the answer — too easy to forget you didn’t intend for them to be set capriciously, and add code a year from now that sets them in an unsafe way
  • Be careful what you make settable:
// Bad
class Customer {
    public string CustomerKey { get; set; }
    // Key really shouldn't be settable
}

Implicitly Typed Locals

  • var keyword, type inference
  • I won’t bother quoting his code snippet, you’ve seen it before
  • Intellisense can show you the actual type — hover over the var
  • Remove redundancy, repetition, clutter
  • Allow focus on code flow
  • Great for experimentation: you can change something’s return type and there’s a much better chance that everything will still compile (Roy would probably say there’s more essence and less ceremony)
  • “Surprisingly liberating experience”

Redundancy is not always bad: best practices for var

  • Explicit types on locals (i.e., not using var) will…
    • Improve readability of complex code, esp. if method name doesn’t make its return type clear
    • Allow typechecking on right-hand side (when you want that)
    • Can be more general than the right-hand side
  • Think: Who is the reader?
  • Find your own compromise between the two extremes

Side note: ObjectDumper class from samples (kind of like .inspect in Ruby)

Object and collection initializers

  • Traditionally very imperative. Start with empty collection, then create an empty Customer, then initialize it, then add it.
  • Lots of intermediate results lying around.
static IEnumerable<Customer> GetCustomers()
{
    var custs = new List<Customer>()
    {
        new Customer {
            CustomerID = "MADST",
            ContactName = "Mads Torgersen",
            City = "Redmond"
        }
    };
    return custs;
}
  • Can omit empty parens after new if you use an object initializer
  • Code-result isomorphism
    • Structure of code parallels structure of object you want.
  • Expression-oriented
    • Can be used in expression context
  • Atomic
    • No intermediate results
    • Create object and collection in one fell swoop. Don’t need temporary variables. Don’t expose any intermediate states at all.
  • Compositional
  • May not need as many constructor overloads

Constructors are still good: best practices for object and collection initializers

  • Constructors…
    • Show intent
    • Enforce initialization
    • Initialize get-only data
  • Initializers and constructors compose well
var c = new Customer("MADST"){
    ContactName = ...
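
A fuller sketch of how the two might compose, assuming a Customer whose ID is required and never changes. The constructor signature and CustomerFactory are guesses for illustration; the slide’s own snippet was cut off.

```csharp
using System;

public class Customer
{
    private readonly string customerID;

    // The constructor enforces the one required, never-changing value.
    public Customer(string customerID)
    {
        if (customerID == null) throw new ArgumentNullException("customerID");
        this.customerID = customerID;
    }

    // Get-only: written out by hand, since an auto property would need a setter.
    public string CustomerID
    {
        get { return customerID; }
    }

    // Genuinely simple get/set data: fine as auto properties.
    public string ContactName { get; set; }
    public string City { get; set; }
}

public static class CustomerFactory
{
    public static Customer MakeSample()
    {
        // Constructor for what must be there, initializer for the optional rest.
        return new Customer("MADST")
        {
            ContactName = "Mads Torgersen",
            City = "Redmond"
        };
    }
}
```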

Extension Methods

  • You’ve seen these demos too (well, maybe not GetLondoners() specifically)
  • Dilemma with very general types: you use them in a specific setting, and sometimes you want a special view on it and wish you could add a couple more methods to the original declaration, just for your use in that setting
  • One really interesting benefit: can add methods to a generic of only certain types, e.g. can have a method on IEnumerable<Customer> that isn’t there on IEnumerable<int>. I like this!
  • Declared like static methods, can call like instance methods
  • New functionality on existing types
  • Scoped by using clauses
  • Interfaces and constructed types
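
A minimal sketch of an extension method on a constructed type. The session’s GetLondoners() demo wasn’t captured in these notes, so this body is a guess at its shape, and CustomerSequenceExtensions is a made-up class name.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string City { get; set; }
}

public static class CustomerSequenceExtensions
{
    // Declared like a static method, called like an instance method.
    // The "this" parameter is IEnumerable<Customer>, so the method is
    // offered on customer sequences and not on, say, IEnumerable<int>.
    public static IEnumerable<Customer> GetLondoners(this IEnumerable<Customer> customers)
    {
        return customers.Where(c => c.City == "London");
    }
}
```

Given a List<Customer> named customers, customers.GetLondoners() compiles, while new[] { 1, 2 }.GetLondoners() does not.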

Cluttering your Namespace: best practices for extension methods

  • Consider making them optional (separate namespace), so people can use your library without necessarily needing your extension methods (extension methods for working with types from MyNamespace.Foo should be in their own namespace, not right in MyNamespace.Foo)
  • Don’t put them on all objects!
  • Make them behave like instance methods.
namespace System
{
    public static class MyExtensions
    {
        // Don't do this
        public static bool IsNull(this object o) {
            return o == null;
        }
    }
}
  • That’s a worst practice. It violates all three of the above guidelines. Don’t do it just because it’s cool.
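
For contrast, here is a sketch that tries to follow all three guidelines. The Order type, the IsLarge method, and the Extensions sub-namespace are all hypothetical; the session only gave the rules, not this code.

```csharp
using System;

namespace MyNamespace.Foo
{
    // Hypothetical library type.
    public class Order
    {
        public decimal Total { get; set; }
    }
}

// Separate namespace: callers opt in with "using MyNamespace.Foo.Extensions;".
namespace MyNamespace.Foo.Extensions
{
    public static class OrderExtensions
    {
        // Extends one specific type rather than object.
        public static bool IsLarge(this Order order)
        {
            // A real instance method fails on a null receiver, so fail here too.
            if (order == null)
                throw new ArgumentNullException("order");
            return order.Total >= 1000m;
        }
    }
}
```

Code that only imports MyNamespace.Foo never sees IsLarge, so the library stays usable without the extensions.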

Lambda Expressions

  • Predicate<T> — function that takes T and returns bool
  • =>: Some call this the “fat arrow”
  • Terse anonymous functions
  • Parameter types inferred from context
  • Closures: capture local state (also true of anonymous methods)

Condensed Power: best practices for lambda expressions

  • Keep them small
    • That’s the point of making them terse
    • Yank them out if they get too big
  • Watch that capture (of local variables, and using them inside the lambda)
    • Can have unexpected results
    • Exposing private state
  • Watch the complexity
    • Functions of functions returning functions…
  • Think: Who executes this lambda, and when?
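
The "watch that capture" warning is easiest to see with a loop. This sketch is mine, not from the slides; CaptureDemo and its method names are made up.

```csharp
using System;
using System.Collections.Generic;

public static class CaptureDemo
{
    // Every lambda captures the same loop variable i, not a copy of its value.
    public static List<Func<int>> CaptureLoopVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);
        // By the time anyone invokes these, the loop has finished and i is 3.
        return funcs;
    }

    // Copying into a fresh local per iteration gives each lambda its own variable.
    public static List<Func<int>> CaptureCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;
            funcs.Add(() => copy);
        }
        return funcs;
    }
}
```

Invoking the lambdas from CaptureLoopVariable gives 3 every time, while those from CaptureCopy give 0, 1 and 2. The lambdas run after the loop, which is exactly the "who executes this, and when?" question.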

Queries

  • Functional: doesn’t mutate the original collection; instead returns a new collection
  • using System.Linq; == “Linq to Objects”
  • Extension methods give you pipelining: customers.Where(...).Select(...)
  • Language integrated — use anywhere! (if you’re using C#)
  • Query expressions for common uses
  • Mix and match query and method syntax
  • Expect deferred execution (can do ToArray)
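
A small sketch of what deferred execution means in practice, using plain LINQ to Objects. The DeferredDemo class is made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns { snapshot length, count seen by the deferred query }.
    public static int[] Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Nothing is evaluated here; the query only describes the work.
        IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

        // ToArray runs the query now and keeps a snapshot of the results.
        int[] snapshot = evens.ToArray();

        // The deferred query sees this change the next time it is enumerated.
        numbers.Add(4);

        return new[] { snapshot.Length, evens.Count() };
    }
}
```

The snapshot holds one element, but the query itself reports two once the list has grown. The query never mutated the list; it just re-read it.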

Beware monadic complexity hell: best practices for queries

  • Another powerful complexifier
  • Do you need to roll your own query provider?
  • Use query pattern for queries only!
    • Avoid abusing query syntax for other magic
    • Even if you know about monads! (Your users don’t)

Anonymous types

  • select new { Name = c.ContactName, c.City } — smart enough to call the second property City
  • Temporary local results
  • Shallow immutability and value equality
  • Does a nice job on the generated classes
    • Value-based equality
    • Good hashcodes
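
A quick sketch of the value-equality behavior, with made-up property values. AnonymousTypeDemo is a hypothetical name.

```csharp
public static class AnonymousTypeDemo
{
    public static bool SameValuesAreEqual()
    {
        // Same property names, types and order, so both use the same generated class.
        var a = new { Name = "Mads", City = "Redmond" };
        var b = new { Name = "Mads", City = "Redmond" };

        // Two distinct instances, yet Equals and GetHashCode go by property values.
        return a.Equals(b)
            && a.GetHashCode() == b.GetHashCode()
            && !object.ReferenceEquals(a, b);
    }
}
```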

Keep it local: best practices for anonymous types

  • If you need a type, make one! Don’t use an anonymous type and work around problems. Only use where they don’t limit you.

Expression trees

  • Runtime object model of code
  • Created from lambda expressions
  • Language independent. LINQ to SQL doesn’t know about C#; it just knows about expression trees.
  • Compile back into delegates on demand. .Compile() method — even if you created it with factories instead of a lambda.
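
To make the "factories instead of a lambda" remark concrete, here is a sketch that builds the same tree both ways. The ExpressionTreeDemo class and the n > 5 predicate are my own example, not from the talk.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // The compiler turns the lambda into a tree because of the target type.
    public static Expression<Func<int, bool>> FromLambda()
    {
        return n => n > 5;
    }

    // The same tree assembled by hand with the Expression factory methods.
    public static Expression<Func<int, bool>> FromFactories()
    {
        ParameterExpression n = Expression.Parameter(typeof(int), "n");
        BinaryExpression body = Expression.GreaterThan(n, Expression.Constant(5));
        return Expression.Lambda<Func<int, bool>>(body, n);
    }
}
```

Either result can be inspected as data (Body, NodeType, Parameters) or turned into a callable delegate with Compile().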

The Lure of Doing Magic: best practices for expression trees

  • You can interpret expression trees any way you like.
  • Don’t!
    • Stay close to expected semantics
    • Avoid special magic names, etc.

Final words

  • C# 3 and LINQ can change the way you code…
    • Declaratively: more of the what, less of the how
    • Eloquently
    • And with lots of queries. Don’t think of queries as something heavyweight for external data.
  • …but they don’t have to!

TechEd 2008 notes: Create Your Own Providers for the Ultimate Flexibility

This was an interesting session. The Provider model seems simple enough. There are a lot of classes involved, each of which is pretty simple (I think he stuck pretty well to the single-responsibility principle), although I had a bit of a hard time keeping track of which class went where in the chain. Sometime later I’ll look back at this and draw some sort of class-interaction diagram to help me figure it out.

One note, though: apparently there’s a ProviderBase class that does some of the boilerplate code for you. He didn’t mention that until the very end (after someone asked him about it). So this is mostly theory about the bits and pieces, rather than necessarily a list of what you need to implement on your own. I haven’t looked at ProviderBase to see how much of a base it gives you to build on.

Create Your Own Providers for the Ultimate Flexibility
Paul D. Sheriff
President
PDSA, Inc.

Samples at pdsa.com/teched, in both VB.NET and C#.

Agenda

  • What’s a provider and why use one
  • Dynamically loading assemblies
  • Creating a custom data provider
  • Implement an ASP.NET Membership provider
    • Store users/roles in XML file

What is a provider?

  • Components that can be loaded at runtime
  • Allows you to switch components without recompiling
  • Patterns
    • Strategy
    • Abstract factory

Why use a provider?

  • Keep dev API consistent
    • Developer always calls a method named SendAFax
  • Implementation can be different

Provider examples

  • ADO.NET Data Provider
    • MSSQL
    • Oracle
    • OLE DB
    • DB2
  • Each implements an interface
    • IDbCommand
    • IDbConnection
  • ASP.NET Membership System
    • Front end always same
      • Login control, CreateUserWizard control
    • User storage can change
    • Change the provider by deploying a new DLL (not recompiling old) and changing the config file
  • ASP.NET session state

How to…

  • Create interfaces and/or base classes
  • Dynamically load
  • Read config files
  • Create providers

Dynamically create the class

  • System.Type.GetType(className)
  • System.Activator.CreateInstance(type)
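
Those two calls fit together roughly like this. SendAFax is the method name from the talk; IFaxSender, ModemFaxSender and ProviderLoader are hypothetical names of mine.

```csharp
using System;

// Hypothetical provider contract.
public interface IFaxSender
{
    string SendAFax(string number);
}

// Hypothetical provider implementation.
public class ModemFaxSender : IFaxSender
{
    public string SendAFax(string number) { return "modem:" + number; }
}

public static class ProviderLoader
{
    // className should be assembly-qualified unless the type lives in the
    // calling assembly or the core library.
    public static IFaxSender Create(string className)
    {
        Type type = Type.GetType(className);
        if (type == null)
            throw new ArgumentException("Type not found: " + className);

        // Requires a public parameterless constructor on the provider.
        return (IFaxSender) Activator.CreateInstance(type);
    }
}
```

The caller only ever sees IFaxSender, so swapping providers means changing the string, which would normally come from the config file.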

Dynamically load assembly

  • Assembly class
    • Use Load if in GAC or current directory
    • Use LoadFile if not
  • Then use the Assembly’s CreateInstance
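
A sketch of the assembly-based route, assuming the assembly can be resolved by name. XmlUserStore and AssemblyLoader are made-up names; for an assembly outside the normal probing paths you would call Assembly.LoadFile with an absolute path instead.

```csharp
using System;
using System.Reflection;

// Hypothetical type to be created by name.
public class XmlUserStore
{
    public string Describe() { return "users stored in XML"; }
}

public static class AssemblyLoader
{
    public static object CreateFrom(string assemblyName, string typeName)
    {
        // Load resolves the assembly by its display name.
        Assembly assembly = Assembly.Load(assemblyName);

        // CreateInstance returns null when the type is not found.
        object instance = assembly.CreateInstance(typeName);
        if (instance == null)
            throw new TypeLoadException("Type not found: " + typeName);
        return instance;
    }
}
```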

Reading the config file

  • ConfigurationManager.AppSettings["ProviderName"]
  • Better: add your own section. But that’s a little more work.

Demo

<configuration>
  <configSections>
    <section name="ConfigSettings" type="ConfigSimple.ConfigSectionHandler, ConfigSimple" />
  </configSections>
  <ConfigSettings type="..." location="..." />
</configuration>

class ConfigSectionHandler : ConfigurationSection
{
    [ConfigurationProperty("type")]
    public string Type
    {
        get { return (string) this["type"]; }
    }
    [ConfigurationProperty("location")]
    public string Location
    {
        get { return (string) this["location"]; }
    }
}
abstract class ConfigProviderBase
{
    public abstract string GetSetting(string key);
    public string Location { get; set; }
}
class ConfigSettings
{
    private static ConfigProviderBase _configProvider;
    private static void InitProvider()
    {
        object section = ConfigurationManager.GetSection("ConfigSettings");
        ConfigSectionHandler handler = (ConfigSectionHandler) section;
        _configProvider = (ConfigProviderBase)
            Activator.CreateInstance(Type.GetType(handler.Type));
        _configProvider.Location = handler.Location;
    }
    public static string GetSetting(string key)
    {
        if (_configProvider == null)
            InitProvider();
        return _configProvider.GetSetting(key);
    }
}

So, here’s provider #1:

class ConfigAppSettings : ConfigProviderBase
{
    public override string GetSetting(string key)
    {
        return ConfigurationManager.AppSettings[key];
    }
}

And, another one that reads from a separate XML file instead:

class ConfigXML : ConfigProviderBase
{
    public override string GetSetting(string key)
    {
        string result = "";
        var xe = XElement.Load(this.Location);
        var setting = (from elem in xe.Elements("Setting")
            where elem.Attribute("key").Value == key
            select elem).SingleOrDefault();
        if (setting != null)
            result = setting.Attribute("value").Value;
        return result;
    }
}

Could have used a whiteboard with some boxes and arrows, because it seems like the naming conventions are confusing. But it’s just a matter of following the pattern.

Has an example of how to do this with a collection of providers specified in the XML (e.g., <providers> <add ... /> <add ... /> <add ... /> </providers>)

Once you start the process, get the boilerplate going, writing a new provider becomes pretty easy.

TechEd 2008 notes: Advanced Unit Testing Topics

He repeated a lot of his favorite patterns that I already took notes about in his earlier session, “Design and Testability”. If you haven’t read that one yet, go check it out.

Advanced Unit Testing Topics
Roy Osherove
Typemock
Blog: http://iserializable.com/

The Art of Unit Testing (book he’s working on)

Assumption for this session: You have some experience with writing tests

Mocking 101

  • Replace dependencies with fakes so you can test something
  • If the Storage class calls a non-virtual void method on EmailSender, you can’t test that interaction.
  • Pass an interface instead of a class. “Dependency injection”. Then we can write a FakeEmailSender and pass that in, and it can do whatever we want.
    • Do nothing
    • Set a flag saying “I got asked to send the e-mail”
  • Can create the fake manually or with a mock-object framework
  • Fake vs. mock
    • Fake: fake to make the system more testable. They won’t fail your test. (In another session, I heard this called a “stub”, with “fake” as a top-level term for both stubs and mocks.)
    • Mock: something we will assert against. These can be used to detect conditions where your test should fail.
  • More complex scenario: force EmailSender to throw an exception, so you can test how StorageManager interacts with ILogWriter.
    • FakeEmailSender, FakeLogWriter — where the LogWriter is a mock (you’ll assert that it actually got asked to log the error), and the e-mail sender is a fake (you don’t assert against it, it just replaces the real one)
    • Usually, in a test, you’ll only have one mock, and maybe many fakes
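
Hand-rolled versions of that "more complex scenario" might look like the sketch below. The class names FakeEmailSender, FakeLogWriter, StorageManager and ILogWriter are from the session; IEmailSender and the body of Store are my assumptions about what the code under test does.

```csharp
using System;

public interface IEmailSender { void Send(string message); }
public interface ILogWriter { void Write(string text); }

// Fake: only there to force the error path; the test never asserts against it.
public class FakeEmailSender : IEmailSender
{
    public void Send(string message)
    {
        throw new InvalidOperationException("mail server down");
    }
}

// Mock: records the interaction so the test can assert on it.
public class FakeLogWriter : ILogWriter
{
    public string LastWritten;
    public void Write(string text) { LastWritten = text; }
}

// Assumed shape of the class under test: dependencies come in as interfaces.
public class StorageManager
{
    private readonly IEmailSender _email;
    private readonly ILogWriter _log;

    public StorageManager(IEmailSender email, ILogWriter log)
    {
        _email = email;
        _log = log;
    }

    public void Store(string data)
    {
        try { _email.Send("stored " + data); }
        catch (InvalidOperationException ex) { _log.Write(ex.Message); }
    }
}
```

A test would construct StorageManager with both fakes, call Store, and assert only on FakeLogWriter.LastWritten.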

Side note: Test names should be extremely readable. Imagine that the person reading your code is a serial killer who knows where you live. Don’t make him mad.

Mock-object libraries

TypeMock:

[Test]
[VerifyMocks]
public void Typemock_Store_StringContainsStar_WritesToLog()
{
    ILogger log = (ILogger) RecorderManager.CreateMockedObject(typeof(ILogger));
    // Tell the mock object what you expect to happen in the future
    using (var r = new RecordExpectations())
    {
        log.Write("*");
        // Tell it to check parameter values, rather than just expecting
        // the method to be called
        r.CheckArguments();
    }
    // Now run the code under test
    var sm = new StorageManager(log);
    sm.Store("*");
}
  • The assert isn’t in the code; it’s implicit in the [VerifyMocks] attribute. Can also do it explicitly through a method call.

RhinoMocks:

[Test]
public void Typemock_Store_StringContainsStar_WritesToLog()
{
    MockRepository mocks = new MockRepository();
    ILogger log = mocks.CreateMock<ILogger>();
    using (mocks.Record())
    {
        log.Write("*");
        // RhinoMocks checks arguments by default.
    }
    // Now run the code under test
    var sm = new StorageManager(log);
    sm.Store("*");
}
  • Looking back at that, I wonder if I missed something, because I don’t see any sort of VerifyMocks() call in that code snippet. I probably missed copying it down from the slide.

What happens when you can’t extract an interface?

  • Derive from the class under test

What happens when you can’t change the design?

  • Can cry to our managers
  • Typemock lets you mock things that are static or private, without modifying the design.
    • So suppose the Storage class instantiates the Logger itself, and you can’t change it. How do you break the dependency so you can test the code?
[Test]
[VerifyMocks]
public void Typemock2_Store_StringContainsStar_WritesToLog()
{
    using (var r = new RecordExpectations())
    {
        new Logger().Write("*");
        r.CheckArguments();
    }
    var sm = new StorageManager2();
    sm.Store("*");
}
  • Yep, it can record, and later fake out, a new call.
  • Uses the .NET Profiling API to do this. Deep black magic.

Mock object frameworks save time and coding, but sometimes fake objects are the simpler solution

Testing ASP.NET WebForms

  • Integration testing: use tools that invoke the browser and check stuff out
  • To test it in-process, and fake out the HttpContext and all that: Ivonna (which uses Typemock underneath)
    • Lets you set values on controls, and then process postbacks in-process

Extending the test framework with extension methods

  • Create a domain-specific language on your objects
"abc".ShouldMatch("\\w");
List<int> ints = new List<int> { 1, 2, 3, 4, 5, 6 };
ints.ShouldContain(4);
3.ShouldBeIn(ints);
5.ShouldBeIn(ints);
  • Make a static class with static methods
  • First parameter has this
  • Call asserts from the extension method
  • Import the namespace in order to be able to use the extension methods
public static void ShouldBeIn(this object o, IEnumerable items)
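
Fleshing that signature out, a sketch could look like this. The ShouldExtensions class name is made up, and the thrown exception stands in for whatever Assert.Fail call your test framework provides.

```csharp
using System;
using System.Collections;

public static class ShouldExtensions
{
    public static void ShouldBeIn(this object o, IEnumerable items)
    {
        foreach (object item in items)
        {
            // object.Equals handles nulls and boxed value types.
            if (object.Equals(item, o))
                return;
        }
        // Stand-in for the test framework's Assert.Fail (assumption).
        throw new InvalidOperationException("Expected " + o + " to be in the collection");
    }
}
```

With the namespace imported, 3.ShouldBeIn(ints) passes quietly when 3 is present and fails the test otherwise.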

Testing (and Mocking) LINQ Queries

  • LINQ is not a test-enabling framework by default
  • Use Typemock, then duplicate the LINQ query in your test (to record the calls)
  • The problem is, you’re over-specifying your tests, because you have to know what query is being run underneath — you’re specifying the internal behavior
  • So, avoid doing this unless you have to
  • May be better to abstract away the queries
  • What about projections? E.g. selecting an anonymous type?
    • Typemock can record anonymous types. He already had a demo for this.
    • Can be really, really weird.
using (RecordExpectations rec = new RecordExpectations())
{
    var p1 = new { Name = "A", Price = 3 };
    rec.CheckArguments();
    string a = p1.Name;
    rec.Return("John");
}
var target = new { Name = "A", Price = 3 };
Assert.AreEqual("John", target.Name);
  • The above actually passes. Yes. Look closer. They’re mocking the return value of the anonymous type, so even though you construct it with “A”, it returns “John”. Like I said above, deep black magic.

Testing WCF

  • Again, maybe the best thing is not to unit-test it. Consider an integration test instead.
  • Integration test: create a ServiceHost, ChannelFactory
    • Still might want to use Typemock for things like the security context
    • Be pragmatic about things. If it’s one line of code to use Typemock, vs. introducing lots of interfaces, maybe it’s worthwhile (as long as it’s maintainable).
  • Worth making a WCFTestBase<T, TContract> to pull out common code.
    • If you do integration testing, get used to base test-fixture classes to remove duplication.

Database-related testing

  • To mock or not to mock the database?
    • Mock:
      • Tests will be fast, no configuration, don’t need a real database.
      • But the database itself also has logic. Keys, indexes, integrity rules, security, triggers, etc.
  • Team System has a tool for testing against databases, but it’s not really “there” yet.
  • Can do an integration test against the database.
  • If you change external data, roll back between tests. Good solution: lightweight transactions with TransactionScope. Create in SetUp (with TransactionScopeOption.RequiresNew), and call Dispose in TearDown.
    • Will work even if the test creates its own transaction (as long as it doesn’t also do RequiresNew).
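
A sketch of the rollback trick, assuming your test framework calls SetUp before and TearDown after every test. RollbackFixture is a made-up name; TransactionScope and TransactionScopeOption.RequiresNew come from System.Transactions.

```csharp
using System;
using System.Transactions;

public class RollbackFixture
{
    private TransactionScope _scope;

    // Call from the framework's SetUp hook.
    public void SetUp()
    {
        // Starts a fresh ambient transaction for everything the test does.
        _scope = new TransactionScope(TransactionScopeOption.RequiresNew);
    }

    // Call from the framework's TearDown hook.
    public void TearDown()
    {
        // Disposing without calling Complete() rolls the transaction back.
        _scope.Dispose();
    }
}
```

Because Complete() is never called, anything the test wrote through a connection enlisted in the ambient transaction is undone at teardown.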

Put integration tests in a separate project, so it’s easy to run the unit tests (which are fast) all the time, without being slowed down by the integration tests (which are slow, require configuration, etc.)

Testing events

  • Hook the event as usual
  • Hook it with an anonymous delegate
  • Do your asserts inside that anonymous delegate
  • Make an eventFired variable and set it true from inside the delegate
  • Assert eventFired in the main test body
  • Yay closures!
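
Here is roughly what that looks like without any test framework. Downloader and its Completed event are hypothetical; in a real test the returned flag would go into an assert.

```csharp
using System;

// Hypothetical class under test.
public class Downloader
{
    public event EventHandler Completed;

    public void Run()
    {
        EventHandler handler = Completed;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

public static class EventTestDemo
{
    public static bool RunFiresCompleted()
    {
        bool eventFired = false;
        var downloader = new Downloader();

        // The anonymous delegate closes over the local eventFired.
        downloader.Completed += delegate { eventFired = true; };

        downloader.Run();
        return eventFired;
    }
}
```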

TechEd 2008 notes: Lessons Learned in Programmer (Unit) Testing

This one was a mixed bag. Some of his lessons learned are worth looking at, like the 3A pattern, and he covered classics like inversion of control and avoiding ExpectedException. But he was pretty dogmatic about a couple of things that just didn’t make much sense.

Lessons Learned in Programmer (Unit) Testing: Patterns and Idioms
James Newkirk, CodePlex Product Unit Manager

Been writing software for close to 25 years.
When he was first hired after college:

  • Company made EEs work in manufacturing for 2 years
    • Goal: teach them how to build things that could be manufactured
  • Software people just started writing stuff
    • Never taught how to build things that were testable, maintainable. Up to them to discover that.

JUnit Test Infected: Programmers Love Writing Tests
(Has anyone actually met programmers who love writing tests?)

  • Programmers weren’t writing tests
  • Couldn’t prove that what they had worked
  • James ported JUnit to .NET as NUnit, released as open-source
  • How can we use open-source to enable different kinds of innovation?

“Unit testing” means something to testing people. Some think, if cyclomatic complexity is 5, then I need 5 unit tests. Not the same thing as tests for test-driven development. So we’ll probably not use the term “unit test”.

What is programmer testing? Per Brian Marick

  • Technology vs. Customer
  • Support vs. Critique
  • Support + technology: Programmer tests
  • Support + customer: Customer tests
  • Critique + customer: Exploratory tests (should still happen during the development process)
  • Critique + technology: “ilities” — non-functional. Scalability, usability, performance, etc.

Why do programmer testing?

  • “There is no such thing as done. Much more investment will be spent modifying programs than developing them initially.” [Beck]
    • Whenever we think we’re finished, we’re probably wrong. There can always be more features.
    • Can define “done” locally: “We’re done with this release.”
    • Programmer testing lets me say as a programmer, “I believe that I am done.”
  • “Programs are read more often than they are written.” [Beck, “Implementation Patterns” book]
    • You’re writing a book on your code.
    • The only artifact of any value is the code. It has to be the communication mechanism.
  • “Readers need to understand programs in detail and concept.” [Beck]

Total development cost:

  • Develop (illustrated in green): small. Fun!
  • Extend/Maintain (orange): big. Sucks, especially if you didn’t write the green part!

“I might break something!”

  • Fear lowers our productivity
    • Programmer tests help, because if you broke something, you know. You know the consequences of your actions.
  • Where do I start?
    • Programmer tests are very useful in this understanding phase

Lesson #1: Write tests using the 3A pattern

  • Attributed to Bill Wake (xp123.com)
    • Arrange — Set up the test harness
    • Act — Run the thing you actually want to test
    • Assert — Verify the results
[Fact]
public void TopDoesNotChangeTheStateOfTheStack()
{
    // Arrange
    Stack<string> stack = new Stack<string>();
    stack.Push("42");
    
    // Act
    string element = stack.Top;
    
    // Assert
    Assert.False(stack.IsEmpty);
}
  • Benefits
    • Readability
    • Consistency
  • Liabilities
    • More verbose
    • Might need to introduce local variables
  • Related issues
    • One assert per test?
      • Much agile dogma says there should only be one assert
      • Pragmatic view: test should only test one thing, but if that one thing takes multiple asserts, that’s fine

Lesson #2: Keep your tests close

  • Tests should be as close as possible to the production code
  • Same assembly?
  • Treat them like production code
  • They have to be kept up-to-date
  • Visibility (let you see internal)
    • (I would argue that, if you think you need to test your internal state, what’s really going on is that you’ve got another object that wants to get out.)
  • Liabilities
    • Should you ship your tests?
      • If so, you need to test your tests. Whole new level.
      • Need to make sure your tests don’t modify the state of the system
      • If not, how do you separate the tests from the code when you release?
    • Various tool issues

Lesson #3: ExpectedException leads to uncertainty

[Test]
[ExpectedException(typeof(InvalidOperationException))]
public void PopEmptyStack()
{
    Stack<string> stack = new Stack<string>();
    stack.Pop();
}
  • Problem #1: where’s the assert? A: it’s in the framework. Violates 3A.
  • Obfuscates the test to some extent.
  • Better:
[Fact]
public void PopEmptyStack()
{
    Stack<string> stack = new Stack<string>();
    Exception ex = Record.Exception(() => stack.Pop());
    Assert.IsType<InvalidOperationException>(ex);
}
  • Makes the exception more explicit.
  • Lets you inspect the exception object.
  • Another possibility is NUnit’s Assert.Throws():
[Fact]
public void PopEmptyStack()
{
    Stack<string> stack = new Stack<string>();
    Assert.Throws<InvalidOperationException>(delegate {
        stack.Pop(); });
}
  • Downside: Act and Assert are in the same spot.
  • Other problems with ExpectedException:
    • Don’t know which line of code threw the exception. Test can pass for the wrong reason. (This, IMO, is the most compelling reason to avoid ExpectedException.)
    • ExpectedException forces people to use TearDown. With Record.Exception or Assert.Throws, you can keep the teardown in the test if you like.

Alternatives to ExpectedException

  • Benefits
    • Readability
    • Identify and isolate the code you expect to throw
  • Liabilities
    • Act and Assert are together in Assert.Throws
    • Anonymous delegate syntax leaves something to be desired

Lesson #4: Small fixtures

  • Your code should be a book for people to understand
  • If the fixture has 1,000 tests, it’s hard for someone to know where to start
  • Better: fixtures focused around a single activity
    • Maybe even a fixture for each method
    • Can use nested classes for this. Outer class is named for the class you’re testing, with nested classes for each behavior.
  • Benefits
    • Smaller, more focused test classes
  • Liabilities
    • Potential code duplication
      • May consider duplicating code if it’s for the purpose of communication (but use this carefully)
    • Issues with test runners — not all can deal with nested classes
  • Related issues
    • Do you need SetUp and TearDown?

Lesson #5: Don’t use SetUp or TearDown

(I smell a holy war)

  • If the tests are not all orthogonal, you can end up with a test that doesn’t need anything from the SetUp method.
  • If SetUp takes a long time, you’d be paying the price for every test, even those that don’t need all of it.
  • If there are tests that don’t need all of SetUp, readability suffers.
  • It’s a hacking point. It’s a place where people can add code that you might or might not like.
  • When asked about the duplication, he suggested pulling the duplicated code into a method, and calling it at the beginning of each test. That wouldn’t necessarily be a totally bad idea, if SetUp was the only thing he was trying to get rid of.
  • Benefits
    • Readability (if it removes duplication)
    • Test isolation (but only if it means you get rid of all fixture state, and put absolutely all your state into local variables)
  • Liabilities
    • Duplicated initialization code
    • Things he didn’t mention, that I will (having run into all of them):
      • Duplicated cleanup code
      • Try..finally in every test to make sure your cleanup code gets run
      • Signal/noise ratio: if a test gets too long, it’s much harder to tell what it’s actually trying to accomplish
      • Poor test isolation: if you have any fields on your fixture class, you will forget to initialize some of them in some of the tests, which will introduce test-order dependencies (since NUnit shares the same fixture instance among all the tests in the suite)
  • Related issues
    • Small fixtures

Lesson #6: Don’t use abstract base test classes

He’s apparently specifically referring to the practice of putting tests on a base class and inheriting them in descendants. We use this technique, though sparingly; you do take a big hit in readability, so it really has to be worth having the same tests (including newly-written tests) apply in more than one place.

What’s wrong with base classes?

  • If the tests don’t all apply to all the derived classes, you’d have a problem. (Um, duh. You’d notice when they failed, wouldn’t you?)
  • Removes duplication at the expense of readability. (Fair point.)
  • He wants us to put the test logic into a utility function instead, and call that from the different descendants. (Trouble is, that doesn’t help make sure that all new tests get added to all the fixtures. You don’t use this pattern unless you want all the tests in all the descendants.)
  • Benefits
    • Readability
    • Test isolation
  • Related issues

Lesson #7: Improve testability with Inversion of Control

  • Martin Fowler article: Inversion of Control Containers and the Dependency Injection pattern
  • Dependency Injection
    • Constructor injection
    • Setter injection
  • Don’t want errors to cascade across all parts of the program.
  • Benefits
    • Better test isolation
    • Decoupled class implementation
  • Liabilities
    • Decreases encapsulation
    • Interface explosion
  • Related issues
    • Dependency injection frameworks are overkill for most applications
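
The two injection styles, sketched by hand with no container. Every name here (IClock, FixedClock, SystemClock, Greeter, Reminder) is made up for illustration.

```csharp
using System;

public interface IClock { int CurrentHour(); }

// Production implementation.
public class SystemClock : IClock
{
    public int CurrentHour() { return DateTime.Now.Hour; }
}

// Test double with a fixed answer.
public class FixedClock : IClock
{
    private readonly int _hour;
    public FixedClock(int hour) { _hour = hour; }
    public int CurrentHour() { return _hour; }
}

// Constructor injection: the dependency is required up front.
public class Greeter
{
    private readonly IClock _clock;
    public Greeter(IClock clock) { _clock = clock; }

    public string Greet()
    {
        return _clock.CurrentHour() < 12 ? "Good morning" : "Good afternoon";
    }
}

// Setter injection: a sensible default that a test may replace.
public class Reminder
{
    public IClock Clock { get; set; }
    public Reminder() { Clock = new SystemClock(); }

    public bool IsDue(int hour) { return Clock.CurrentHour() >= hour; }
}
```

Both classes can now be tested at any "time of day" without touching the system clock. The cost is the one listed under liabilities: an extra interface, and a dependency that is now visible from outside.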

Side note: Mocks are good for interaction testing, bad for state testing.

TechEd 2008 notes: Understanding C# Lambda Expressions

This was a lunch session, so pretty short. It turned out to be pretty introductory-level, just about grokking the syntax (which I pretty much already did).

Understanding C# Lambda Expressions
Scott Cate
myKB.com

Operators: most are very mnemonic, very self-explanatory. But what about =>?

Read => as “reads into” or “feeds into”.

BuildMatrix((a, b) => (a+b).ToString());

History of Delegates in .NET

  • Forget the compiler for a minute. What does the word “delegate” mean? “I don’t have time to (or don’t know how to) do this; I’m going to offload it to someone else and take the credit.”
  • You’ve mastered delegates if you could write a class with an event, and wire up that event, in Notepad, without Intellisense.
  • Define the delegate type. No implementation, just a signature.
  • .NET 1.0: Pass the method name as a parameter (or whatever). (Actually, in 1.0 I think maybe you had to do new DelegateType(MethodName).)
  • .NET 2.0: anonymous delegates. BuildMatrix(delegate(int a, int b) { return (a+b).ToString(); });
    • Return type is inferred.
  • .NET 3.5 (C# 3.0): BuildMatrix((a, b) => (a+b).ToString());
    • The stuff after => is an anonymous delegate. The stuff before it is the parameter list.
    • You can define the parameter types, or the compiler can infer them.
    • Can omit parentheses if there’s only one parameter.
  • Same thing without predefining the delegate type: void BuildMatrix(Func<int, int, string> operate)
  • Don’t need curly braces and don’t need return — can just put an expression. You can also put braces with statements (in which case you do need return).
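
Putting the whole progression side by side. This is a sketch: the real BuildMatrix built an HTML grid, while this stand-in just applies the delegate to one pair of numbers, and the Operation delegate type and MatrixDemo class are names I made up.

```csharp
using System;

// The predefined delegate type: a signature with no implementation.
public delegate string Operation(int a, int b);

public static class MatrixDemo
{
    // Simplified stand-in for the session's BuildMatrix.
    public static string BuildMatrix(Operation operate) { return operate(2, 3); }

    // Same thing without predefining a delegate type.
    public static string BuildMatrixWithFunc(Func<int, int, string> operate) { return operate(2, 3); }

    private static string Add(int a, int b) { return (a + b).ToString(); }

    public static string[] AllStyles()
    {
        return new[]
        {
            BuildMatrix(new Operation(Add)),                                    // C# 1: explicit delegate creation
            BuildMatrix(Add),                                                   // C# 2: method name alone
            BuildMatrix(delegate(int a, int b) { return (a + b).ToString(); }), // C# 2: anonymous delegate
            BuildMatrix((a, b) => (a + b).ToString()),                          // C# 3: lambda, types inferred
            BuildMatrix((int a, int b) => { return (a + b).ToString(); }),      // C# 3: explicit types, braces, return
            BuildMatrixWithFunc((a, b) => (a + b).ToString())                   // C# 3: lambda passed as Func<,,>
        };
    }
}
```

All six calls do the same work; only the amount of ceremony changes.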

Errata — details the presenter got mixed up:

  • Built a grid in HTML, and it looks like he got his X and Y loops reversed. Yes, this is nitpicky.
  • Told us you can make two delegates with the same name but different parameters (one with two ints and one with two decimals). You can’t; it won’t compile (I double-checked). They can have the same name if they have a different number of generic parameters (since the number of generic parameters is part of the name), but you can’t do it if they’re non-generic.
  • Told us the compiler generates a new delegate type behind the scenes when you pass an anonymous delegate (it doesn’t, it uses the actual delegate type that the function expects; I double-checked)

TechEd 2008 notes: How to Make Scrum Really Work

This was a small group, in a small room with a whiteboard, so it was fairly interactive. That means lots of Q&A, which means we jumped all over the place and it looks pretty haphazard in written form. Oh well.

How to Make Scrum Really Work
Joel Semeniuk
Imaginet Resources Corp

Scrum teams are 6.5 times more effective than waterfall teams. (Pity they didn’t cite a source. Anyone got a reference?)

How Scrum works

  • Lots of feedback mechanisms: between team members, re quality of software, with user community
  • Processes that support feedback mechanisms: daily scrums, sprint
    • Sprint = iteration. Generally 2-4 weeks. Design, code, test, deploy.
  • Sprint review: demo to customers, get feedback.
  • Sprint retrospective: Not necessarily every sprint (though that’s debatable, see Juan’s comments). How did it go? How was the process? Did we feel like this was a successful sprint? What made it successful? If weak, what was the problem? Be constructive. Make sure everybody knows what the team did well.
  • Scrum is a process framework: there are no absolutes beyond the key principles. Continuous process improvement (with the retrospect).
  • House: “You don’t tell your dog to stop peeing on the carpet once a year.”

Roles:

  • Scrum Master. Coach. They take stuff out of the way of the team, to make you more productive. Take away impediments.
  • Team members (everyone involved in building the software, so includes QA, people who set requirements, etc.)
  • Pigs and Chickens
    • Chicken doesn’t have deliverables.
    • Pig has skin in the game.
    • When you make ham and eggs, the chicken is involved, but the pig is committed.
    • In some organizations, at the daily scrum, only the pigs talk. Chickens can observe.
    • Pass a token; only the person with the token can talk.

Is the scrum first thing in the morning? — Do all your devs get in at the same time? (Ha.)

How do you keep a daily scrum short? (Especially when remote.) — Scrum Master is the moderator, and will say, “Rat hole.”

  • What did we do yesterday?
  • What are we going to do today?
  • What are our impediments?

Backlog

  • Bucket of stuff that needs to get done
  • Can assign backlog items to sprints
  • Sprint planning: reconfirm what you have, do decomposition to make it more real
  • If you do internal development, you can plan as you go. If you do contracting or fixed-bid work, you need to spend more time on planning.

Scrum is about what’s next. What about management wanting deliverable dates, when Agile tends to be about discovering stuff as we go?

  • Ken Schwaber: Have a bigger preparation phase, lay out a vision
  • Convince the customer that you can allow change
  • Change is gonna happen
  • We’re allowing the customer to change their mind, by re-prioritizing, changing the schedule, replacing features with others of the same size
  • Prioritizing of backlog is absolutely necessary
  • Track everything: changes in priority, changing out features. Just because we’re doing agile doesn’t mean we throw out best practices about change management.
  • Might be able to win projects even if you refuse to do a fixed bid. Can’t fix all three aspects of the Iron Triangle, which makes for an adversarial customer relationship. Can say, “I understand your budget. We think it’ll be this much. Let’s keep features flexible, and stay aware of the business value of each one.”
  • Sometimes Scrum isn’t the right model, especially when there’s a lack of trust.
  • Aside: Every team should have a nap room.

Suggested story pattern: “As a <role> I want <ability> so that <benefit>.”

Team System plug-ins to help manage sprints electronically:

  • Conchango
  • MSF for Agile
  • Lightweight Scrum Process Template
  • eScrum template (Microsoft)

User stories

  • Three things:
    • What I’m trying to do
    • Conversation about that user story: what fields? what reports? Record that in the work item.
    • Acceptance test, in the terminology of the user.
  • Suggestion: “Sprint 0” = planning sprint.
  • Product backlog: User Stories.
  • Sprint backlog: tasks that need to be done to complete those user stories.

Estimation. One possible technique:

  • Rate each story for:
    • Complexity (1 to 5)
    • Business value (1 to 5)
  • If it’s got a complexity of 5, you must decompose.
  • Rock/paper/scissors-style estimation. If you’re off by more than one, we don’t have the same understanding of the problem.
  • Why use a made-up scale instead of hours? — Don’t want it to turn into a budget for the developers. “Student syndrome”. If it was estimated at a day, the programmer thinks they have a day to do it. Instead, the dev should do the necessary work, no less and no more.
  • Try to discourage single-point estimation. (I think they meant single-axis, which is why they suggest rating both complexity and value. It’s been a few days, though, so I might be remembering wrong.)
  • Another suggestion: minimum / most probable / maximum time.
  • Another suggestion: estimate + certainty.
  • The guy who did order-of-magnitude estimation does hours during execution: how many hours spent, how many estimated hours remaining.
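
To make the complexity/value rating and the "off by more than one" rule concrete, here is a minimal F# sketch. The session showed no code for this; the Story record, the function names, and the sample votes are all my own hypothetical choices.

```fsharp
// Hypothetical names throughout; the session showed no code for this.
type Story = { Title : string; Complexity : int; BusinessValue : int }

// Estimates agree when the highest and lowest votes differ by at most one.
let estimatesAgree (votes : int list) =
    List.max votes - List.min votes <= 1

// A complexity of 5 means the story has to be broken into smaller ones.
let mustDecompose story = story.Complexity >= 5

let votes = [2; 3; 2]
let story = { Title = "Export report"; Complexity = 5; BusinessValue = 4 }

printfn "Team agrees: %b" (estimatesAgree votes)
printfn "Decompose '%s': %b" story.Title (mustDecompose story)
```

If the votes were [1; 3], estimatesAgree would return false, which is the cue to go back and talk about what the story actually means.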

Impediments (obstacles)

  • They have a flag on their work-item database called “Issue”. That seems way too slow-response to me — shouldn’t you just walk over and talk to the customer?
  • Risk management.
  • Mitigate risk. How to lower impact or lower probability?
  • Trigger. When is it not a risk anymore, and now a Problem? What’s the contingency?
  • Mitigations are tasks in your backlog.
  • Can become a rat-hole.
  • Bottom line: anticipate problems.

Scrum of Scrums

  • Each team has their own scrum meeting
  • A few people from each team do a combined scrum, so the teams have some idea of where other teams are

What do you do when your backlog is hundreds of stories long?

  • Consider feature-driven development. Major features, feature sets, features. Structure your backlog that way.

They don’t like teams bigger than 7.

Scrum scaling: “team of teams”. But even so, after about 50 people, scaling on team of teams degrades quickly. FDD scales better.

Amazon: Two-large-pizza rule. A team can’t be bigger than can be fed by two large pizzas. (So, two people, right?)

How to deal with scope creep? — Need a big, burly scrum master. Bring the customer in, lay the cards on the table, and ask what they want to give up. As long as you haven’t started the task, you can change the sprint by trading something out.

Time, Resources, Features: Pick 2.

Side quote: “Drive-by chickens”

So you do the estimates, make the commitment, more requirements emerge, and they don’t want to take more time for them? — Maybe you can’t do agile. Do requirements up front and then change-report the snot out of it.

If your velocity suddenly changes and you figure it out mid-sprint, figure out why. Don’t re-estimate; adjust your velocity.

Some metrics:

  • Stories per sprint
  • Complexity points per sprint
  • Burndown chart
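
As a rough illustration of the last two metrics, a burndown is just the committed points minus a running total of what got finished. This is a sketch with made-up numbers (committedPoints and completedPerDay are hypothetical), not anything the presenters showed.

```fsharp
// Hypothetical sprint data: complexity points completed on each day.
let committedPoints = 20
let completedPerDay = [3; 0; 5; 2; 4]

// Burndown: points remaining at the start of the sprint and after each day.
let burndown =
    completedPerDay
    |> List.scan (fun remaining finished -> remaining - finished) committedPoints

// Complexity points actually finished so far in this sprint.
let pointsDone = List.sum completedPerDay

printfn "Burndown: %A" burndown
printfn "Points completed this sprint: %d" pointsDone
```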

Audience suggestion: Anyone can stop the sprint and call a meeting to resolve something.

End of iteration and not done? — Cover it in the review, move it to the next sprint. Drop from sprint release; remove from the build. That work didn’t get done. Milestones: nothing is 82% done.

If you really don’t want wiggle room at the end of the sprint, put the end of the sprint on a Wednesday. Student syndrome — not thinking about it until the last minute.

TechEd 2008 notes: Busy .NET Developer’s Guide to F#

I’m a bit late getting my Thursday and Friday notes posted. Thursday night was Universal Studios, so I didn’t get any posting done then, and by Friday night I was just plain worn out. So here we go again, starting with Thursday morning.

Busy .NET Developer’s Guide to F#
Ted Neward (talking about the slides)
Neward & Associates
www.tedneward.com

Luke Hoban (doing the demos)
F# Program Manager
blogs.msdn.com/lukeh

F# is a functional, object-oriented, imperative and explorative programming language for .NET.

  • OO you already know.
  • Imperative = sequential execution. C#, batch files, etc.
  • Explorative = REPL loop / interactive prompt. Executing code on the fly in Visual Studio.
  • Functional
    • This doesn’t mean “F# is functional” as opposed to “C++ is dysfunctional”.
    • Mathematical definition of “functions”.
    • f(x) = x + 2. So if you pass 5, you get 7, no matter how many times you do it.
    • Immutable; no shared state.
    • This means we can do some substitution. Can break down into pieces, solve for part of it, put it together and get the right answer.
    • Look to create functions that can be composed into higher-order functions.
    • Very concurrency-friendly.
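
Here is a small sketch of what "functions in the mathematical sense" and "composed" look like in F#. Only f comes from the slide; timesTwo and addTwoThenDouble are names I made up for illustration.

```fsharp
// A pure function: the result depends only on the argument.
let f x = x + 2

// Another small pure function (hypothetical, for illustration).
let timesTwo x = x * 2

// (>>) composes two functions, left to right, into a new function.
let addTwoThenDouble = f >> timesTwo

printfn "%d" (f 5)                 // 5 + 2
printfn "%d" (addTwoThenDouble 5)  // (5 + 2) * 2
```

Because neither function touches shared state, f 5 can be replaced by its result anywhere it appears, which is the substitution property mentioned above.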

F#: The combination counts!

  • Strongly typed. Makes sure you didn’t do anything really, really stupid. Historically, compilers have been rather stupid about this: if I say s = "Luke", why do I have to tell the compiler it’s a string? It already knows, after all.
  • Succinct
  • Scalable
  • Libraries
  • Explorative
  • Interoperable
  • Efficient

What is F# for?

  • General-purpose language
    • Can be used for a broad range of programming tasks
  • Some particularly important domains
    • Financial modeling and analysis
    • Data mining
    • Scientific data analysis
    • Domain-specific modeling
    • Academic
  • Speed and power

Install F# addon into Visual Studio

“F# Interactive” window (REPL prompt). This is a dockable window in the IDE. Just type code in. E.g.

1+1;;

Can also highlight code in the code editor and press Alt+Enter, which runs that code in the interactive window; then you don’t need ;; to tell it you’re done. (So what do you do if you use ReSharper and Alt+Enter already does lots of cool stuff? I wonder if keybindings in the editor are language-sensitive.)

// Turn on the significant whitespace option
#light

System.Console.WriteLine "Hello World"

System.Windows.Forms.MessageBox.Show "Hello World"

printfn "Hello World"

The Path to Mastering F#

  • Covered today:
    • Scoping and “let”
    • Tuples
    • Pattern matching
    • Working with functions
    • Sequences, lists, options
    • Records and unions
    • Basic imperative programming
    • Basic objects and types
    • Parallel and asynchronous. Comparing C# examples from MSDN to their F# equivalents.
  • Not covered today:
    • F# libraries
    • Advanced functional/imperative
    • Advanced functional/OO
    • Meta-programming
  • They can’t teach us a programming language in 75 minutes. They just want to make us dangerous. Get us to where we can read the examples.

“let” and scoping

Let: binds values to identifiers

let data = 12
let f x =
     let sum = x + 1
     let g y = sum + y*y
     g x
  • No semicolons. Language can figure out where the statement ends. Also makes use of significant whitespace.
  • No parentheses.
  • Type inference. The static typing of C# with the succinctness of a scripting language. We can specify types when we want to be unambiguous. There’s also some generic stuff that can happen.
  • No real distinction between variables and function declarations.
  • Nested functions. sum and g aren’t visible outside the scope of f. So we have additional encapsulation that C# and VB don’t have.
  • Closures.
  • Last value calculated becomes the return value.
  • Immutability (by default).
let PI = 3.141592654

PI <- 4.0

That’s an error: “This value is not mutable.”

  • Is this a property or a field? We don’t care.
  • <- is assignment.
  • You can make something mutable. But they do the right thing, by default, with respect to concurrency.
open System
open System.IO
open System.Net

let req = WebRequest.Create("http://www.live.com")
let stream = req.GetResponse().GetResponseStream()
let reader = new StreamReader(stream)
let html = reader.ReadToEnd()

html
let http(url: string) =
    let req = WebRequest.Create(url)
    use resp = req.GetResponse()
    use stream = resp.GetResponseStream()
    use reader = new StreamReader(stream)
    let html = reader.ReadToEnd()
    html
  • Actually did provide type annotation in the above example. Sometimes it can’t infer the type. This is one such case, because WebRequest.Create is overloaded. These are fairly rare in the bulk of F# code.
    • I’m curious what happens if you don’t annotate this example; that’s a detail they didn’t show. Compiler error?
  • use is like C#’s using block.
    • I’m curious how it knows how big the scope is, i.e., where in the code to insert the call to Dispose.
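
On the scope question: as I understand it, a use binding is disposed when control leaves the block that contains it (the function body, or a nested branch). The sketch below records events so the order is visible. The tracker helper and the events list are hypothetical, not from the session.

```fsharp
open System

// Records what happened, in order.
let events = ResizeArray<string>()

// Hypothetical helper: a disposable that records when it is disposed.
let tracker (name : string) =
    { new IDisposable with
        member this.Dispose() = events.Add("dispose " + name) }

let demo () =
    use outer = tracker "outer"
    if true then
        use inner = tracker "inner"
        events.Add("inside the if block")
    // inner has already been disposed here; outer is still alive.
    events.Add("end of demo")
    // outer is disposed when the function body ends.

demo ()
events |> Seq.iter (printfn "%s")
```
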
let html2 = ""
html2 <- http("http://www.live.com")
  • Red squiggle under the second html2. Background compilation. This is why you can hover over an identifier and see its type.
  • If you really want to make it mutable:
let mutable html2 = ""
html2 <- http("http://www.live.com")

Lists, Options and Patterns

  • Lists are first-class citizens
  • Options provide a some-or-nothing capability
  • Pattern matching provides particular power
let list1 = ["Ted"; "Luke"]

let option1 = Some("Ted")
let option2 = None

match option1 with
| Some(x) -> printfn "We got an %A" x
| None -> printfn "Nope, got nobody"
  • Semicolons as delimiters inside lists. Yes, it’s silly that it’s not comma. Red squigglies will help you figure out when you get it wrong.
  • Option type: you can either have some of a particular value, or none. (This looks pretty similar in intent to nullable types in C#.)
    • Looks like Some and None are predefined as part of the language or the library, though they didn’t explicitly go into this.
  • Some is a generic; it can contain any type. Generics are very lightweight, almost invisible.
  • Pattern matching. Note that it not only matches Some(x), it also extracts the value (x) for us to use.
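
A slightly more realistic use of options, as a sketch: a lookup that may or may not find something. List.tryFind is a standard library function; findByInitial and describe are names I made up.

```fsharp
let people = ["Ted"; "Luke"]

// List.tryFind returns Some(first match) or None, never null.
let findByInitial (c : char) =
    people |> List.tryFind (fun (name : string) -> name.[0] = c)

// The match both tests which case we have and extracts the value.
let describe result =
    match result with
    | Some name -> sprintf "We got %s" name
    | None -> "Nope, got nobody"

printfn "%s" (describe (findByInitial 'L'))
printfn "%s" (describe (findByInitial 'Z'))
```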

Functions

  • Like delegates + unified and simple
(fun x -> x + 1)

let f x = x + 1

let g = f

val g : int -> int
  • First: Anonymous function.
  • Second: same thing, but we’re giving it a name.
  • Third: creating another identifier that’s bound to the same value as f.
    • We’re starting to treat functions as values.
    • This lets us do interesting things like higher-order functions, which are very difficult to express in a language like C#.
    • Nobody’s saying you can’t do this in C#. But F# syntax is more succinct, and sometimes a little bit more powerful.
    • Nobody’s saying you need to stop using C#. If you want to continue creating tomorrow’s legacy code, nobody’s going to stop you.
  • I think the fourth one isn’t actually code, but rather the thing F# displays when it’s telling you the type. I didn’t quite catch that bit, though, so I’m not sure.
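
To show what "functions as values" buys you, here is a short sketch of functions that return functions and functions that take them as parameters. None of these names (makeAdder, add, addTen, applyToAll) appeared in the session.

```fsharp
// A function that returns a function (a closure over n).
let makeAdder n = fun x -> x + n

// Partial application: supplying one of two arguments yields a new function.
let add a b = a + b
let addTen = add 10

// A higher-order function: it takes a function as a parameter.
let applyToAll f items = List.map f items

let addFive = makeAdder 5
printfn "%A" (applyToAll addFive [1; 2; 3])
printfn "%A" (applyToAll addTen [1; 2; 3])
```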

Lists

let sites = ["http://live.com"; "http://yahoo.com"]
let sites' = "http://live.com"::"http://yahoo.com"::[]
let sites'' = ["http://live.com"] @ ["http://yahoo.com"]
  • :: means “prepend this element to this list”
  • @ appends two lists.
  • Names can have prime and prime-prime suffixes.

Arrays

let sitesArray = [|"http://live.com"; "http://yahoo.com"|]
  • Arrays can be useful and very performant. (Their size is fixed, but unlike lists, individual elements can be reassigned.)
  • They did not go into any details on how arrays are different from lists.
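
Since the talk skipped the differences, here is my own sketch of the main one as far as I know it: array elements can be reassigned in place with <-, while a list can only be extended by building a new list. The variable names other than sitesArray are hypothetical.

```fsharp
let sitesList = ["http://live.com"; "http://yahoo.com"]
let sitesArray = [|"http://live.com"; "http://yahoo.com"|]

// Arrays support in-place update of an element with <-.
sitesArray.[1] <- "http://example.com"

// Lists cannot be updated in place; "changing" one builds a new list.
let moreSites = "http://example.com" :: sitesList

// Converting between the two copies the elements.
let asList = List.ofArray sitesArray
let asArray = Array.ofList moreSites

printfn "%A" sitesArray
printfn "%A" sitesList
printfn "%d" asArray.Length
```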

Tuples

let nums = (1,2,3)
let info = (1,true,"hello")
  • Packages up some data. Type is reported as int * int * int.
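
Getting the data back out of a tuple works by pattern matching in a let binding. A minimal sketch follows; divide and the variable names are my own.

```fsharp
let info = (1, true, "hello")

// Pattern matching in a let binding pulls the tuple apart.
let (count, flag, greeting) = info

// Functions can return several values at once as a tuple.
let divide a b = (a / b, a % b)
let (quotient, remainder) = divide 17 5

// fst and snd work on two-element tuples only.
let pair = (quotient, remainder)
printfn "%d %b %s" count flag greeting
printfn "%d %d" (fst pair) (snd pair)
```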

Computed lists

[ 0 .. 10 ]
[ for i in 0 .. 10 -> (i, i*i) ]
[ for i in 0 .. 20 -> (i, sin(float(i)/20.0*Math.PI)) ]
  • First: just returns a list
  • Second and third: list of tuples

Example: Web crawling

open System.Text.RegularExpressions

let linkPat = "href=\s*\"[^\"h]*(http://[^&\"]*)\""

let msft = http("http://www.microsoft.com")

let msftLinks = Regex.Matches(msft, linkPat)

[ for m in msftLinks -> m.Groups.[1].Value ]

let getLinks (txt:string) =
    [ for m in Regex.Matches(txt,linkPat) -> m.Groups.[1].Value ]

let getFirstLink html =
    let links = getLinks html
    match links with
    | [] -> None
    | firstLink :: _ -> Some firstLink

Unions and pattern matching

type Expression =
    | Num of int
    | Neg of Expression
    | Add of Expression * Expression

let expr = Add(Add(Num 1, Neg(Num 2)), Num 3)
  • Unions are like an enum, but one that can carry data along with it. Very succinct and clear.
let rec evaluate expr =
    match expr with
    | Num(n) -> n
    | Neg(e') -> -(evaluate e')
    | Add(e', e'') -> (evaluate e') + (evaluate e'')

evaluate expr
  • let rec means it’s a recursive function. You have to make that explicit.
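
The same union can be walked for other purposes. As a sketch, here is a second recursive function over the Expression type from the notes that renders it as text instead of evaluating it; show is a name I made up. The compiler warns if a match leaves out one of the union cases.

```fsharp
type Expression =
    | Num of int
    | Neg of Expression
    | Add of Expression * Expression

// Render an expression as text; one branch per union case.
let rec show expr =
    match expr with
    | Num n -> string n
    | Neg e -> "-(" + show e + ")"
    | Add (l, r) -> "(" + show l + " + " + show r + ")"

let expr = Add(Add(Num 1, Neg(Num 2)), Num 3)
printfn "%s" (show expr)
```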

Functions and Composition

let htmlForSites = List.map http sites
  • First argument is a function; second argument is the list. Returns a new list with the function results.
  • If you look at the type of List.map, it’s val map : ('a -> 'b) -> 'a list -> 'b list. The leading ' marks a generic type parameter.
sites
|> List.map http
|> List.map (fun x -> x.Length)
|> List.fold_left (+) 0
  • |> is the piping operator.
  • List.fold_left is used to sum all the lengths.
  • Three different kinds of functions:
    • Defined explicitly, with a name: http
    • Anonymous function: (fun x…
    • Operator, passed around and used as data: (+)

GUI and databinding

open System.Drawing
open System.Windows.Forms

let form = new Form(Visible = true,
                    Text = "A simple F# form",
                    TopMost = true,
                    Size = Size(500, 400))

let data = new DataGridView(Dock = DockStyle.Fill)
form.Controls.Add(data)

data.DataSource <- [| ("http://www.live.com", 1) |]
// more stuff to set column headers

let limit = 50
let collectLinks url = getLinks (try http(url) with _ -> "")
  • An array of tuples can be bound to a datagrid. How cool is that?
  • try keyword is used to say “ignore anything that throws an exception”. If F# has anything like try..finally or try..catch, they didn’t demo it today.
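
For what it's worth, F# does have both forms: try ... with plays the role of try/catch (the demo's "with _" is a catch-all), and try ... finally exists as a separate construct. A minimal sketch, with function names of my own invention:

```fsharp
open System

// try ... with: patterns select which exception types to handle.
let parseOrZero (text : string) =
    try
        Int32.Parse(text)
    with
    | :? FormatException -> 0
    | :? OverflowException -> 0

// try ... finally: the cleanup block always runs.
let withCleanup (action : unit -> int) =
    try
        action ()
    finally
        printfn "cleanup ran"

printfn "%d" (parseOrZero "42")
printfn "%d" (parseOrZero "not a number")
printfn "%d" (withCleanup (fun () -> 7))
```
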
let rec crawl visited url =
    data.DataSource <- List.to_array visited
    Application.DoEvents()
    if (visited |> List.exists (fun (site,_) -> site = url))
      or visited.Length >= limit
    then
        visited
    else
        let links = collectLinks url
        let visited' = (url, links.Length) :: visited
        List.fold_left crawl visited' links

crawl [] "http://live.com"

Object Oriented + Functional

#light

namespace Library

type Vector2D(dx:double, dy:double) =
  member v.DX = dx
  member v.DY = dy
  member v.Length = sqrt(dx*dx+dy*dy)
  member v.Scale(k) = Vector2D(dx*k, dy*k)
  • F# doesn’t differentiate between properties and methods. They want you thinking about the problem, not about software engineering.
  • Constructor parameters are in the type declaration itself. That’s the primary constructor that all others defer to, ’cause that’s usually what you do. No wasted syntax initializing the fields; it’s constructor signature, field declarations, and assignments all in one.
  • When you build types in F#, you can see them in C#. DX, DY, and Length are properties; Scale is a method.
  • F# doesn’t automatically realize that fields are immutable and it can precompute the length. F# is immutable by default, but it’s hard to figure out how aggressive to be. (That doesn’t make a whole lot of sense based on what I know so far — you have to explicitly say when you want something to be mutable — but I guess there could be more going on somewhere.) If calculating Length is expensive, you have to precompute it yourself, thusly:
type Vector2D(x : float, y : float) =
    let norm = sqrt(x*x + y*y)
    member this.Length = norm
  • That let becomes constructor code.
override this.ToString() = "hello"
  • Intellisense works — type override this. and a completion list pops up.
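
Putting the pieces from this section together, here is a sketch of one complete type plus some calling code. The members come from the slides; the ToString body and the sample values are my own.

```fsharp
type Vector2D(dx : float, dy : float) =
    // let bindings in a class run once, as part of the primary constructor.
    let length = sqrt (dx * dx + dy * dy)
    member v.DX = dx
    member v.DY = dy
    member v.Length = length
    member v.Scale(k) = Vector2D(dx * k, dy * k)
    override v.ToString() = sprintf "(%g, %g)" dx dy

let v = Vector2D(3.0, 4.0)
let w = v.Scale(2.0)
printfn "%s has length %g" (v.ToString()) v.Length
printfn "%s has length %g" (w.ToString()) w.Length
```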

Case Study

  • The Microsoft adCenter Problem
  • Cash-cow of Search
  • Selling “web space” at www.live.com and www.msn.com.
  • What’s the chance, if I put up a certain ad, that a user will click on it?
  • Internal competition for a better clickthrough-prediction algorithm
  • Winning team used F#

Their observations

  • Quick coding. Less typing, more thinking.
  • Agile coding
  • Scripting. Interactive “hands-on” exploration of algorithms and data over smaller data sets.
  • Performance. Immediate scaling to massive data sets.
  • Memory-faithful
  • Succinct. Live in the domain, not the language. They were able to have non-programmers helping with the code.
  • Symbolic
  • .NET integration

Taming asynchronous I/O

  • Pages of C# code
let ProcessImageAsync(i) =
    async { let inStream = File.OpenRead(sprintf "source%d.jpg" i)
            let! pixels = inStream.ReadAsync(numPixels)
            let pixels' = TransformImage(pixels, i)
            let outStream = File.OpenWrite(...
            do! outStream.WriteAsync(pixels')
            do Console.WriteLine "done!" }

let ProcessImagesAsync() =
    Async.Run (Async.Parallel
                [ for i in 1 .. numImages -> ProcessImageAsync(i) ])
  • let! means “do this asynchronously”
  • Async.Parallel means “do this stuff in parallel, and just make it work”
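
The slide code depends on pieces that weren't shown (numPixels, TransformImage, the stream extensions), so here is a self-contained sketch of the same pattern. It uses Async.Sleep as a stand-in for real I/O, and Async.RunSynchronously, which is the name later F# releases use for Async.Run. The function names are hypothetical.

```fsharp
// Stand-in for real I/O: wait a little without blocking a thread, then return.
let processItemAsync (i : int) =
    async {
        do! Async.Sleep 100
        return i * i
    }

let processAllAsync () =
    [ for i in 1 .. 5 -> processItemAsync i ]
    |> Async.Parallel            // combine into one Async<int[]>
    |> Async.RunSynchronously    // block until every item is done

let results = processAllAsync ()
printfn "%A" results
```

The results array comes back in the same order as the input list, regardless of which item finishes first.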

F# Roadmap

  • May 2008: MS Research F# refresh release
    • Language and libraries clean-up and completion
  • Fall 2008: F# CTP release
    • Language decisions finalized
    • Foundation of full integration into Visual Studio
  • Future: releases with betas and RTM of next version of Visual Studio
    • Either as an add-in or in the box

Resources