Defining testing data in Swift

When writing unit tests, we’re most often looking to test one of our functions or types in isolation — as a unit. However, most code doesn’t operate in a vacuum, and often requires both external dependencies — and data — in order to be able to perform its work. While a big part of writing testable code comes down to how our dependencies are managed, and how easily they can be substituted while testing, how we structure and manage our testing data is often equally important.

This week, let’s take a look at a few different techniques that can enable us to define such testing data more easily, and how reducing the amount of code needed to do so can have a big impact on how much effort is required to both read and manage our tests.

Input and verification

Within unit tests, data is often used for two distinct purposes. First, to satisfy a given API requirement, even if the data in question isn’t important for the outcome of the test itself — and second, to provide a way to verify the outcome of the code that we’re testing.

For example, let’s say that we’re building an app that includes some form of contact list, and that we want to write a test that verifies that our list correctly contains a given contact after it was added. That might look something like this:

class ContactListTests: XCTestCase {
    func testAddingContact() {
        var list = ContactList()

        let contact = Contact(
            name: "John Sundell",
            email: "contact@swiftbysundell.com"
        )

        // Make sure that the list doesn't contain the contact
        // initially (to avoid persistence-based flakiness):
        XCTAssertFalse(list.contains(contact))

        // Add the contact to the list, and verify that its
        // 'contains' API correctly returns 'true':
        list.add(contact)
        XCTAssertTrue(list.contains(contact))
    }
}

The above test is quite straightforward, thanks to the fact that the required data model (Contact in this case) is really simple. Since it only has two required properties, and it’s equatable, we can simply create an instance of it inline and use it both as input and for verification.
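
For reference, here’s a minimal sketch of what the two types used in that test might look like. Both Contact and ContactList are assumptions made for illustration (the real list could, for example, be backed by some form of persistence), so this is one possible shape rather than the actual implementation:

```swift
// Hypothetical model: two required properties, with its
// Equatable conformance synthesized by the compiler.
struct Contact: Equatable {
    var name: String
    var email: String
}

// Hypothetical in-memory list, backed by a plain array.
struct ContactList {
    private var contacts = [Contact]()

    mutating func add(_ contact: Contact) {
        contacts.append(contact)
    }

    func contains(_ contact: Contact) -> Bool {
        return contacts.contains(contact)
    }
}

var list = ContactList()

let contact = Contact(
    name: "John Sundell",
    email: "contact@swiftbysundell.com"
)

list.add(contact)
```

Since Contact is a value type that’s compared by its properties, any instance with the same name and email would be considered to be contained in the list, which is what lets the same value act as both input and verification.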

However, many of our data models are likely to be a bit more complex than that. For example, let’s now say that we’re working on an app that lets the user browse and manage a collection of books, and that our core Book model looks like this:

struct Book: Codable, Equatable, Identifiable {
    let id: Identifier<Book>
    var name: String
    var genres: [Genre]
    var author: Author
    var chapters: [Chapter]
}

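Above we’re using the Identifiable protocol and the Identifier type from “Type-safe identifiers in Swift”, which let us define identifiers that are unique for a given type. Note that this custom Identifiable protocol is not the same as the standard library’s protocol of the same name, which was introduced in Swift 5.1.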
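
To make the rest of the examples easier to follow, here’s a rough sketch of what such a pair of types could look like. This is a simplified assumption rather than the implementation from the referenced article: it hard-codes String as the raw value type, while the code later in this article implies that the real protocol has a RawIdentifier associated type. The trimmed-down Author and Book models are hypothetical as well:

```swift
// Simplified, hypothetical versions of the two types.
protocol Identifiable {
    var id: Identifier<Self> { get }
}

// The generic 'Value' parameter is never stored. It only
// ties each identifier to the type that it identifies.
struct Identifier<Value>: Hashable, ExpressibleByStringLiteral {
    let rawValue: String

    init(rawValue: String) {
        self.rawValue = rawValue
    }

    init(stringLiteral value: String) {
        rawValue = value
    }
}

// Hypothetical, trimmed-down models:
struct Author: Identifiable {
    let id: Identifier<Author>
}

struct Book: Identifiable {
    let id: Identifier<Book>
}

let bookID: Identifier<Book> = "1"
let authorID: Identifier<Author> = "1"

// Comparing 'bookID' with 'authorID' would not compile,
// since the two identifiers have different types.
```

With the string literal conformance in place, identifiers can be written as plain strings wherever the expected type is known, which is what the tests below rely on.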

While, ideally, models should be kept as simple as possible — sometimes we do need them to carry a fair amount of different data. Our above Book model isn’t poorly structured in any way, and all of its properties are indeed required to describe a book within our app — however, as we start to write tests for code that uses that model, we run into a problem:

class LibraryTests: XCTestCase {
    func testQueryingBooksByAuthor() {
        var library = Library()

        // Creating even the simplest set of books requires quite
        // a lot of code, since we have to provide values for all
        // of our model's properties:
        let books = [
            Book(
                id: "1",
                name: "Book1",
                genres: [],
                author: "John Appleseed",
                chapters: []
            ),
            Book(
                id: "2",
                name: "Book2",
                genres: [],
                author: "John Appleseed",
                chapters: []
            )
        ]

        library.add(books)

        let booksByAuthor = library.books(by: "John Appleseed")
        XCTAssertEqual(booksByAuthor, books)
    }
}

The above works, but it’s a bit of a shame that more than half of our test’s code is dedicated to just setting up our model data, which makes it harder to quickly understand what functionality we’re actually testing.

Like with most problems within programming and computer science, there are many different solutions that we could explore here. We could, for example, make our Book model define default values for some of its properties, so that we don’t have to pass values for them within our tests. However, that isn’t necessarily something that we’d want for our production code.
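
As a quick sketch of that first option, here’s what default values could look like on a trimmed-down, hypothetical version of our model (plain strings are used instead of the real Identifier, Genre, Author and Chapter types to keep the example self-contained):

```swift
// Hypothetical, simplified version of the Book model.
struct Book: Equatable {
    let id: String
    var name: String
    var genres: [String] = []
    var author: String
    var chapters: [String] = []
}

// Since Swift 5.1, properties that have default values can be
// omitted when calling the compiler-generated memberwise initializer:
let book = Book(id: "1", name: "Book1", author: "John Appleseed")
```

The trade-off is that those defaults become part of the model itself, so production code could also create a Book without specifying its genres or chapters, whether or not that makes sense for the app.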

Another option would be to create a helper method within our test case, which would let us move the complexity of creating our model data out from the tests themselves — like this:

extension LibraryTests {
    func makeBooks(withAuthor author: Author,
                   count: Int) -> [Book] {
        return (0..<count).map { index in
            Book(
                id: Identifier(rawValue: "\(index)"),
                name: "Book\(index)",
                genres: [],
                author: author,
                chapters: []
            )
        }
    }
}

Using the above, we could make our test from before a lot easier to read — as it would now let us focus all of our attention on what’s actually being tested:

class LibraryTests: XCTestCase {
    func testQueryingBooksByAuthor() {
        var library = Library()

        let books = makeBooks(withAuthor: "John Appleseed",
                              count: 2)

        library.add(books)

        let booksByAuthor = library.books(by: "John Appleseed")
        XCTAssertEqual(booksByAuthor, books)
    }
}

However, the improvement that we just made is specific to both a single test case and a single type of model, meaning that we’ll probably need to scatter similar utility methods all over our unit testing suite, which isn’t ideal.

Stubbing values

What we’re essentially doing when defining testing data, like we did above, is creating stubs. In contrast to mocking, which enables us to create “fake” versions of our various types with test-specific behaviors, stubbing is when we define a set of values that don’t modify our code’s behavior, but rather serve as predictable input to the functionality that we’re testing.
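
To make that distinction a bit more concrete, here’s a small sketch that puts the two side by side. The BookLoader protocol, its mock, and the simplified Book model are all hypothetical and only exist to illustrate the difference:

```swift
// Hypothetical, simplified model.
struct Book: Equatable {
    var name: String
    var author: String
}

// Hypothetical dependency that some code under test might use.
protocol BookLoader {
    func loadBooks() -> [Book]
}

// A mock: a test-specific implementation with its own
// behavior, which also records how it was used.
final class BookLoaderMock: BookLoader {
    private(set) var loadCallCount = 0
    var booksToReturn = [Book]()

    func loadBooks() -> [Book] {
        loadCallCount += 1
        return booksToReturn
    }
}

// A stub: just a predictable value, with no behavior of its own.
let bookStub = Book(name: "Book", author: "Author")

let loader = BookLoaderMock()
loader.booksToReturn = [bookStub]
let loadedBooks = loader.loadBooks()
```

The two are often used together, like above, where the mock controls how a dependency behaves and the stub is the data that flows through it.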

Since stubbing as a concept isn’t specific to our Book model, or to any other model that we might have, we should be able to generalize it into something that we can reuse — both within our project, and across projects as well.

To do that, let’s start by creating a protocol that lets us declare any type as being capable of being stubbed, which we only add to our unit testing target:

protocol Stubbable: Identifiable {
    static func stub(withID id: Identifier<Self>) -> Self
}

We make our above stub method take an identifier as its only argument, since any model’s identifier is likely to be a constant that can’t be changed once the stub has been created. We also define our Stubbable protocol as a specialization of Identifiable, so that we’ll be able to retain full type safety when it comes to our identifiers.

With the above in place, we’re now able to encapsulate the stubbing logic for each model within extensions that make those models conform to Stubbable:

extension Book: Stubbable {
    static func stub(withID id: Identifier<Book>) -> Book {
        return Book(
            id: id,
            name: "Book",
            genres: [],
            author: "Author",
            chapters: []
        )
    }
}

There was of course nothing stopping us from defining such stubbing methods before — without introducing a protocol — but now that we have a shared abstraction for all of our stubs, we can start to build some really nice APIs that’ll apply to any of our testing data.

Let’s start by making it easy to tweak the testing data for each test, allowing us to do things like assign a specific Author to each Book stub — which we’ll need in order to be able to migrate our testQueryingBooksByAuthor test from before to our new stubbing API. One way to achieve that would be to extend Stubbable with a method that changes the value for one of our model’s key paths — like this:

extension Stubbable {
    func setting<T>(_ keyPath: WritableKeyPath<Self, T>,
                    to value: T) -> Self {
        var stub = self
        stub[keyPath: keyPath] = value
        return stub
    }
}

With the above in place, we’re now able to easily define any stub using a very declarative syntax, and without needing to keep track of any mutable models:

let book = Book
    .stub(withID: "id")
    .setting(\.author, to: "John Appleseed")
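
One detail worth highlighting is that setting never mutates the value that it’s called on, it always returns a modified copy. Here’s a self-contained sketch of that behavior, which assumes a simplified Book model and a simplified Stubbable protocol that both use plain String identifiers instead of the Identifier type:

```swift
// Hypothetical, simplified model.
struct Book: Equatable {
    let id: String
    var name: String
    var author: String
}

// Simplified version of the protocol, using a String identifier.
protocol Stubbable {
    static func stub(withID id: String) -> Self
}

extension Stubbable {
    // Same technique as above: copy, mutate the copy, return it.
    func setting<T>(_ keyPath: WritableKeyPath<Self, T>,
                    to value: T) -> Self {
        var stub = self
        stub[keyPath: keyPath] = value
        return stub
    }
}

extension Book: Stubbable {
    static func stub(withID id: String) -> Book {
        return Book(id: id, name: "Book", author: "Author")
    }
}

let original = Book.stub(withID: "id")

let modified = original
    .setting(\.author, to: "John Appleseed")
    .setting(\.name, to: "Swift Testing")

// 'original' still has its default name and author here.
// Also, since 'id' is declared using 'let', passing '\.id'
// to 'setting' would not compile, as that key path isn't writable.
```

That means that a base stub can safely be shared between several tests or assertions, with each one deriving its own variation from it.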

That’s really cool, but we’re just getting started. Since we now have a common protocol for all of our stubbing needs, we can define all sorts of utilities using it. For example, in order to be able to easily create arrays of stubbed data models, we could extend Array using a generic type constraint to add an API for that:

extension Array where Element: Stubbable, Element.RawIdentifier == String {
    static func stub(withCount count: Int) -> Array {
        return (0..<count).map {
            .stub(withID: Identifier(rawValue: "\($0)"))
        }
    }
}

We could also extend MutableCollection (the standard library protocol that collections like Array and ArraySlice conform to in order to support subscript-based mutations) to be able to change each element’s value for a given key path in one go:

extension MutableCollection where Element: Stubbable {
    func setting<T>(_ keyPath: WritableKeyPath<Element, T>,
                    to value: T) -> Self {
        var collection = self

        for index in collection.indices {
            let element = collection[index]
            collection[index] = element.setting(keyPath, to: value)
        }

        return collection
    }
}

With that small but powerful collection of stubbing APIs in place, we can now go back to our tests and replace their manual data definitions with our new system — making our tests easier to read without having to set up any additional utility methods:

class LibraryTests: XCTestCase {
    func testQueryingBooksByAuthor() {
        var library = Library()

        let books = [Book]
            .stub(withCount: 3)
            .setting(\.author, to: "John Appleseed")

        library.add(books)

        let booksByAuthor = library.books(by: "John Appleseed")
        XCTAssertEqual(booksByAuthor, books)
    }
}

Even though our stubbing system is really small, implementing something like that may at first seem like “over-engineering”. While such a system definitely isn’t necessary unless we need to stub multiple kinds of data models, chances are high that we actually do. The fact that we’ve now created a common API for all sorts of stubbing also makes it possible to define entire hierarchies of models with very little effort:

let bundle = Book.Bundle
    .stub(withID: "fantasy")
    .setting(\.name, to: "Fantasy Bundle")
    .setting(\.books, to: [Book]
        .stub(withCount: 5)
        .setting(\.genres, to: [.fantasy])
    )
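
The Book.Bundle type used above isn’t defined in this article, so here’s a sketch of how all of the pieces might fit together for such a nested model. The Genre and Book.Bundle definitions are assumptions, and all identifiers are plain strings to keep the example self-contained:

```swift
// Hypothetical, simplified models.
enum Genre: Equatable {
    case fantasy, sciFi
}

struct Book: Equatable {
    let id: String
    var name: String
    var genres: [Genre]
}

extension Book {
    struct Bundle: Equatable {
        let id: String
        var name: String
        var books: [Book]
    }
}

// Simplified version of the protocol, using a String identifier.
protocol Stubbable {
    static func stub(withID id: String) -> Self
}

extension Stubbable {
    func setting<T>(_ keyPath: WritableKeyPath<Self, T>,
                    to value: T) -> Self {
        var stub = self
        stub[keyPath: keyPath] = value
        return stub
    }
}

extension Book: Stubbable {
    static func stub(withID id: String) -> Book {
        return Book(id: id, name: "Book", genres: [])
    }
}

extension Book.Bundle: Stubbable {
    static func stub(withID id: String) -> Book.Bundle {
        return Book.Bundle(id: id, name: "Bundle", books: [])
    }
}

extension Array where Element: Stubbable {
    // Creates stubs with the IDs "0", "1", "2", and so on.
    static func stub(withCount count: Int) -> Array {
        return (0..<count).map { Element.stub(withID: "\($0)") }
    }
}

extension MutableCollection where Element: Stubbable {
    func setting<T>(_ keyPath: WritableKeyPath<Element, T>,
                    to value: T) -> Self {
        var collection = self

        for index in collection.indices {
            collection[index] = collection[index].setting(keyPath, to: value)
        }

        return collection
    }
}

// Split into two steps here, to keep each expression simple:
let fantasyBooks = [Book]
    .stub(withCount: 5)
    .setting(\.genres, to: [Genre.fantasy])

let bundle = Book.Bundle
    .stub(withID: "fantasy")
    .setting(\.name, to: "Fantasy Bundle")
    .setting(\.books, to: fantasyBooks)
```

Each new model only needs its own small stub method to become part of the system, while the key path and collection utilities come for free through the protocol extensions.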

Another common solution when it comes to setting up more complex hierarchies of testing data is to use JSON files, and to then decode such files into models within test methods. While such a solution definitely has the advantage of moving all of our data definitions out from our tests, it also creates a bit of a disconnect between a test and its data. That tends to make verification trickier, and also requires us to constantly jump back and forth between two files in order to understand what data a test is working with.
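
For comparison, here’s a minimal sketch of what that JSON-based approach might look like. The trimmed-down Book model is hypothetical, and the JSON is inlined as a string only to keep the example self-contained. In a real test target, it would typically be loaded from a separate resource file instead:

```swift
import Foundation

// Hypothetical, simplified model.
struct Book: Codable, Equatable {
    let id: String
    var name: String
    var author: String
}

// Stand-in for the contents of a separate JSON file.
let json = """
[
    {"id": "1", "name": "Book1", "author": "John Appleseed"},
    {"id": "2", "name": "Book2", "author": "John Appleseed"}
]
"""

func loadBooks() throws -> [Book] {
    let data = Data(json.utf8)
    return try JSONDecoder().decode([Book].self, from: data)
}

let decodedBooks = try loadBooks()
```

Any assertion made against decodedBooks would need to repeat values, such as the author’s name, that are only visible within the JSON itself, which is the kind of disconnect described above.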

By still defining all of our testing data inline, but doing so in a much more lightweight way, we can keep our tests focused on the functionality that we’re actually testing, while still having all of our data close at hand.

Conclusion

Defining what data a test should use for both input and verification is arguably just as important as how the type that’s being tested is set up, or how we’re structuring our assertions and other kinds of verification. By unifying how our testing data is defined, using a single shared abstraction, we can more easily start to build up a collection of utilities that makes both reading and writing tests much easier.

Any sort of abstraction does come with a cost, and it’s important to investigate whether that cost is worth it before going ahead with the implementation. However, abstractions that successfully reduce friction when it comes to writing tests often have a higher chance of being worth their cost, since when there’s too much friction, tests often don’t get written at all.

What do you think? Have you tried to unify the way you stub data models within your tests before, or is it something you’ll try out? Let me know, either via email, or Twitter.

Thanks for reading! 🚀
