Steve On Stuff

Functional Data Transformations

21 Feb 2018

All applications need to do some sort of data transformation. It’s unlikely that the information that you get from your api or database will come in exactly the format or order that you want to display it in, and you will likely have to perform some sort of sorting, filtering, or other manipulation to extract the information that your app requires.

Swift’s standard library provides us with a range of higher-order functions, such as map, flatMap and filter, that we can use to perform transformations on collections. The reason these functions are so useful is that they abstract the common part of an algorithm, and we provide just the part that is unique to our use case.

For example, in the case of flatMap (whose optional-unwrapping form was renamed compactMap in Swift 4.1), iterating through the elements and unwrapping optionals is all handled for us; we just supply a function that applies a transformation.
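To make that division of labour concrete, here is a minimal sketch using made-up sample data (the rawScores array is hypothetical, not part of the todo app). It assumes Swift 4.1 or later, where the optional-unwrapping transformation is spelled compactMap.

```swift
// Hypothetical sample data: strings that may or may not parse as integers
let rawScores = ["12", "seven", "30", "4"]

// compactMap handles the iteration and drops nil results;
// we only supply the transformation
let scores = rawScores.compactMap { Int($0) }

// filter handles the iteration and the collecting;
// we only supply the condition
let highScores = scores.filter { $0 >= 10 }

// map handles the iteration and the collecting;
// we only supply the transformation
let labels = highScores.map { "Score: \($0)" }

print(scores)      // [12, 30, 4]
print(highScores)  // [12, 30]
print(labels)      // ["Score: 12", "Score: 30"]
```

In each call the closure is the only part that is specific to our problem; the looping and result building live in the standard library.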

Sometimes we need an algorithm that the standard library doesn’t provide, though. Rather than write something bespoke for our use case, let’s look at how we might write some abstracted algorithms that we can reuse throughout our app.

The data

Let’s use a todo app as an example. Our todo model will look like this:

struct Todo {
    
    enum Priority: Int {
        case low
        case regular
        case high
    }
    
    let text: String
    let dueDate: Date
    let priority: Priority
}

We’ll load todos from our database using the getTodos() function. These are the todos it returns:

1. 04/01/2000 | priority: regular | Buy milk
2. 05/01/2000 | priority: regular | Book Haircut
3. 08/01/2000 | priority: high    | Email Boss
4. 02/01/2000 | priority: regular | Mow Lawn
5. 06/01/2000 | priority: regular | Write blog post
6. 03/01/2000 | priority: low     | Buy bread
7. 01/01/2000 | priority: regular | Call Parents
8. 07/01/2000 | priority: high    | Service Car
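If you want to follow along in a playground, a stand-in for getTodos() might look like the sketch below. The post doesn’t show the real implementation, so this is only an assumption: it hard-codes the list above, reads the dates as day/month/year, and uses a hypothetical january2000(day:) helper to build them.

```swift
import Foundation

// The Todo model from above, repeated so that this sketch is self-contained
struct Todo {

    enum Priority: Int {
        case low
        case regular
        case high
    }

    let text: String
    let dueDate: Date
    let priority: Priority
}

// Hypothetical helper: builds a date in January 2000
// (assumes the listing above uses day/month/year)
func january2000(day: Int) -> Date {
    let calendar = Calendar(identifier: .gregorian)
    let components = DateComponents(year: 2000, month: 1, day: day)
    // Force unwrapping is tolerable here because the sample data is fixed
    return calendar.date(from: components)!
}

// Stand-in for the database call, returning the todos in the order listed above
func getTodos() -> [Todo] {
    return [
        Todo(text: "Buy milk", dueDate: january2000(day: 4), priority: .regular),
        Todo(text: "Book Haircut", dueDate: january2000(day: 5), priority: .regular),
        Todo(text: "Email Boss", dueDate: january2000(day: 8), priority: .high),
        Todo(text: "Mow Lawn", dueDate: january2000(day: 2), priority: .regular),
        Todo(text: "Write blog post", dueDate: january2000(day: 6), priority: .regular),
        Todo(text: "Buy bread", dueDate: january2000(day: 3), priority: .low),
        Todo(text: "Call Parents", dueDate: january2000(day: 1), priority: .regular),
        Todo(text: "Service Car", dueDate: january2000(day: 7), priority: .high),
    ]
}
```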

Displaying the data

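Our first task is to display the todos in a table view, grouped into sections by priority (highest priority first), with the todos in each section ordered by due date (earliest first).

Great, let’s write some code: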

func todosByPriority() -> [[Todo]] {
    
    // Get the todos and sort them by priority
    let todos = getTodos().sorted { $0.priority.rawValue < $1.priority.rawValue }
    
    // Split in to a separate array for each priority
    var todosByPriority = [[Todo]]()
    var rawValue = 0
    while let priority = Todo.Priority(rawValue: rawValue) {
        let priorityTodos = todos.filter { $0.priority == priority }
        if !priorityTodos.isEmpty {
            todosByPriority.append(priorityTodos)
        }
        rawValue += 1
    }
    
    // Sort each array ascending by date
    todosByPriority = todosByPriority.map {
        $0.sorted(by: {(todo1, todo2) in todo1.dueDate < todo2.dueDate})
    }
    
    // Return in reverse order so that the highest priority is first
    return todosByPriority.reversed()
}

Here’s the output:

---------------------------------------------------
1. 07/01/2000 | priority: high    | Service Car
2. 08/01/2000 | priority: high    | Email Boss
---------------------------------------------------
1. 01/01/2000 | priority: regular | Call Parents
2. 02/01/2000 | priority: regular | Mow Lawn
3. 04/01/2000 | priority: regular | Buy milk
4. 05/01/2000 | priority: regular | Book Haircut
5. 06/01/2000 | priority: regular | Write blog post
---------------------------------------------------
1. 03/01/2000 | priority: low     | Buy bread
---------------------------------------------------

Great, it works! Buuuut…

Let’s refactor.

Sorting:

Take a look at the first line, which sorts the todos by priority:

let todos = getTodos().sorted { $0.priority.rawValue < $1.priority.rawValue }

It’s not clear at all whether todos is going to be sorted ascending or descending. You really have to look at the documentation for sorted to determine this. We’ve also repeated ourselves, because we’ve written priority.rawValue twice. Let’s create an extension on Array that can make it a little clearer:

public extension Array {
    
    public func sortedAscendingBy<T: Comparable>(_ key: (Element) -> T) -> [Element] {
        return sorted { key($0) < key($1) }
    }
    
    public func sortedDescendingBy<T: Comparable>(_ key: (Element) -> T) -> [Element] {
        return sorted { key($0) > key($1) }
    }
}

// Now we can write...
let todos = getTodos().sortedAscendingBy { $0.priority.rawValue }

// Or...
let todos = getTodos().sortedDescendingBy { $0.priority.rawValue }

Much better! Both sortedAscendingBy and sortedDescendingBy take a function that transforms each element of the array into a Comparable type. This is great for sorting by a nested property such as a name, or in our case priority. It’s clear what’s happening too, because ascending or descending is in the name!
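Here is a small sketch of the nested property case. The Owner and Pet types are hypothetical, invented purely for illustration, and the extension is repeated so that the snippet compiles on its own.

```swift
// The extension from above, repeated so that this sketch is self-contained
extension Array {

    func sortedAscendingBy<T: Comparable>(_ key: (Element) -> T) -> [Element] {
        return sorted { key($0) < key($1) }
    }

    func sortedDescendingBy<T: Comparable>(_ key: (Element) -> T) -> [Element] {
        return sorted { key($0) > key($1) }
    }
}

// Hypothetical types, used only for illustration
struct Owner {
    let name: String
}

struct Pet {
    let owner: Owner
    let age: Int
}

let pets = [
    Pet(owner: Owner(name: "Carol"), age: 3),
    Pet(owner: Owner(name: "Alice"), age: 7),
    Pet(owner: Owner(name: "Bob"), age: 5),
]

// Sort by a nested property: the key function digs out the owner's name
let byOwnerName = pets.sortedAscendingBy { $0.owner.name }

// Sort by a direct property, largest first
let oldestFirst = pets.sortedDescendingBy { $0.age }

print(byOwnerName.map { $0.owner.name })  // ["Alice", "Bob", "Carol"]
print(oldestFirst.map { $0.age })         // [7, 5, 3]
```

The key function only has to return something Comparable, so the same two methods cover strings, numbers, dates and so on.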

Chunking:

Next up, chunking. Remember we wanted to split the todos into arrays of each priority, one for each section in our table? Let’s look at how we can chunk an array of items into separate arrays in a more generalised way:

extension Array {
    
    func chunk<T: Equatable>(atChangeTo key: (Element) -> T) -> [[Element]] {
        
        // We want to create an array of arrays
        var groups = [[Element]]()
        
        // addGroup will add a new array, but only if it contains values
        func addGroup(_ groupToAdd: [Element]) {
            if groupToAdd.isEmpty == false {
                groups.append(groupToAdd)
            }
        }
        
        // Loop through all of our items saving lastKey on each iteration
        // When lastKey changes, add the existing group to groups and create a new one
        var lastKey: T?
        var currentGroup = [Element]()
        
        for item in self {
            let itemKey = key(item)
            if itemKey == lastKey {
                currentGroup.append(item)
            } else {
                addGroup(currentGroup)
                currentGroup.removeAll()
                currentGroup.append(item)
            }
            lastKey = itemKey
        }
        
        addGroup(currentGroup)
        return groups
    }
}

chunk(atChangeTo:) is less complex than it looks; read the comments to understand how it works. All it does is transform every element through the passed-in key function and compare the result against the previous one. When the result changes, it starts a new array.

If you chunked a bunch of integers when they changed, you might have an input and output that looked like this:

Input: [3, 4, 4, 2, 5, 5, 5, 7, 7, 4, 3, 8, 8, 8]
Output: [[3], [4, 4], [2], [5, 5, 5], [7, 7], [4], [3], [8, 8, 8]]

If you sorted those integers before you chunked them, you would end up with this:

[[2], [3, 3], [4, 4, 4], [5, 5, 5], [7, 7], [8, 8, 8]]
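The two results above can be reproduced with the sketch below. To keep it self-contained it uses a condensed variant of chunk(atChangeTo:) rather than the exact code from earlier; the behaviour is intended to be the same, but treat it as an illustration.

```swift
extension Array {

    // Condensed variant of chunk(atChangeTo:), repeated here so the sketch runs on its own
    func chunk<T: Equatable>(atChangeTo key: (Element) -> T) -> [[Element]] {
        var groups = [[Element]]()
        var lastKey: T?

        for item in self {
            let itemKey = key(item)
            if itemKey == lastKey, !groups.isEmpty {
                // Same key as the previous item: extend the current group
                groups[groups.count - 1].append(item)
            } else {
                // Key changed (or this is the first item): start a new group
                groups.append([item])
            }
            lastKey = itemKey
        }

        return groups
    }
}

let numbers = [3, 4, 4, 2, 5, 5, 5, 7, 7, 4, 3, 8, 8, 8]

// The key function is the identity here: start a new chunk whenever the value changes
let chunked = numbers.chunk(atChangeTo: { $0 })

// Sorting first brings equal values together, so each value ends up in exactly one chunk
let sortedThenChunked = numbers.sorted().chunk(atChangeTo: { $0 })

print(chunked)            // [[3], [4, 4], [2], [5, 5, 5], [7, 7], [4], [3], [8, 8, 8]]
print(sortedThenChunked)  // [[2], [3, 3], [4, 4, 4], [5, 5, 5], [7, 7], [8, 8, 8]]
```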

Putting it together:

Ok, now we’ve done the hard (and importantly, reusable) work, it’s time to profit. Let’s rewrite our todosByPriority() function.

func todosByPriority() -> [[Todo]] {
    
    return getTodos()                                   // Get the todos
        .sortedDescendingBy { $0.priority.rawValue }    // sort by priority
        .chunk(atChangeTo: { $0.priority })             // make an array for each priority
        .map {                                          // sort each priority array by date
            $0.sortedAscendingBy({ todo in todo.dueDate })
        }
}

That’s more like it! It’s fewer lines of code, and it’s much easier to read.

Reuse:

So, we really just moved the code from the todosByPriority() function into array extensions, right? Did we really buy ourselves anything?

Absolutely! Now we’ve added a set of robust, tested functions that we really never need to change (you did add tests, right?). Once you have a simple extension that is well tested, you can continue to use it with confidence throughout your codebase. Let’s look at a quick example of how we can apply our new functions to a different problem.

We can imagine a scenario where we want to know what the most common priority amongst our todos is. Perhaps if most of the todos on our list are high priority then we want to make the interface red to alert the user.

We’ll write a mostCommonPriority() function. This is how it might look (we’ll return nil if there are no todos):

func mostCommonPriority() -> Todo.Priority? {
    
    // Get the todos
    let todos = getTodos()
    
    // Get the most common priority
    var highestCount = 0
    var mostCommonPriority: Todo.Priority?
    var rawValue = 0
    while let priority = Todo.Priority(rawValue: rawValue) {
        let count = todos.filter { $0.priority == priority }.count
        if count > highestCount {  // strictly greater, so an empty todo list leaves the result nil
            highestCount = count
            mostCommonPriority = priority
        }
        rawValue += 1
    }
    
    // Return the most common priority
    return mostCommonPriority
}

Here’s the output:

Most common priority: regular

Great, it works! Buuuuut… again, it’s a bit messy and difficult to follow. Let’s see if we can utilise the extensions we created earlier. Here’s our strategy:

func mostCommonPriority() -> Todo.Priority? {
    
    return getTodos()
        .sortedAscendingBy { $0.priority.rawValue }  // Sort ascending by priority
        .chunk(atChangeTo: { $0.priority })          // Chunk in to separate arrays of each priority
        .sortedDescendingBy { $0.count }             // Sort arrays descending by count (most common first)
        .first?.first?.priority                      // Most common = the first item of the first array
}

So much better! Remember those tests we wrote earlier? They’re still covering most of this logic too. By abstracting away the parts of an algorithm that we can reuse, we get that coverage for free, and you would be surprised just how many problems you can apply these methods to.

Further Refactoring:

At some point, the need for another function will reveal itself. In both examples we want to split the todos into chunks that are ordered either ascending or descending. We’re making two calls to achieve this: one to sort the todos, and one to chunk them when the priority changes.

Let’s make chunkAscendingBy() and chunkDescendingBy() methods to handle it all in one call. Both take a function that transforms an array element to a Comparable type and return [[Element]]. It’s easy to implement, because we’ll just call into our existing methods:

public extension Array {
    
    public func chunkAscendingBy<T: Comparable>(key: (Element) -> T) -> [[Element]] {
        return self.sortedAscendingBy(key).chunk(atChangeTo: key)
    }
    
    public func chunkDescendingBy<T: Comparable>(key: (Element) -> T) -> [[Element]] {
        return self.sortedDescendingBy(key).chunk(atChangeTo: key)
    }
}
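As a quick check, here is a sketch that runs the combined methods on the integers from the chunking section, plus a hypothetical list of words grouped by length. The extensions are repeated (with chunk in a condensed form) so that the snippet is self-contained.

```swift
extension Array {

    func sortedAscendingBy<T: Comparable>(_ key: (Element) -> T) -> [Element] {
        return sorted { key($0) < key($1) }
    }

    func sortedDescendingBy<T: Comparable>(_ key: (Element) -> T) -> [Element] {
        return sorted { key($0) > key($1) }
    }

    // Condensed variant of chunk(atChangeTo:) from earlier in the post
    func chunk<T: Equatable>(atChangeTo key: (Element) -> T) -> [[Element]] {
        var groups = [[Element]]()
        var lastKey: T?
        for item in self {
            let itemKey = key(item)
            if itemKey == lastKey, !groups.isEmpty {
                groups[groups.count - 1].append(item)
            } else {
                groups.append([item])
            }
            lastKey = itemKey
        }
        return groups
    }

    func chunkAscendingBy<T: Comparable>(key: (Element) -> T) -> [[Element]] {
        return self.sortedAscendingBy(key).chunk(atChangeTo: key)
    }

    func chunkDescendingBy<T: Comparable>(key: (Element) -> T) -> [[Element]] {
        return self.sortedDescendingBy(key).chunk(atChangeTo: key)
    }
}

let numbers = [3, 4, 4, 2, 5, 5, 5, 7, 7, 4, 3, 8, 8, 8]

let ascending = numbers.chunkAscendingBy { $0 }
let descending = numbers.chunkDescendingBy { $0 }

print(ascending)   // [[2], [3, 3], [4, 4, 4], [5, 5, 5], [7, 7], [8, 8, 8]]
print(descending)  // [[8, 8, 8], [7, 7], [5, 5, 5], [4, 4, 4], [3, 3], [2]]

// Hypothetical sample data: group words by their length, shortest first
let words = ["kiwi", "fig", "plum", "apple", "pear", "yam"]
let groupSizes = words.chunkAscendingBy { $0.count }.map { $0.count }

print(groupSizes)  // [2, 3, 1]
```

Only the group sizes are printed for the words, because the order of elements inside each group depends on whether the underlying sort preserves the order of equal elements.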

Now we can rewrite both of our examples to be even shorter:

func todosByPriority() -> [[Todo]] {
    
    return getTodos()
        .chunkDescendingBy { $0.priority.rawValue }
        .map { $0.sortedAscendingBy({ todo in todo.dueDate }) }
}

func mostCommonPriority() -> Todo.Priority? {

    return getTodos()
        .chunkDescendingBy { $0.priority.rawValue }
        .sortedDescendingBy { $0.count }
        .first?.first?.priority
}

Compare todosByPriority() to the implementation that we started with: we’ve gone from 14 lines to 2, and it’s so much easier to read. We can also write tests that cover our extensions, and we get the benefit of that robustness everywhere.

It’s great to build up your own library of these extensions. They’ll help you work faster and with more confidence in the future. You can check out my personal library here: SBSwiftUtils. All of the functions used in this post, complete with tests, are available there.

All of the code used in this blog post is also available on GitHub.

Thoughts / comments / complaints / objections / just wanna chat? Tweet me.

I'm Steve Barnegren, an iOS developer based in London. If you want to get in touch: