Today’s post covers baseball, stats, box scores, and three handy Kotlin functions.

tl;dr

We cover three built-in Kotlin functions today:

  • filter { ... }: Return a new collection containing only the items for which the attached predicate block returns true.
  • flatMap { ... }: Run the attached block on each item to get a collection per item, then merge all of those collections into a single List.
  • groupBy { ... }: Turn a List into a Map whose keys are the values returned by the key selector block and whose values are Lists of the items that produced each key.

Those three functions take the stats for every day of the season for all batters and turn them into a single set of totals for each batter.

Background

The code you’ll see today comes from the Android Baseball League (ABL) APIs, the data source for the second half of “Kotlin and Android Development featuring Jetpack”.

The quick summary of the ABL is that I wanted to use a baseball app for the advanced half of the book, and rather than worry about any legal issues with using real data from a site like Retrosheet, I “just” wrote my own baseball simulator. If you want to hear more about that whole process, I recorded a video on it.

Today’s focus is on calculating batting stats from the ABL and how we can convert data to get what we need.

The End Goal

The goal here is to get box scores for all batters for the season up to a given date, represented by a LocalDateTime parameter. The end result is a set of summaries which contain data like you see here. We can then use this data in the ABL app to show the leaders in home runs, RBI, or any of the other two-dozen+ stats using a BatterBoxScoreItem (the data class holding all the stats). For reference, that class partially looks like this:

data class BatterBoxScoreItem(
    val player: Player,
    val games: Int,
    val plateAppearances: Int,
    val atBats: Int,
    val runs: Int,
    // More stats live here
) : BoxScoreItem {
    // We also have calculated stats based on other stats    
    val totalBases = (hits + doubles + triples * 2 + homeRuns * 3)
}
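
If the totalBases formula looks odd, here's a small sketch of why it works. Every hit is worth at least one base, so hits already counts one base for each double, triple, and home run; the formula only adds the extra bases on top. Both functions below are hypothetical helpers written for this post, not part of the ABL code.

```kotlin
// Hypothetical helper mirroring the totalBases property above.
fun totalBases(hits: Int, doubles: Int, triples: Int, homeRuns: Int): Int =
    hits + doubles + triples * 2 + homeRuns * 3

// The textbook definition, counting singles explicitly:
// 1 base per single, 2 per double, 3 per triple, 4 per home run.
fun totalBasesFromSingles(hits: Int, doubles: Int, triples: Int, homeRuns: Int): Int {
    val singles = hits - doubles - triples - homeRuns
    return singles + 2 * doubles + 3 * triples + 4 * homeRuns
}
```

For a batter with 10 hits, of which 3 are doubles, 1 is a triple, and 2 are home runs, both versions return 21.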

As a BatterBoxScoreItem can hold the stats for any number of games, we can combine the daily box scores into a single BatterBoxScoreItem per player covering all the games played in our time frame. This is what calculateBatterBoxScores(), the last step in today’s pipeline, does: it takes the list of BatterBoxScoreItem objects for each player and combines them into a single BatterBoxScoreItem instance.

The Main Function

The batting stats calculation function looks like this:

fun calculateBattingStatsForDateTime(filterDateTime: LocalDateTime? = null) =
    battingStats
        .filter { (dayOfYear, _) ->
            filterDateTime == null || dayOfYear < filterDateTime.dayOfYear
        }
        .flatMap { (_, batterBoxScoreItems) ->
            batterBoxScoreItems
        }
        .groupBy { batterBoxScoreItem ->
            batterBoxScoreItem.player
        }
        .calculateBatterBoxScores()

The battingStats object is a Map<Int, List<BatterBoxScoreItem>> where the key is the day of the year and the List<BatterBoxScoreItem> contains all the stats for all batters on that day. That Map contains all the batting data for the entire season, but we only want the data from the days before the entered filterDateTime, meaning we need to filter out some items.

Filtering the Data

One of my favorite features of the ABL APIs is that everything can be time-shifted, meaning you can send in a date to any of the endpoints and get the stats for that day. This gives the sense of an ongoing season at any point and gives the ABL app something to update.

filterDateTime is an optional parameter that defaults to null, and the filterDateTime == null check lets us skip the filtering when it’s omitted: the predicate is then true regardless of the data, so every item passes through. Otherwise, we keep only the records whose day of the year is before filterDateTime.dayOfYear.

The filter { ... } function returns a new (read-only) Map in the same format as battingStats but now limited to the time frame we want. This is better, but it’ll be easier to get a player’s stats together if we’re working with a single List instead.
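
To see that null check on its own, here's a minimal sketch. The homeRunsByDay map and the daysBefore() function are made up for this example and stand in for battingStats and the filter step; they are not part of the ABL code.

```kotlin
import java.time.LocalDateTime

// Hypothetical stand-in for battingStats: day of year -> players who homered that day.
val homeRunsByDay: Map<Int, List<String>> = mapOf(
    95 to listOf("Ruiz", "Chen"),
    96 to listOf("Ruiz"),
    97 to listOf("Okafor"),
)

// Keeps only the days strictly before the cutoff's day of year.
// A null cutoff makes the predicate true for every entry, so nothing is dropped.
fun daysBefore(cutoff: LocalDateTime? = null): Map<Int, List<String>> =
    homeRunsByDay.filter { (dayOfYear, _) ->
        cutoff == null || dayOfYear < cutoff.dayOfYear
    }
```

Calling daysBefore() returns all three days, while daysBefore(LocalDateTime.of(2021, 4, 6, 12, 0)) returns only day 95, since April 6, 2021 is day 96 and the comparison is strict. Note that only the day of the year is compared, so the year and time of day in the cutoff are ignored.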

Flat-Mapping the Data

The flatMap { ... } function takes in a Collection containing other Collection objects and puts them all into a single List. In our case, we take the List of stats from each day and combine them into one new List<BatterBoxScoreItem>. All the batting stats for all players in the entered time frame are in this new list, which makes splitting them up per player that much smoother.

Note that the block for flatMap { ... } can also transform the data, so you can change the items as you merge them together. Either way, flatMap returns a new List and leaves the original collection unmodified.
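
Here's a small sketch of both uses: plain flattening and flattening with a transformation. HitLine and linesByDay are hypothetical, simplified stand-ins for BatterBoxScoreItem and battingStats.

```kotlin
// Hypothetical simplified box score line; the real BatterBoxScoreItem has many more fields.
data class HitLine(val player: String, val hits: Int)

val linesByDay: Map<Int, List<HitLine>> = mapOf(
    95 to listOf(HitLine("Ruiz", 2), HitLine("Chen", 1)),
    96 to listOf(HitLine("Ruiz", 0)),
)

// Plain flattening: hand back each day's list unchanged.
val allLines: List<HitLine> = linesByDay.flatMap { (_, lines) -> lines }

// Flattening with a transformation: label each line with its day as it is merged.
val labelled: List<String> = linesByDay.flatMap { (day, lines) ->
    lines.map { "day $day: ${it.player} ${it.hits}H" }
}
```

allLines ends up with three HitLine items in day order, labelled holds "day 95: Ruiz 2H", "day 95: Chen 1H", and "day 96: Ruiz 0H", and linesByDay still has its original two entries.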

Grouping the Data

Since we’re looking for the stats for each player, we want to split up the data by player. The groupBy function does just that by taking the items in the List and sorting them into a Map. Each key in this Map is a value returned from the block (in our case, the Player object), and each value is a List of all the items that produced that key (here, every BatterBoxScoreItem for that player).
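
A quick sketch of what groupBy hands back, using a hypothetical RunLine class and sample list in place of the real box score data:

```kotlin
// Hypothetical simplified box score line keyed by player name.
data class RunLine(val player: String, val runs: Int)

val seasonLines = listOf(
    RunLine("Ruiz", 1),
    RunLine("Chen", 0),
    RunLine("Ruiz", 2),
)

// Key selector only: each value is the list of items that share that key.
val linesByPlayer: Map<String, List<RunLine>> = seasonLines.groupBy { it.player }

// Key selector plus value transform: keep just the runs for each player.
val runsByPlayer: Map<String, List<Int>> = seasonLines.groupBy({ it.player }, { it.runs })
```

linesByPlayer maps "Ruiz" to both of his lines and "Chen" to his single line, while runsByPlayer maps "Ruiz" to [1, 2] and "Chen" to [0].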

Once we have the Map ready, we can use calculateBatterBoxScores() to convert each Map.Entry into a single BatterBoxScoreItem summarizing a batter’s stats. That function (in part) looks like this:

fun Map<Player, List<BatterBoxScoreItem>>.calculateBatterBoxScores() =
    this.map { (player, boxScoreItems) ->
        BatterBoxScoreItem(
            player,
            boxScoreItems.sumOf { it.games },
            boxScoreItems.sumOf { it.plateAppearances },
            boxScoreItems.sumOf { it.atBats },
            // All the other stats follow this same lovely pattern.
        )
    }
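
To tie the pieces together, here's a trimmed-down sketch of the whole pipeline that you can paste into a scratch file. MiniPlayer, MiniBoxScore, totals(), seasonTotals(), and the sample data are all hypothetical stand-ins rather than the real ABL types, and the sketch uses sumOf, which replaced sumBy when the latter was deprecated in Kotlin 1.5.

```kotlin
import java.time.LocalDateTime

// Hypothetical, trimmed-down stand-ins for Player and BatterBoxScoreItem.
data class MiniPlayer(val name: String)
data class MiniBoxScore(val player: MiniPlayer, val games: Int, val atBats: Int, val hits: Int)

// Collapses each player's list of daily lines into one summed line.
fun Map<MiniPlayer, List<MiniBoxScore>>.totals(): List<MiniBoxScore> =
    map { (player, items) ->
        MiniBoxScore(
            player,
            items.sumOf { it.games },
            items.sumOf { it.atBats },
            items.sumOf { it.hits },
        )
    }

// filter -> flatMap -> groupBy -> totals, in the same order as the main function.
fun seasonTotals(
    statsByDay: Map<Int, List<MiniBoxScore>>,
    cutoff: LocalDateTime? = null,
): List<MiniBoxScore> =
    statsByDay
        .filter { (day, _) -> cutoff == null || day < cutoff.dayOfYear }
        .flatMap { (_, items) -> items }
        .groupBy { it.player }
        .totals()

// Sample data: three days of made-up box scores for two made-up players.
val ruiz = MiniPlayer("Ruiz")
val chen = MiniPlayer("Chen")
val sampleStats: Map<Int, List<MiniBoxScore>> = mapOf(
    95 to listOf(MiniBoxScore(ruiz, 1, 4, 2), MiniBoxScore(chen, 1, 3, 1)),
    96 to listOf(MiniBoxScore(ruiz, 1, 5, 1)),
    97 to listOf(MiniBoxScore(chen, 1, 4, 0)),
)
```

With no cutoff, seasonTotals(sampleStats) gives Ruiz 2 games, 9 at-bats, and 3 hits, and Chen 2 games, 7 at-bats, and 1 hit. With a cutoff of LocalDateTime.of(2021, 4, 7, 0, 0), which is day 97, Chen's day-97 line is excluded and his totals drop to 1 game, 3 at-bats, and 1 hit.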

Final Thoughts

By using these three functions, we went from stats split up per day to stats split up per player. We get a summarized view of a batter’s output that is then used to give us league leaders or individual player histories.

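All three functions work on any Iterable (List, Set, etc.). filter and flatMap are also available directly on Map, while groupBy is not, so to group a Map you call groupBy on its entries. They can all do even more than we see here via their transformation blocks, and they’re handy for reshaping almost any kind of data you need to deal with.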