Swift + JSON with code generation

Update 2016-01-17: I gave a talk about this topic at the January 2016 meetup of CocoaHeadsNL. See Swift JsonGen [talk].


There are many JSON parsing libraries written in Swift, and equally as many articles about them. This article is not one of those. This article is about a code generation tool, running on NodeJS, written in TypeScript: JsonGen.

JsonGen generates JSON parsers for immutable Swift structs. Get type safety without manually writing a parser.

Background

When we – at Q42 – started work on our first Swift app (September 2014) we ran into an issue: How to get the JSON from API calls into nice, strongly typed, Swift structs. Chris Eidhof’s post on functional JSON parsing sounded appealing, so we tried the Argo library.

As a Haskell developer, Argo’s style looked interesting, although some colleagues were less enthusiastic about all the custom operators. However, before we got to discuss the merits of custom operators, we ran into a bigger issue: Argo didn’t work. When writing more complex parsers, the Swift type inferencer (or some other part of the compiler) kept crashing. I’m not blaming Argo here, I’m sure it’s swiftc that was to blame, and it probably now works with Swift 1.2. Still, it was a problem, so we looked further.

Getting data from JSON

For me, one of the most appealing aspects of working with Swift is type safety. Being able to precisely model all states of a user interface in a Swift enum is huge! This is the whole reason why we want JSON in structs instead of untyped dictionaries.

The native way of reading JSON in Swift is by using dynamic lookup on the dictionary returned by NSJSONSerialization:

let str = "{ \"title\": \"Hello, World!\", \"published\": true, \"author\": { \"first\": \"Tom\", \"last\": \"Lokhorst\" } }"
let data = str.dataUsingEncoding(NSUTF8StringEncoding)!

let obj = try! NSJSONSerialization.JSONObjectWithData(data, options: [])

Dynamic lookup

The result of NSJSONSerialization.JSONObjectWithData is actually a dictionary containing string values and nested dictionaries. By doing dynamic casts you can get to the desired value;

if let dict = obj as? [String: AnyObject],
    author = dict["author"] as? [String: AnyObject],
    firstname = author["first"] as? String,
    lastname = author["last"] as? String {
  
  print("Author: \(firstname) \(lastname)")
}

The reasons why I don’t like this approach are; It is really verbose to deconstruct the dictionary to extract some value. It is very error prone, because all the code is untyped. All the field names are in strings, and the types are just added ad-hoc.

Modeling data as structs

To get more type safety, I model my data with structs:

struct Blog {
  let title: String
  let published: Bool
  let author: Author
}

struct Author {
  let first: String
  let last: String
}

With a library like Argo, I can create a parser (or decoder) that can parse the AnyObject from NSJSONSerialization.JSONObjectWithData into the structs. However the downside is: I have to write a parser, where I again have to write all the field names in strings!

Code generation with JsonGen

Instead of manually creating the decoder, let’s generate it! That way there is less manual work to do, and no room for human-error. (For example: our app has over 50 structs for dealing with JSON, I don’t want to manually write and maintain the parsers for those). With a decoder generated by JsonGen, the use-side code becomes:

if let blog = Blog.decodeJson(obj) {
  let author = blog.author
  print("Author: \(author.first) \(author.last)")
}

The swift-json-gen command line tool will generate new source files for each supplied input file. For each struct an extension is created with a decodeJson method. The tool is smart enough to only generate methods for structs that don’t already have that method. So if you want to customize the behaviour of the decoding for one of your structs, all you have to do is implement the decodeJson method yourself and JsonGen won’t generate a new method.

Example:

  1. Put the Blog and Author structs in a file called Blog.swift
  2. Run the command: swift-json-gen Blog.swift
  3. This will generate Blog+JsonGen.swift that contains static decodeJson methods for the Blog and Author structs.

JsonGen ships with a file containing decoders for standard Swift and Foundation types: JsonGen.swift. These are decoders for types like Bool, Double, String, NSURL, but also generic types like Optional<T> and Array<T>. JsonGen also resolves type aliases and can generate decoders for nested types, as well as generic types.

If there is part of your JSON structure you don’t want to model in Swift structs, you can leave that part untyped by using one of three type aliases from JsonGen.swift: AnyJson, JsonObject or JsonArray. For example, in the following struct, the data field is kept as an untyped AnyObject:

struct Item {
  let id: String
  let name: String?
  let data: AnyJson
}

How it works

JsonGen works by calling swiftc with the -dump-ast flag. With this flag, the Swift compiler parses and type-checks the provided Swift files. But instead of generating code, it then outputs the Abstract Syntax Tree (AST) to stderr in a Lisp-like format. This AST is then parsed, and JsonGen looks for structs. For each struct, all fields and corresponding types are looked up. An extension method is generated that tries to find each field in the dictionary. If found, the value will be decoded using the decoder for the type of the field. If either the lookup or the decoding fails, nil is returned instead of the value being decoded.

The generated decoder returns an optional value. If the AnyObject can’t be decoded to the requested type, nil is returned. In development mode, the debugger is stopped using an assertionFailure so that the developer can figure out what is wrong (either the struct, the JSON, or both).

This is the generated code for the Blog struct from before.

extension Blog {
  static func decodeJson(json: AnyObject) -> Blog? {
    let _dict = json as? [String : AnyObject]
    if _dict == nil { return nil }
    let dict = _dict!

    let title_field: AnyObject? = dict["title"]
    if title_field == nil { assertionFailure("field 'title' is missing"); return nil }
    let title_optional: String? = String.decodeJson(title_field!)
    if title_optional == nil { assertionFailure("field 'title' is not String"); return nil }
    let title: String = title_optional!

    let published_field: AnyObject? = dict["published"]
    if published_field == nil { assertionFailure("field 'published' is missing"); return nil }
    let published_optional: Bool? = Bool.decodeJson(published_field!)
    if published_optional == nil { assertionFailure("field 'published' is not Bool"); return nil }
    let published: Bool = published_optional!

    let author_field: AnyObject? = dict["author"]
    if author_field == nil { assertionFailure("field 'author' is missing"); return nil }
    let author_optional: Author? = Author.decodeJson(author_field!)
    if author_optional == nil { assertionFailure("field 'author' is not Author"); return nil }
    let author: Author = author_optional!

    return Blog(title: title, published: published, author: author)
  }
}

The generated code is simple albeit not very pretty to look at, but hopefully you’ll never have too. The added benefit of generating such simple, straight forward code is; If you should ever decide to stop using JsonGen, you can just continue on by manually editing and maintaining the generated code.

By the way; JsonGen also generates a encodeJson method. The encode method generates a dictionary for any type of struct. This is useful if you want to post data back to a JSON api.

JSON model vs Domain model

When decoding JSON into a domain model, you quickly run into the issue that the JSON structure doesn’t quite match the structure of the domain model. The generated decoder isn’t good enough and you need to customize. As said before, you can manually specify decoders for certain specific types. But, it can be quite a lot of work to create a decoder for a whole type. However, I think this is often not the approach you will want to take.

In my opinion there are two kinds of JSON in an app; It is either from a source that is completely under my control, or the source is external and the JSON format is out of my control. In the first case; I try to have the JSON format match as closely as possible to my domain model. For the few differences between the JSON and the domain model, I write custom decoders. That way I can directly parse JSON into the structs of my domain model.

In the case that the JSON format is out of my control, the format is often weird and doesn’t match my domain model. It has strange internal rules and custom logic. I don’t want to write all that custom logic and those business rules in the untyped world of decoders. I want to use the type checker to tell me if I’ve forgotten to implement an enum case, or if I assume a optional field is always available.

To achieve this, I create two sets of types; One set of structs that exactly matches the raw JSON, and another set of structs and enums that are my domain model. For the JSON structs, I generate decoders and encoders. But for the domain model, I manually write the mapping from the JSON structs to the domain structs. This may seem tedious, but it is necessitated by the fact that that mapping code is full of custom business logic. That business logic is code I want to write and maintain in the world of types.

This abbreviated example demonstrates this approach. The domain model is the Profile struct. The JSON model is the ProfileJson struct:

/* Domain model */

// A Profile consists of a first name and an optional avatar
struct Profile {
  let firstName: String
  let avatarURL: NSURL?
}

/* JSON model */

// This is the format of JSON data returned by an API.
// The `profile` property maps this JSON model to the domain model.
// The JSON parser for this type is generated by swift-json-gen.
struct ProfileJson {
  let FirstName: String
  let UseAvatar: Bool
  let AvatarUrl: String?

  var profile: Profile {
    // These are the business rules for handling the JSON data:
    // 1. When UseAvatar == false, ignore the url
    // 2. If AvatarUrl is not a real url, ignore it

    let avatarURL: NSURL?
    if let url = AvatarUrl where UseAvatar {
      avatarURL = NSURL(string: url)
    }
    else {
      avatarURL = nil
    }

    return Profile(firstName: FirstName, avatarURL: avatarURL)
  }
}

(See this gist for a more slightly more detailed example)

Note that there is no need to completely “duplicate” all structs in the domain model into a JSON model. Sometimes the domain happens to match the JSON, in that case I only use the domain model. This happens a lot at the “leaves” of data structures. It is the roots that differ and contain a lot of custom mapping logic. In that case, the root structs exist in both the JSON model and the domain model and the leaves are shared.

Conclusion

If you want type safety, and don’t want to do dynamic lookup, you don’t have to manually write JSON parsers for your structs.
You can use JsonGen to generate decoders and encoders for your Swift structs. Let the untyped code be generated and only work in the nice, strongly typed world of Swift!

Install and use swift-json-gen like so:

$ npm install -g swift-json-gen
$ swift-json-gen MyDirectory/SubDirectory/

Or get the code from GitHub: tomlokhorst/swift-json-gen

This tool is used in a few projects at Q42, but it hasn’t been battle tested outside of our company. If you use it and find it useful, please let me know: @tomlokhorst.