Making OpenAPI / Swagger Bearable With Your Own DSL

Taming OpenAPI using Racket to create a DSL

Problem: while OpenAPI is great for describing APIs, generating client SDK code and defining your API contracts, it's definitely not a great experience to write OpenAPI documents from scratch.

Even with the use of references, OpenAPI's mechanism to avoid repetition, a simple document describing fewer than 10 endpoints ends up at over 1,000 lines for us. To get a feeling for it, this snippet defines a schema that's just a list of integers:

   schema:
     type: array
     description: List of IDs
     items:
       type: integer
       description: Pet ID

Having such long files means it's very hard as a reader to grasp what's being described in the file. The format is much more machine friendly than it is reader friendly.

References are a mechanism that allows you to define that schema only once and reference it where it's being used. For example you would create:

   components:
     parameters:
       Page:
         description: Number of the page to be returned
         in: query
         name: page
         schema:
           example: 2
           type: integer

This defines a pagination parameter, which you can later reference like this:

 "$ref": "#/components/parameters/Page"

However, references cannot be used everywhere in a document, and while they can be composed to some extent, doing so is not very intuitive.

A couple more issues we were having because we split our API definition over multiple files:

  • The tooling to display your documentation generally doesn't support references coming from other files, so you end up copy-pasting your definitions from file to file
  • The same thing being defined in multiple files now starts to get out of date in some files and only updated in others. We recently bumped our API version number but only 1 of our OpenAPI documents reflects that change. The others are left at version 1.

A Better Way

The great thing about OpenAPI documents, though, is that they are YAML, which means they just represent a <key> : <value> data structure that any programming language can create.

So writing a DSL (Domain Specific Language) for OpenAPI just means creating a set of variables and functions that return hash maps. The resulting object can then be serialized to either YAML or JSON.
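As a minimal sketch of that idea, written here in Python rather than Racket (the helper names `integer` and `array` are illustrative and not part of swaggy), two functions that return dictionaries are enough to rebuild the list-of-IDs schema from the YAML snippet above and serialize it:

```python
import json


def integer(description):
    """Schema for an integer value."""
    return {"type": "integer", "description": description}


def array(description, items):
    """Schema for a list whose elements all follow the `items` schema."""
    return {"type": "array", "description": description, "items": items}


# The "List of IDs" schema, built from plain dictionaries.
id_list = array("List of IDs", integer("Pet ID"))

# Any JSON (or YAML) serializer can turn the result into a document fragment.
print(json.dumps({"schema": id_list}, indent=2))
```

The DSL itself never deals with text; it only builds nested data, and serialization is left to a single call at the end.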

Using this strategy, our spec files now average around 100 lines instead of 1,000. Even if you take into account the files containing the DSL definition, the total is still under 1,000 lines of code.

We used Racket for this. S-expressions resemble a data definition language more than a programming language, so writing them doesn't feel that different from writing YAML or JSON. However, we're not using anything incredibly complex from Racket, and everything we do here could be achieved in any language that supports functions with optional, named and rest arguments. Racket is therefore not the only language you could use to create such a DSL; if you're more comfortable with Python or Ruby, for example, those would be good choices as well.
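To make those three kinds of arguments concrete, here is a small sketch in Python. The names `string` and `obj` and the `fmt` parameter are illustrative choices for this example, not swaggy's actual API:

```python
def string(description, example, *, enum=None, fmt=None):
    """String schema; `enum` and `fmt` are optional, named arguments."""
    schema = {"type": "string", "description": description, "example": example}
    if enum is not None:
        schema["enum"] = list(enum)
    if fmt is not None:
        schema["format"] = fmt
    return schema


def obj(**properties):
    """Object schema; the rest (keyword) arguments become its properties."""
    return {"type": "object", "properties": dict(properties)}


pet = obj(
    species=string("What kind of animal is this", "dog", enum=["dog", "cat"]),
    origin=string("Country of origin", "Egypt"),
)
```

Keys that were not requested, such as `enum` on `origin`, simply never appear in the resulting dictionary.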

While some level of this DSL could be shared, the real benefit comes from creating your own. Your API responses likely don't look like ours, but they likely all have parts that look the same, so you can easily re-use those parts.

Example

As an example, this spec is a simple swagger doc, loosely inspired by the pet store that is the demo when you use the online swagger editor. You can see it in swagger's UI here.

This document describes 5 endpoints for a single type of entity:

  • GET a list
  • GET just one
  • PUT update one
  • POST create one
  • DELETE one

This is fairly typical but already requires a document that is almost 400 lines long for something that simple. Contrast this with the DSL (we call our DSL swaggy) document that generated it:

#lang racket
(require "./lib/my-service.rkt" "./lib/openapi.rkt")
 
;;; ENTITIES
(define pet-entity
 (entity "Pet"
         'race (string "What kind of dog / cat this is (labrador, golden retriever, siamese, etc...)" "Labrador")
         'origin (string "Country of origin" "Egypt")
         'birthday (datetime "Birth date of the pet" "2017-10-20T00:14:02+0000")
         'species (string "What kind of animal is this" "dog" #:enum '("dog" "cat"))))
(define $pet (schema-reference 'Pet pet-entity))
 
;;;  TAGS
(define pet-tags (list "Pets"))
 
;;; RESPONSES
(define single-pet-response (jsonapi-single-response "Pet" ($pet)))
(define list-pets-response (jsonapi-paginated-response "List of pets" ($pet)))
 
;;; REQUESTS
(define pet-request (json-request "Pet Request Body" ($pet)))
 
;;; MAIN DOC
(define swagger
 (my-service-api-doc "Pet Store" "Per store pets management"
   (path "/pets") (endpoint-group
                           'tags pet-tags
                           'parameters (list store-id-param)
                           'get (endpoint
                                   'operationId "listPets"
                                   'summary "Retrieve all the pets for this store"
                                   'parameters pagination-params
                                   'responses (with-standard-get-responses 200 list-pets-response))
                           'post (endpoint
                                   'operationId "createPet"
                                   'summary "Create a new Pet record"
                                   'requestBody pet-request
                                   'parameters (list xsrf-token)
                                   'responses (with-standard-post-responses 200 single-pet-response)))
   (path "/pets/{petId}")
                   (endpoint-group
                           'tags pet-tags
                           'parameters (list store-id-param (path-param "petId" (string "The ID of the pet in our store. A UUID." "060be70e-7b92-4560-94ce-d02733785447" #:format "uuid")))
                           'get (endpoint
                                   'operationId "getPet"
                                   'summary "Retrieve a single pet record"
                                   'responses (with-standard-get-responses 200 single-pet-response))
                           'put (endpoint
                                   'operationId "updatePet"
                                   'summary "Update store information for a pet"
                                   'parameters (list xsrf-token)
                                   'requestBody pet-request
                                   'responses (with-standard-put-responses 200 single-pet-response))
                           'delete (endpoint
                                     'operationId "removePet"
                                     'summary "Remove a pet's record from the store"
                                     'parameters (list xsrf-token)
                                     'responses delete-responses))))
 
(module+ main
 (generate-doc swagger))

This document is 61 lines long in total, very far from the 380 lines of the actual YAML document or the 550 lines of the nicely formatted JSON equivalent. The main reason it can be that short is that a single function name can carry a lot of information with it. For example, in OpenAPI you create a floating point number's schema like this:

   price:
       type: number
       description: The price
       format: float
       example: 3.2

Swaggy on the other hand can define a function called float so that you can simply write:

(float "The price" 3.2)

The information brought by type: number and format: float is now encoded in the use of the function call (float).

By augmenting your own DSL this way, you get to focus only on the information you are trying to provide: the information specific to this endpoint or this document. This pattern is repeated everywhere in this document:

  • The (entity) function provided by the my-service module uses the base (openapi/object) function and adds the fields: id, created_at, deleted_at. Almost all resources our API returns will contain those fields and it isn't helpful to repeat them everywhere when writing the API spec.
  • (string) is provided by openapi and is interesting in that only description and example are required. But OpenAPI allows you to add many more properties, such as an enum. For those, we use optional named arguments. If you call (string "The type of pet" "dog" #:enum '("cat" "dog")) the schema definition will contain an enum. If you omit the #:enum argument, the enum is left out of the schema.
  • (with-standard-post-responses) is another interesting one. A POST request can have many successful or unsuccessful responses. But in our case, only the success case differs from one POST endpoint to the next. Our error format is always the same, so that function takes the response code and response definition you give it and adds the following ones: 400, 401, 403, 404, 419, 422 and 500.
  • There is also a lot that is NOT in this document because it will be shared between all our documents such as: the API version, the list of servers where this API is available, how we do pagination, the slack channel to get in touch with our team, the list goes on and on. By basing this DSL on an actual programming language, you get to easily share those definitions from document to document instead of having out of date YAML definitions in some documents.
  • OpenAPI inconsistencies: in the OpenAPI spec, the description for a query / path parameter doesn't go into its schema, which differs from how you define an API resource. Swaggy tries to eliminate those inconsistencies so that things are always defined the same way, no matter whether they are part of a parameter definition, a schema definition or anything else.
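Several of the helpers described in this list can be sketched in a few lines each. The version below is in Python and is only an approximation under stated assumptions: the article does not show swaggy's implementation, so the field descriptions, the `error_response` placeholder and the function signatures are invented for illustration.

```python
def string(description, example):
    return {"type": "string", "description": description, "example": example}


def entity(name, **fields):
    """Object schema that always carries the fields shared by every resource."""
    shared = {  # descriptions and examples here are placeholders
        "id": string("Unique identifier", "060be70e-7b92-4560-94ce-d02733785447"),
        "created_at": string("Creation time", "2017-10-20T00:14:02+0000"),
        "deleted_at": string("Deletion time", "2017-10-20T00:14:02+0000"),
    }
    return {"type": "object", "title": name, "properties": {**shared, **fields}}


STANDARD_ERROR_CODES = (400, 401, 403, 404, 419, 422, 500)


def error_response(code):
    """Placeholder standing in for the service's shared error format."""
    return {"description": f"Error {code}"}


def with_standard_post_responses(code, response):
    """Combine one success response with the error responses every POST shares."""
    responses = {str(code): response}
    for error_code in STANDARD_ERROR_CODES:
        responses[str(error_code)] = error_response(error_code)
    return responses


def path_param(name, schema):
    """Accept the description inside the schema, emit it where OpenAPI wants it."""
    schema = dict(schema)  # copy so the caller's schema is not mutated
    description = schema.pop("description", "")
    return {
        "name": name,
        "in": "path",
        "required": True,  # OpenAPI requires path parameters to be required
        "description": description,
        "schema": schema,
    }


pet = entity("Pet", origin=string("Country of origin", "Egypt"))
responses = with_standard_post_responses(200, {"description": "Pet"})
param = path_param("petId", string("The ID of the pet in our store", "abc"))
```

Each helper hides one kind of repetition: shared fields, shared error responses, or a quirk of the spec that the author of a document should not have to remember.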

How it Works

Creating a DSL like this one is incredibly simple since, at all levels, you're only dealing with hash maps, arrays or simple scalar types. As an example, the (string) function just reads:

   (define (string description example)
     (hash 'type "string"
           'description description
           'example example))

Which is just creating a hash map with 3 keys and their values. The same in JavaScript would read:

   function string(description, example) {
       return {
           type: "string",
           description: description,
           example: example
       };
   }

Some functions encode a bit more information in their names such as:

   (define (datetime description example)
     (hash 'type "string"
           'format "date-time"
           'example example
           'description description))
   (define (float description example)
     (hash 'type "number"
           'format "float"
           'description description
           'example example))
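The same pattern scales up to the document level, which is how values shared by every spec can live in exactly one place. As a rough Python sketch (the version string, the server URL and the signatures of `my_service_api_doc` and `generate_doc` are assumptions made for this example, not swaggy's real code):

```python
import json

# Hypothetical shared values; every spec built through the wrapper gets them.
API_VERSION = "2.0.0"
SERVERS = [{"url": "https://api.example.com"}]


def my_service_api_doc(title, description, paths):
    """Wrap the paths of one spec with the information common to all specs."""
    return {
        "openapi": "3.0.0",
        "info": {"title": title, "description": description, "version": API_VERSION},
        "servers": SERVERS,
        "paths": paths,
    }


def generate_doc(doc):
    """Serialize the finished document; JSON is used here to stay in the stdlib."""
    return json.dumps(doc, indent=2)


doc = my_service_api_doc("Pet Store", "Per store pets management", {"/pets": {}})
print(generate_doc(doc))
```

Bumping `API_VERSION` in one module then updates every generated document the next time it is built.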

Your turn

Switching from writing raw swagger to writing Swaggy has made the dread of having to write or update a swagger file go away. This means our documentation is kept up to date instead of written once and abandoned. We’ve also had people from other teams need to update our documentation when they added features to our service, and this never required any hand holding. This is simple enough to pick up on your own.

I encourage you and your team to do the same if you write Swagger / OpenAPI documents. However, it's likely that your response format or your error format is specific to your API and won't fit what we've done by customizing this DSL to our service. That's why I encourage you to write your own DSL if you want to get the real benefits from this approach.