REST API (2): Carving the state(s)
Say you are designing an API and you want it to be of the “REST” kind. You’ll need to make many decisions and to do so you’re going to have to ask yourself many questions. I’ll list many of those here, in hopes they will help you avoid blind spots.
Before we begin, make sure that we’re in sync about what REST APIs are. You’re almost guaranteed to be surprised one way or another, about the variety of understandings and implications. This was covered in part 1 – please have a look at it.
I hope you’re laughing in disbelief at the mere mention of that, bringing us to our first question - how to carve out or divide that big state into palatable chunks that we’ll call “resources”? It may feel like trying to pull the stuck chewing gum blobs apart, leaving stringy, stretchy tentacles in the process. We need to figure this out. To do so let’s consider the following:
· Do “chunks” need to cover 100% of the monolithic state or not?
· Can coverages of different chunks overlap? How does this affect caching? Do we care?
· If there are relationships (linkages) between elements represented in different chunks, where do they go? One, some or all related chunks? Separate, dedicated chunks?
· How many chunks will be needed for any single use case?
· It’s not just about data. What to do when consistent changes are needed across multiple chunks? (Think “transactions”.) Can careful division prevent such a need?
· What are the implications of each of the above decisions?
To help us align our assumptions I’ll introduce a concrete example to discuss. Let’s design an API for an interactive game of chess. We’ll just allow clients to make moves while maintaining the board state and ensuring the rules of the game are followed. Heads up: you may decide that REST API style is not suitable for this example as we’ll be focusing on some challenges. Whether or not you’ll find a better option I will leave out for now as something for you to contemplate on your own. My only goal here is to help, as much as I can, prevent API designers from getting trapped by decisions not thoroughly thought through.
Rules of the Game
If Chess is your thing, skip this part – you already know it. Otherwise, I’ll spell out the rules because of some oddities many may not be familiar with. You don’t need to memorize those, just be aware of what kinds of rules exist.
The game is played on an 8x8 square board.
Each player gets one of two equal sets (of different colours) of 16 chess pieces, with fixed initial positions:
1 King
1 Queen
2 Rooks
2 Knights
2 Bishops
8 Pawns
Different kinds of pieces have different movement restrictions. For example, pawns usually move only forward and only by one square but there are exceptions. Kings are sluggish and can move one square in any direction but are permitted to move two squares once, when “castling”. Other rules apply to other pieces.
Players take turns making moves and each move usually affects just one piece… but not always:
“Castling” makes two pieces move together, at once in ways they would not usually be able to, but only if relevant pieces have not moved since the start of the game.
If a piece is moved onto a square occupied by an opposing piece, that other piece is “captured” and removed from the board.
When a pawn reaches the end of the board (“last rank”), it must be replaced with a queen, rook, bishop or knight, none of which needs to have been captured: the player could end up having more than the initial count of those pieces. For example, they may end up having 9 queens if the other player “cooperates”.
A special “En Passant” rule allows pawns to capture pieces in ways that would not be apparent by just looking at the current and last board state. I won’t go into details.
The game is won by having the opponent’s king not be able to avoid capture (“checkmate”).
The game can end in a draw if the same position is repeated three (or five) times or when the last fifty moves by each player contained no captures or pawn moves.
Players themselves may also agree to a draw (one offers it, the other accepts it) or a player may accept/admit defeat.
Thinking time limits may also apply to each player independently. Running out of time yields defeat.
What are we dealing with?
State-wise we have:
players
games, of which there can be many, each with its own choice of rule variation, time available and one, two or more players
one board for each game
up to 16 pieces for each of the two players
positions of those pieces on the board
game history (moves), which affects “En Passant”, “Castling” and automatic game draw
remaining time for each player
Ignoring the setup, just to play the game we need to be able to:
move one piece
make a castling move
pick a piece for the pawn promotion
offer a draw
be presented with a draw offer and accept it or decline it
accept defeat
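To make the two lists above easier to refer to, here is a minimal sketch of them as plain data types. Every name in it (`Action`, `Piece`, `Game`, the field names, the square notation) is my own assumption for illustration, not part of any agreed design.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Action(Enum):
    """The in-game use cases listed above (names are illustrative)."""
    MOVE = "move"
    CASTLE = "castle"
    PROMOTE = "promote"
    OFFER_DRAW = "offer_draw"
    ACCEPT_DRAW = "accept_draw"
    DECLINE_DRAW = "decline_draw"
    ACCEPT_DEFEAT = "accept_defeat"


@dataclass
class Piece:
    kind: str               # "K", "Q", "R", "N", "B" or "P"
    colour: str             # "white" or "black"
    square: Optional[str]   # e.g. "e2"; None once captured


@dataclass
class Game:
    players: dict[str, str]                                  # colour -> player id
    variant: str = "standard"
    pieces: list[Piece] = field(default_factory=list)        # board + positions
    history: list[str] = field(default_factory=list)         # moves so far
    remaining_seconds: dict[str, float] = field(default_factory=dict)
    draw_offered_by: Optional[str] = None


game = Game(players={"white": "alice", "black": "bob"},
            remaining_seconds={"white": 300.0, "black": 300.0})
game.pieces.append(Piece("K", "white", "e1"))
```

Note that this sketch already makes a carving decision by nesting everything under `Game`, which is exactly what the options below put in question.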
Option 1: Game Monolith
This is the “funny” option I mentioned at the beginning, just brought down to a game. The representation of the state includes everything on the list above. To stay true to uniform interfaces and using the same representation for both current and intended state (Fielding 5.1.5 and 5.2.1.2), we should accept the entirety of that representation communicated back to the server for each move. That includes:
The positions of pieces
The history of moves or a digest covering captures, castlings, pawn promotions (affects future possibilities, whose turn it is)
Draw offers, acceptances or rejections of those and/or acceptance of defeat
Remaining time for each player
Which players are playing the game
The board that the game is played on
Chosen variant of the game
Think: what is the potential for abuse? Could a client modify the state to say that the other player accepted defeat? Maybe they can just (re)move pieces favourably, undo moves, hide some history or mess with the time remaining? Wait, time remaining? Whose clock do we trust? What about the time in transit?
What does this mean in terms of division of power and responsibility? Does the server have any power (or responsibility) beyond (and at most) validating that the intended state can be reached from the state currently known to it? What does it have to do to decide that, knowing that there can be multiple differences?
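One way to picture the server's burden is a sketch like the following, which compares the state the server knows with the full state a client sends back. The split into "server-owned" fields, the field names and the piece codes are assumptions I made up for the example. A real design would have to decide them deliberately.

```python
# Hypothetical: fields the client must echo back unchanged.
SERVER_OWNED = {"players", "variant", "remaining_seconds"}


def tampered_fields(known: dict, intended: dict) -> set[str]:
    """Return the server-owned fields that differ from the server's copy."""
    return {name for name in SERVER_OWNED
            if known.get(name) != intended.get(name)}


def history_only_extended(known: dict, intended: dict) -> bool:
    """True if the intended history keeps every known move, in order."""
    old, new = known["history"], intended["history"]
    return new[:len(old)] == old


known = {
    "players": {"white": "alice", "black": "bob"},
    "variant": "standard",
    "history": ["e2e4"],
    "remaining_seconds": {"white": 290.0, "black": 300.0},
    "pieces": {"e4": "wP", "e7": "bP"},
}

# The client returns the whole state with a plausible move
# and a quietly topped-up clock.
intended = {
    **known,
    "history": ["e2e4", "e7e5"],
    "pieces": {"e4": "wP", "e5": "bP"},
    "remaining_seconds": {"white": 290.0, "black": 900.0},
}

print(tampered_fields(known, intended))        # {'remaining_seconds'}
print(history_only_extended(known, intended))  # True
```

Even this only catches the crude cases. Deciding whether the new `pieces` can legally be reached from the old ones is a separate and harder check.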
Option 2: Board Monolith
Let’s split the board out, together with pieces into its own “board monolith” resource. Since clients can only manipulate one resource at a time, this eliminates the complexity of having to deal with “combination” changes made both to the board state and state outside the board (the game). Even if you allow resource overlaps and “composite” resources (Fielding 5.2.1.2), these are still resources in their own right and you’d have to figure out the transfer details.
What else did we accomplish with this? Are there any new challenges?
How many resources do clients need to work with now? I’m thinking “one more than in option 1” – now we have the game and the board separated. The game may be needed for variants, history, remaining time, draw negotiation. The board is needed to see the positions of the pieces and make the moves. Often, if not always, both are needed.
How is the relationship between the board and the game represented? Is it 1:1 or does the game have a “history” of boards? How does that affect the payload size? In one case a move would modify the current board and in another it would add/create new boards within the game. In the 1:n case, are boards themselves chained chronologically? Would each need its own id? How does modifying an existing board, as opposed to creating a new one, affect caching? How useful is caching in either case?
As the client can still transfer “intended” illegally modified board state representations to the server, the server has to keep reverse engineering the moves from the last state known to it and validate them. Does this get any easier considering the reliance on the implied (not included) game state? Note that, even with exceptions, chess moves are easy to notice and validate. Think about your options when this is not the case.
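Here is a rough sketch of what “reverse engineering the moves” could look like as a first pass: diff two board representations and guess the kind of move from how many squares were emptied and filled. The board encoding (square name mapped to a piece code) and the function names are assumptions for illustration only.

```python
from typing import Optional

Board = dict[str, str]  # square -> piece code, e.g. {"e1": "wK"}


def changed_squares(before: Board, after: Board) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Map each square whose occupant changed to (old, new); None means empty."""
    squares = sorted(set(before) | set(after))
    return {sq: (before.get(sq), after.get(sq))
            for sq in squares if before.get(sq) != after.get(sq)}


def classify(before: Board, after: Board) -> str:
    """Guess the kind of move from the shape of the diff alone."""
    diff = changed_squares(before, after)
    emptied = [sq for sq, (_, new) in diff.items() if new is None]
    filled = [sq for sq, (_, new) in diff.items() if new is not None]
    if len(emptied) == 1 and len(filled) == 1:
        return "move, capture or promotion candidate"
    if len(emptied) == 2 and len(filled) == 2:
        return "castling candidate"
    if len(emptied) == 2 and len(filled) == 1:
        return "en passant candidate"
    return "not a single move"


before = {"e1": "wK", "h1": "wR", "e7": "bP"}
after = {"g1": "wK", "f1": "wR", "e7": "bP"}
print(classify(before, after))  # castling candidate
```

The word “candidate” is doing a lot of work here. Whether the castling is actually allowed depends on whether the king and rook have moved before, and that information lives in the game, not in the board.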
What happens with the “statelessness” of the API? How does this relate to Mr. Fielding’s note in (5.2.2): “… each request contains all of the information necessary for a connector to understand the request, independent of any requests that may have preceded it”? How about the note in 5.1.6: “Within REST, intermediary components can actively transform the content of messages because the messages are self-descriptive and their semantics are visible to intermediaries”?
In multiplayer games multiple players may attempt to make moves for the same turn / board state. How is this to be recognized and handled? We don’t want those to appear as separate moves that may even look like “undo” in some situations, possibly causing an automatic draw due to move repetition. Do we introduce expectations to do something like “optimistic locking”? If we take the “game is a history of boards” approach and chain the boards to next and previous, that “previous” board id can act as the expectation. Who updates the previous board to point to the next one? Is this beginning to look like tighter coupling between the client and the server? How does this impact caching of game and board resources?
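The “previous board id as the expectation” idea can be sketched as below. `GameStore`, `submit` and `StaleBoard` are hypothetical names, the store is in memory only, and the list index stands in for a board id. Over HTTP the same expectation is commonly carried by an `If-Match` header with an entity tag, and a failed check is answered with `412 Precondition Failed`.

```python
class StaleBoard(Exception):
    """The move was based on a board that is no longer the current one."""


class GameStore:
    def __init__(self, initial_board: dict[str, str]):
        # The game as a history of boards; the list index doubles as board id.
        self.boards = [initial_board]

    @property
    def current_id(self) -> int:
        return len(self.boards) - 1

    def submit(self, expected_id: int, new_board: dict[str, str]) -> int:
        """Append new_board only if the caller saw the latest board."""
        if expected_id != self.current_id:
            raise StaleBoard(
                f"expected board {expected_id}, current is {self.current_id}")
        self.boards.append(new_board)
        return self.current_id


store = GameStore({"e2": "wP"})

# Two clients both looked at board 0 and both try to move.
first = store.submit(0, {"e4": "wP"})
try:
    store.submit(0, {"e3": "wP"})
    second = "accepted"
except StaleBoard:
    second = "rejected"

print(first, second)  # 1 rejected
```

The second client now has to fetch the current board and decide again, which is one concrete form of the coupling asked about above.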
Option 3: Split out the pieces
Perhaps we should directly model the pieces that move? That seems to align with most of the moves, right? Now we have:
The game state (as a resource)
Optionally one or more board states representing positions over time.
32 standard chess pieces (individual resources) on or off the board + additional pawn promotion pieces.
Relationships: either the game or the boards must be linked with the pieces on or off the board, at any given “position”. Note that every relationship goes both ways, regardless of how it is stored or represented.
Getting the initial game state now requires 33 or 34 resources: 1 game resource and 32 linked pieces, perhaps via the board resource. Are you thinking of embedding linked resource states inside a composite resource that overlaps individual ones? Well:
Will you maintain uniformity and allow manipulation of each resource type, both the composite and individually? Is that not just more work? Will you sacrifice aspects of uniformity and make composites read-only? How should intermediaries behave given multiple different ways the same data is communicated?
Given a piece state, can the client know all the possible moves without referring to data from prior requests? How would it know that the move won’t land on top of another piece of the same player?
Given a new/intended piece state (position), what does the server have to look up to perform validation? The request does not include all the data as required by Fielding 5.2.2: “… each request contains all of the information necessary for a connector to understand the request, independent of any requests that may have preceded it.”
How does one execute a move affecting more than one piece: capture, castling, en passant? Should pawn promotion change the piece’s type or exchange the pieces (pawn goes off the board, new piece comes in)? Would you need to implement transactions? But how? Composites again?
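To show what “transactions” across piece resources might amount to, here is a small all-or-nothing sketch. The piece ids, the `apply_atomically` helper and the single validation rule are assumptions of mine. A real server would need far more rules and some form of durable storage.

```python
import copy
from typing import Callable

Pieces = dict[str, dict]  # piece id -> {"kind": ..., "square": ...}


def apply_atomically(pieces: Pieces, updates: dict[str, dict],
                     validate: Callable[[Pieces], bool]) -> Pieces:
    """Apply every update to a copy; return it only if the result is valid."""
    draft = copy.deepcopy(pieces)
    for piece_id, change in updates.items():
        if piece_id not in draft:
            raise KeyError(piece_id)
        draft[piece_id].update(change)
    if not validate(draft):
        raise ValueError("combined update rejected")
    return draft  # the original `pieces` is never modified


def no_shared_squares(pieces: Pieces) -> bool:
    """Illustrative rule: no two pieces on the board share a square."""
    occupied = [p["square"] for p in pieces.values() if p["square"] is not None]
    return len(occupied) == len(set(occupied))


pieces = {
    "white-king": {"kind": "K", "square": "e1"},
    "white-rook-h": {"kind": "R", "square": "h1"},
}

# Castling touches two piece resources in one request.
castled = apply_atomically(
    pieces,
    {"white-king": {"square": "g1"}, "white-rook-h": {"square": "f1"}},
    no_shared_squares,
)
print(castled["white-king"]["square"], castled["white-rook-h"]["square"])  # g1 f1
```

Notice that the request body here is no longer a representation of any single piece. It is a composite, which brings back the questions about composites asked above.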
Option 4: Moves are resources
OK, we’re getting away from resource = thing now. An action is now a resource. Let’s think about that:
We can create new moves and read the past ones. We can also create multi-piece / transaction moves.
How about modifying existing/past moves? No?
How do clients know the positions of all pieces? Do they replay all the moves or will there be special resources to help? Would they be read-only? Are there any caching implications?
Once again, what do we do about (Fielding 5.2.2) “each request contains all of the information necessary for a connector to understand the request, independent of any requests that may have preceded it”?
Do you have any expectations (some do) of staying with “HTTP verbs” (methods: GET, POST, …) being the only action identifiers? If you do, does this feel like cheating or breaking that expectation?
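If moves are the resources, the board becomes something derived. The sketch below replays an append-only list of moves to rebuild the positions. The move shape (`steps`, `removes`, `becomes`) is an assumption invented for this example, and the position is trimmed down to a handful of pieces.

```python
Board = dict[str, str]  # square -> piece code


def replay(initial: Board, moves: list[dict]) -> Board:
    """Rebuild the current board by applying every recorded move in order."""
    board = dict(initial)
    for move in moves:
        # Squares cleared without being landed on (needed for en passant).
        for square in move.get("removes", []):
            board.pop(square)
        # One step per moved piece; castling has two steps.
        for step in move["steps"]:
            piece = board.pop(step["from"])
            # Landing on an occupied square overwrites it (a capture);
            # "becomes" swaps the piece code (a pawn promotion).
            board[step["to"]] = step.get("becomes", piece)
    return board


initial = {"e2": "wP", "e1": "wK", "h1": "wR", "d7": "bP", "e8": "bK"}
moves = [
    {"id": 1, "steps": [{"from": "e2", "to": "e4"}]},
    {"id": 2, "steps": [{"from": "d7", "to": "d5"}]},
    {"id": 3, "steps": [{"from": "e4", "to": "d5"}]},   # capture
    {"id": 4, "steps": [{"from": "e8", "to": "e7"}]},
    {"id": 5, "steps": [{"from": "e1", "to": "g1"},     # castling
                        {"from": "h1", "to": "f1"}]},
]

print(replay(initial, moves))
```

The cost of `replay` grows with the length of the game, which is one reason to wonder about the special read-only resources mentioned above.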
Option 5: I’ve got something better
You do? You may. How much effort did you put in to figure it out? Is it easy to explain to others? Would they understand? Is it truly, fully REST? Would you share it? Please do.
Option 6: REST is something else
Perhaps I have a twisted view of REST and you feel that some challenges I list don’t apply. I explicitly accept that possibility. In that case, I’d appreciate you reading part 1 and helping people like me understand it better. Be ready for many questions!
Option 7: REST isn’t a fit
So you don’t like other options and have opted to step outside the REST principles. Perhaps you will have dedicated endpoints that accomplish just what each individual use case needs, without trying to force square pegs through round holes? Maybe a part of the API can be “REST-ish”, such as the state retrieval part? Perhaps you’ll ignore caching as some combination of unachievable and/or irrelevant? You may also find that statelessness isn’t so important. Whatever you do, you may still be able to document your API using the OpenAPI Specification. It is, after all, about HTTP APIs, not (just) REST, so as long as you stick to HTTP, you’ll do fine. The desire to use standard tooling and other reasons may also lead you to other common API styles.
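For contrast, here is roughly what “dedicated endpoints” could look like: one small handler per use case, each receiving only what that use case needs. The route strings, handler names and game fields are all hypothetical, and no HTTP framework is involved. This is a sketch of the shape, not a recommendation.

```python
def offer_draw(game: dict, player: str) -> dict:
    return {**game, "draw_offered_by": player}


def accept_draw(game: dict, player: str) -> dict:
    # Only the opponent of whoever offered the draw may accept it.
    if game.get("draw_offered_by") in (None, player):
        raise ValueError("no draw offer from the opponent to accept")
    return {**game, "status": "finished", "winner": None}


def accept_defeat(game: dict, player: str) -> dict:
    winner = "black" if player == "white" else "white"
    return {**game, "status": "finished", "winner": winner}


# Hypothetical routes: the action is named in the path, not implied by a diff.
ROUTES = {
    "POST /games/{id}/draw-offer": offer_draw,
    "POST /games/{id}/draw-acceptance": accept_draw,
    "POST /games/{id}/defeat": accept_defeat,
}


def handle(route: str, game: dict, player: str) -> dict:
    return ROUTES[route](game, player)


game = {"status": "in_progress", "draw_offered_by": None}
game = handle("POST /games/{id}/draw-offer", game, "white")
game = handle("POST /games/{id}/draw-acceptance", game, "black")
print(game["status"], game["winner"])  # finished None
```

The server no longer has to guess the intent from a changed representation, at the price of an interface that is specific to chess rather than uniform.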
Note: I am not judging. Why do you think I listed all these options?
So what’s the answer?
Did you come here to find answers? Did you assume that I have one to offer? I created this site to entice you to think on your own and form opinions you’ll be able to understand and trust.
How about you? Do you know of a non-trivial yet successfully carved proper REST API in production that you can showcase? How does it address the challenges touched upon here? How much more is your server than just a data store and a gatekeeper? Does it serve only one kind of client or many? How did you optimize the runtime performance, if needed? Did you need to implement detection and prevention of malicious requests, such as denial-of-service attacks, and if so, how did you do that?
Maybe you fit in a different group that has never made, used or seen a proper REST API, even though some may have those letters in their names. Is this the case with you?
See you in the next post!