cadlag dot org

State Machines and Sequence Diagrams

While reading the TCP specification in RFC 793, it struck me how important diagrammatic reasoning is in understanding such protocols.

Broadly speaking, a protocol is a set of rules describing how entities (e.g. people, machines, processes) interact in a given system. Almost inevitably these rules depend on whether certain circumstances are met, and typically the entire protocol can be described as a state machine where transitions are triggered by events and themselves imply certain actions (e.g. starting a timer).

There are many ways to describe such a state machine, but one of these most legible approaches is through a graphical diagram, where states are vertices and transitions are edges. These edges are labeled with a precondition (which must be met for the transition to occur) and an action (performed as part of the transition). For example, the TCP specification describes the connection state machine and summarizes it with a diagram. This is a full page of ASCII art – feast your eyes:

September 1981                                                          
                                           Transmission Control Protocol
                                                Functional Specification

                                    
                              +---------+ ---------\      active OPEN  
                              |  CLOSED |            \    -----------  
                              +---------+<---------\   \   create TCB  
                                |     ^              \   \  snd SYN    
                   passive OPEN |     |   CLOSE        \   \           
                   ------------ |     | ----------       \   \         
                    create TCB  |     | delete TCB         \   \       
                                V     |                      \   \     
                              +---------+            CLOSE    |    \   
                              |  LISTEN |          ---------- |     |  
                              +---------+          delete TCB |     |  
                   rcv SYN      |     |     SEND              |     |  
                  -----------   |     |    -------            |     V  
 +---------+      snd SYN,ACK  /       \   snd SYN          +---------+
 |         |<-----------------           ------------------>|         |
 |   SYN   |                    rcv SYN                     |   SYN   |
 |   RCVD  |<-----------------------------------------------|   SENT  |
 |         |                    snd ACK                     |         |
 |         |------------------           -------------------|         |
 +---------+   rcv ACK of SYN  \       /  rcv SYN,ACK       +---------+
   |           --------------   |     |   -----------                  
   |                  x         |     |     snd ACK                    
   |                            V     V                                
   |  CLOSE                   +---------+                              
   | -------                  |  ESTAB  |                              
   | snd FIN                  +---------+                              
   |                   CLOSE    |     |    rcv FIN                     
   V                  -------   |     |    -------                     
 +---------+          snd FIN  /       \   snd ACK          +---------+
 |  FIN    |<-----------------           ------------------>|  CLOSE  |
 | WAIT-1  |------------------                              |   WAIT  |
 +---------+          rcv FIN  \                            +---------+
   | rcv ACK of FIN   -------   |                            CLOSE  |  
   | --------------   snd ACK   |                           ------- |  
   V        x                   V                           snd FIN V  
 +---------+                  +---------+                   +---------+
 |FINWAIT-2|                  | CLOSING |                   | LAST-ACK|
 +---------+                  +---------+                   +---------+
   |                rcv ACK of FIN |                 rcv ACK of FIN |  
   |  rcv FIN       -------------- |    Timeout=2MSL -------------- |  
   |  -------              x       V    ------------        x       V  
    \ snd ACK                 +---------+delete TCB         +---------+
     ------------------------>|TIME WAIT|------------------>| CLOSED  |
                              +---------+                   +---------+

                      TCP Connection State Diagram
                               Figure 6.

                                                               [Page 23]

For all of its complexity, this diagram gives a single page summary of what would otherwise take many pages to describe in words. One analogy that comes to mind is with the configuration space of a physical system. The state diagram provides a complete map of possible states: at any given moment, a TCP implementation must be in one of these states.

If state diagrams summarize the structure of a system, sequence diagrams reveal its evolution over time. Typically these show how several actors, each running their own version of the state machine, interact via messages. TCP has several such diagrams in the specification. For example, on pg. 31 there is the following diagram for the “3-way handshake” used to establish a connection:

      TCP A                                                TCP B

  1.  CLOSED                                               LISTEN

  2.  SYN-SENT    --> <SEQ=100><CTL=SYN>               --> SYN-RECEIVED

  3.  ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK>  <-- SYN-RECEIVED

  4.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK>       --> ESTABLISHED

  5.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK><DATA> --> ESTABLISHED

          Basic 3-Way Handshake for Connection Synchronization

                                Figure 7.          

Here we have two hosts running TCP, and hence each in their own TCP state. As time progresses from top to bottom, these hosts communicate with TCP messages and advance from state to state. The first host, TCP A, goes from CLOSED to SYN-SENT to ESTABLISHED, while the second host, TCP B, goes from LISTEN to SYN-RECEIVED to ESTABLISHED. Although this scenario was latent in the state diagram itself, it’s much more legible in a sequence diagram.

Whether it visually or in text, defining a protocol involves describing a state machine and explaining the sequences it can produce. While the above examples are presented informally, one can imagine formalizing such diagrams to prove the mathematical correctness of a protocol1. But that’s a discussion for another time.


  1. Here I am imagining that the protocol itself is formalized as a state machine in some specification language. A correctness proof would establish that certain properties hold for all sequence diagrams generated by this state machine. ↩︎

Reply to this post by email ↪