Boxers or Briefs, Stateful or Stateless

August 2nd, 2008

In this article, I’ll compare two opposing design paradigms for session data management in a session-based online application. Which is better, stateful or stateless?

Epic clashes – in changing rooms near you!
Human history is marked with epic clashes of ideologies: Darwinism vs. Creationism, Capitalism vs. Socialism and Boxers vs. Briefs. Some of these ideas have played out over centuries on the global stage. Sometimes a clear winner emerges but most times not.

Our own software engineering world is not immune to ideological clashes. One such example is the design of server-side applications: should they be designed as stateful or stateless? It pits those who believe in the performance of statefulness against those who believe in the simplicity of statelessness. It is divisive because the choice is not always clear. When an application is being designed, software engineers may have limited requirements in hand, a short-term vision, or too little experience to make the right decision. In addition, the technological landscape is always shifting. As new technologies emerge, assumptions are invalidated and rules rewritten.

Demanding online applications
Online bookstores, instant messaging services and online travel agencies are examples of session-based online applications. The more popular brands have very demanding requirements. Their throughput is measured in the thousands of transactions per second, their response times in milliseconds, and their availability is north of “five-nines”. (That’s 99.999% uptime, or a little under one second of downtime a day, roughly five minutes a year.) Being session-based means that a user establishes a session, perhaps via a sign-in, and that a conversation exists between client and server. Consequently, each request is implicitly bound to a session context typically stored on the server side. Deciding how the application manages this data is the focus of this article.

Managing session data
In some ways, session-based applications face tougher engineering challenges than their sessionless counterparts. This is because session data must be managed. The larger and more dynamic the data, the tougher the challenge will be. Sessions can include data describing the end-user (user name, address), dialog data (current request, time of request, HTTP headers, browser capabilities), and the shopping cart data representing merchandise being purchased. Since high-traffic sites require the application to run on a cluster of back-end servers, the data must be accessible regardless of which server processes the request. There are essentially two ways to solve this problem: a) bring the data to the thread or b) bring the thread to the data.

The first option is really all about externalizing the data and making it accessible from any server. Externalization renders the application stateless because servers neither own nor encapsulate the session data. The application is still session-based, yet its components hold no end-user state.

The second option keeps session data internal and owned by a given server. It internalizes the data to that server. This has the effect of making the application stateful because components within the application encapsulate behaviour and data.

If we need to design a session-based online application, with high throughput requirements and a large session data footprint, which paradigm serves us best? The ultimate choice will have a profound effect on the design, performance, scalability, operability and reliability of the system. Let’s explore the effects of stateness against these five pillars.

Statelessly simple
At first glance, designing applications statelessly is simpler. A cluster of homogeneous servers hosting the application can be deployed and requests can be load-balanced round-robin to any server in the cluster. This is standard fare for HTTP-based load balancers. Once a request lands on a server, the application can access the session data remotely. In the simplest case, this data is read from and written to a shared database. After each request, the modified data is written back to the database.
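
The load, modify, store cycle described above might look like the following minimal sketch. `SessionStore`, `InMemoryStore` and `CartHandler` are hypothetical names introduced for illustration, and the map-backed store merely stands in for the shared database so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over the shared external store (a database in the text).
interface SessionStore {
    List<String> load(String sessionId);
    void save(String sessionId, List<String> cart);
}

// Stand-in for the database so the sketch is self-contained.
class InMemoryStore implements SessionStore {
    private final Map<String, List<String>> rows = new HashMap<>();

    public synchronized List<String> load(String sessionId) {
        // Return a copy: callers never hold a reference into the store.
        return new ArrayList<>(rows.getOrDefault(sessionId, new ArrayList<>()));
    }

    public synchronized void save(String sessionId, List<String> cart) {
        rows.put(sessionId, new ArrayList<>(cart));
    }
}

// A stateless server: it keeps no session fields, only a handle to the store.
class CartHandler {
    private final SessionStore store;

    CartHandler(SessionStore store) { this.store = store; }

    int addItem(String sessionId, String item) {
        List<String> cart = store.load(sessionId);  // 1. bring the data to the thread
        cart.add(item);                             // 2. do the work
        store.save(sessionId, cart);                // 3. write the modified data back
        return cart.size();
    }
}

class StatelessDemo {
    public static void main(String[] args) {
        SessionStore shared = new InMemoryStore();
        CartHandler serverA = new CartHandler(shared);
        CartHandler serverB = new CartHandler(shared);

        serverA.addItem("s1", "flight");
        // The next request lands on a different server; the session follows it.
        System.out.println(serverB.addItem("s1", "hotel"));
    }
}
```

Because neither handler holds any session field, either one can serve any request, which is exactly what lets a plain round-robin load balancer work.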

An accidental by-product is that the session is always recoverable should a server crash. As an end-user, it’s nice to know that my flight and hotel are still booked if a server crashes before I have paid for my exotic vacation. In fact, a load balancer can seamlessly redirect my request to another server and I may not even notice the glitch. Thus session data recovery is built-in. Additionally, it can greatly simplify the life of a system administrator, who can shut down any server without impacting the end-user. Stateless applications are therefore easier to host and get by with off-the-shelf infrastructure.

Statefully fast
Stateful applications, on the other hand, are inherently more complex. This is the price to pay for higher performance. Statefulness goes hand in hand with caching, which, in turn, is the gateway to high performance. There are many forms of caching but they can be summarized as either being caches internal to the application, stored in a hash map for example, or caches external to the JVM, such as distributed and remote caches. With respect to statefulness, only the first category is applicable since the latter is just another type of data externalization and therefore more closely associated with statelessness from an application point of view.

Stateful applications can take advantage of the fact that all session data is readily available for consumption in its native object form. Data does not need to be externalized before and after every request and this saves considerable CPU cycles. There is no need to design complex object-to-relational data mapping. Regardless of the number of tools available in this space, that mapping exacts a price in both design effort and performance.
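
The in-process caching described above might look like this minimal sketch. `CachedSession` and `StatefulServer` are hypothetical names, and the shopping cart is reduced to a list of strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical session object, kept on the heap in its native form.
class CachedSession {
    final String userName;
    final List<String> cart = new ArrayList<>();

    CachedSession(String userName) { this.userName = userName; }
}

// A stateful server owns its sessions: no load, no save, no mapping layer.
class StatefulServer {
    private final Map<String, CachedSession> sessions = new ConcurrentHashMap<>();

    void signIn(String sessionId, String userName) {
        sessions.putIfAbsent(sessionId, new CachedSession(userName));
    }

    int addItem(String sessionId, String item) {
        CachedSession s = sessions.get(sessionId);
        if (s == null) {
            // The session lives on another server, or was lost in a restart.
            throw new IllegalStateException("unknown session " + sessionId);
        }
        synchronized (s) {  // guard the cart against concurrent requests
            s.cart.add(item);
            return s.cart.size();
        }
    }
}
```

The request path is a map lookup and a method call. The `IllegalStateException` branch hints at the cost: a request that reaches the wrong server finds nothing.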

Since reading an object already in local memory is orders of magnitude faster than fetching it over the network from an external store, statefulness naturally translates into higher performance and a higher degree of scalability for the entire application. Stateful applications thus circumvent design issues related to externalization as well as benefit from higher performance.

Statelessly complicated
On the other hand, statelessness has an alluring air of simplicity that can be deceiving. If you take the naïve approach and externalize session data after every request, you will simply shift all of the pain onto the external store. If, for example, a database is used as an external store, it will need to be tuned and configured for redundancy and performance. Individual queries will also require tuning, as performance will not come standard out-of-the-box.

Externalization also has its challenges with regard to how data is transformed. There is considerable application-wide complexity to manage with object-to-relational mapping. Choosing to side-step the object-to-relational mapping issue with Java serialization is also fraught with risk. Java serialization hides externalization details from the developer. However, it hides so much, and is so easy and automatic, that serialization bugs surface only at runtime. All it takes is one unserializable attribute in a complex object graph and the entire externalization fails.
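
The following sketch illustrates that failure mode. `BookingSession`, `PriceCalculator` and `SerializationDemo` are hypothetical names; the code compiles cleanly, yet writing the session fails at runtime because one nested field does not implement `Serializable`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper that nobody remembered to mark Serializable.
class PriceCalculator {
    double rate = 1.0;
}

class BookingSession implements Serializable {
    private static final long serialVersionUID = 1L;
    String userName = "alice";
    // Compiles without complaint, but breaks externalization at runtime.
    PriceCalculator calculator = new PriceCalculator();
}

class SerializationDemo {
    // Returns the exception message (which names the offending class),
    // or null if the object serializes successfully.
    static String trySerialize(Object session) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(session);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize(new BookingSession()));
    }
}
```

Marking the field `transient` or making `PriceCalculator` serializable would avoid the exception, but the compiler will not point out which is needed.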

There are products that can help with externalization (the TimesTen in-memory database, Coherence distributed caching and Terracotta’s NAM come to mind). While these are examples of products that can tip the balance in favour of statelessness, effort will still have to be expended to get the right behaviour and performance. Buying a ready-made product that meets our stringent requirements will be expensive. Taking the simplistic approach and externalizing the session data can ironically spread complexity across the entire application code base. In the end, what seemed simple was not. There is no free lunch.

Statefully slow
Statefulness is also not a silver bullet for performance. The pain point will become the large footprint of cached session data. For Java applications in particular, large heaps that mix caches of long-lived session objects with short-lived working objects can punish the garbage collector. In high-throughput systems, this deadly combination of object behaviour will strain even the best-built JVMs.

In addition, stateful applications put reliability at risk in two ways. First, since data is cached, there is an inherent risk of bugs that cause data to linger in memory forever. Forgetting to clear data has far more serious consequences for a stateful application, because nothing else will ever reclaim it. Second, session recovery must be designed explicitly. Since memory is volatile, critical pieces of data will still need to be externalized and stored remotely.
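
One common guard against lingering session data is an idle timeout. The sketch below is an assumption rather than anything the article prescribes: `SessionCache` is a hypothetical name, the payload is a plain string, and the current time is passed in explicitly so the behaviour is easy to exercise.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SessionCache {
    // Hypothetical entry: the payload plus the time it was last used.
    private static final class Entry {
        final String data;
        volatile long lastAccessMillis;

        Entry(String data, long now) {
            this.data = data;
            this.lastAccessMillis = now;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;

    SessionCache(long idleTimeoutMillis) { this.idleTimeoutMillis = idleTimeoutMillis; }

    void put(String sessionId, String data, long now) {
        entries.put(sessionId, new Entry(data, now));
    }

    String get(String sessionId, long now) {
        Entry e = entries.get(sessionId);
        if (e == null) return null;
        e.lastAccessMillis = now;  // touching a session keeps it alive
        return e.data;
    }

    // Drops sessions idle for longer than the timeout and returns how many
    // were removed (the count is approximate under concurrent writes).
    int evictIdle(long now) {
        int before = entries.size();
        entries.values().removeIf(e -> now - e.lastAccessMillis > idleTimeoutMillis);
        return before - entries.size();
    }
}
```

A background task would call `evictIdle(System.currentTimeMillis())` periodically. Whatever must survive an eviction or a crash still has to be written to an external store.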

Statefulness also adds more infrastructure complexity because sessions become sticky to servers. The clustering and routing infrastructure will need to take into account the locality of the session and its server.
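
As a rough illustration of session locality, a router can derive the target server from the session ID. This is a minimal sketch with a hypothetical `StickyRouter` class and made-up server names. It uses simple modulo hashing, which remaps most sessions whenever the server list changes; cookie-based affinity and consistent hashing are common alternatives.

```java
import java.util.List;

class StickyRouter {
    private final List<String> servers;

    StickyRouter(List<String> servers) {
        if (servers.isEmpty()) throw new IllegalArgumentException("no servers");
        this.servers = List.copyOf(servers);
    }

    // The same session ID always maps to the same server
    // for as long as the server list is unchanged.
    String route(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), servers.size());
        return servers.get(index);
    }
}
```

For example, `new StickyRouter(List.of("app1", "app2", "app3")).route("session-42")` returns the same server name on every call, which is what keeps a request next to its cached session.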

In the end, it is unrealistic to completely avoid externalization. There’s only so much session data that can fit in memory. Some data will need to be stored remotely.

Interestingly, statelessness started out as the simpler alternative but, taken to an extreme, became very complicated. Statefulness started out as the faster alternative but, taken to an extreme, became inefficient. This shows the value of mixing both paradigms and making the right choices and trade-offs.

In the real world, only toy applications can be purely stateful or purely stateless. Real applications will fall somewhere along the stateness continuum rather than at the edges. Choices are made by analyzing how critical the session data is, how often it changes, how big the footprint, how complex the session objects, how high the throughput, how low the latency, how capable the software engineers are in managing complexity and how cheap the overall deployment cost must be.

Even with due diligence, the outcome won’t be perfect. Just like religion, you pick one and eat the meat that comes with it (or lack thereof).