Programming Pattern: Event Sourcing

Posted On: 2024-09-02

By Mark

I've recently been working on a side project that lets the user seek backwards and forwards through time, similar to how video editing software lets a user quickly jump to any part of a video to make changes. While looking for the best approach, I happened upon the Event Sourcing pattern, which represents a program's internal state as a series of changes. Although this pattern can be used in a variety of systems and situations, I've been a bit disappointed by the available information on it: nearly every article I've read focuses on a single use case (as one part of a specific architecture for cloud-based systems). Thus, I've decided today's post should be an exception: a brief introduction to using Event Sourcing in other kinds of systems.

What Is It?

As mentioned in the introduction, Event Sourcing is an architectural pattern that represents the current state of the program as a series of changes (a.k.a. events). A simple example of this is a bank account balance: rather than keeping track of the current balance (and changing it when withdrawals/deposits occur), a program using Event Sourcing would track a list of all withdrawals and deposits and use those to calculate the current balance on the fly. The term "Event Sourcing" refers to that specific idea: the events are the "source of truth" for the program, so anything you want to know about the present has to be derived from the history of changes recorded in the events.
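
To make the bank account example concrete, here is a minimal sketch in Python. The Deposit and Withdrawal classes and the current_balance function are names I made up for illustration, and amounts are assumed to be whole cents. Note that no balance is stored anywhere: it is recomputed from the events each time it's needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposit:
    amount: int  # assumed to be whole cents


@dataclass(frozen=True)
class Withdrawal:
    amount: int  # assumed to be whole cents


def current_balance(events):
    """Derive the balance by walking every recorded event in order."""
    balance = 0
    for event in events:
        if isinstance(event, Deposit):
            balance += event.amount
        elif isinstance(event, Withdrawal):
            balance -= event.amount
    return balance


# The list of events is the only thing that is stored.
events = [Deposit(10_000), Withdrawal(2_500), Deposit(500)]
print(current_balance(events))  # 8000
```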

Why Use It?

When you have a record of every event, you can derive not only the present state of the program but also any past state, just as easily. Beyond that, Event Sourcing makes it simple to investigate alternate possibilities: from any point in the past you can "branch" the history, and treat that point as the new present. From then on, whatever actions the program supports (e.g. withdrawals, deposits, etc.) would happen at that point, creating a kind of alternate future (or, put another way, making it effortless to roll back to a specific point).
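
As a rough sketch of both ideas, the Python below derives a past state by replaying only a prefix of the history, then starts a branch by copying that prefix and appending a new event to the copy. The tuple-based events and the balance_after function are my own placeholders, not part of any library.

```python
def balance_after(events, count):
    """Balance after applying only the first `count` events."""
    balance = 0
    for kind, amount in events[:count]:
        balance += amount if kind == "deposit" else -amount
    return balance


history = [
    ("deposit", 100),
    ("withdrawal", 30),
    ("deposit", 50),
    ("withdrawal", 80),
]

present = balance_after(history, len(history))  # 40
past = balance_after(history, 2)                # 70, as of the second event

# Branch: treat "after the second event" as the new present.
# Slicing makes a copy, so the original history is left untouched.
branch = history[:2]
branch.append(("deposit", 5))
alternate = balance_after(branch, len(branch))  # 75
```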

How To Implement It?

Using Event Sourcing adds quite a bit of complexity: most programming languages make tracking and organizing state fairly simple, but by using Event Sourcing you throw a lot of that away. If you think that trade-off is worthwhile for your project, there are a few high-level things you'll need in order to use this pattern.

First, you'll need a source of truth - the series of events. When working on a cloud-based system, there's middleware marketed for that purpose, but if you're using Event Sourcing in a standalone program, you'll likely need to build something yourself from the available collections in your language/framework of choice. While I can't recommend any specific data structure (I don't have enough experience with the pattern yet), everything I've read on the topic recommends storing the events in an append-only collection (that is, you can add events, but cannot remove any, as that would cause system-wide instability).
Second, you'll need a representation of the current state. While you could simply display the list of events to the user, it's often best to be able to display the things that a user most wants to know directly, rather than expecting them to read through the events and figure it out themselves. Importantly, although I refer to this as the "current" state, this will be "current" as of a specific event - ie. if you are looking at all events up to a specific point in time, the state should represent that specific moment in time.
Third, you'll need a set of rules for applying an event to the current state. Generally speaking, these rules should allow you to start with a blank/initial state and then apply each of the events (in order) to arrive at an accurate representation for the current state. Optionally, you may want the rules to also be reversible: if there are a lot of events, it can be useful to move from the state at one moment to the state at a different moment (rather than starting from the initial state every time).
Drawing on my own experience with this: I recommend having one rule for each type of event (if using Object-Oriented programming, each type of event can be its own class, with a pair of methods representing Apply and Reverse). I also recommend allowing "reverse" attempts to cleanly report failures (e.g. a TryReverse method): if reversal fails, you can always fall back to deriving from the initial state.
Lastly, if you intend to allow users to explore alternative timelines, you'll want some way to "branch" the current source of truth at a specific point. Depending on the scale and complexity of your project, that might be as simple as duplicating the collection that stores the series of events (I've had good results thus far simply duplicating a subset of my <1000-event collection.)
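
Putting those pieces together, here is one possible sketch in Python, not a definitive implementation. All names (State, Deposit, Withdrawal, ResetBalance, Timeline, seek, branch) are hypothetical. Each event class carries its own apply and try_reverse methods, the state is "current" as of a position in the event list, and a failed reversal falls back to replaying from the initial state. The list is append-only by convention only; nothing here enforces it.

```python
from dataclasses import dataclass


@dataclass
class State:
    balance: int = 0


class Deposit:
    def __init__(self, amount):
        self.amount = amount

    def apply(self, state):
        state.balance += self.amount

    def try_reverse(self, state):
        state.balance -= self.amount
        return True


class Withdrawal:
    def __init__(self, amount):
        self.amount = amount

    def apply(self, state):
        state.balance -= self.amount

    def try_reverse(self, state):
        state.balance += self.amount
        return True


class ResetBalance:
    """Zeroes the balance. The old balance isn't kept, so it can't be reversed."""

    def apply(self, state):
        state.balance = 0

    def try_reverse(self, state):
        return False  # report failure without touching the state


class Timeline:
    def __init__(self, events=()):
        self._events = list(events)  # source of truth; only ever appended to
        self._position = 0           # number of events applied to self.state
        self.state = State()

    def append(self, event):
        self._events.append(event)

    def seek(self, position):
        """Move the state so it reflects the first `position` events."""
        if not 0 <= position <= len(self._events):
            raise IndexError("position is outside the recorded history")
        while self._position < position:
            self._events[self._position].apply(self.state)
            self._position += 1
        while self._position > position:
            event = self._events[self._position - 1]
            if not event.try_reverse(self.state):
                self._replay(position)  # fall back to the initial state
                return
            self._position -= 1

    def _replay(self, position):
        self.state = State()
        for event in self._events[:position]:
            event.apply(self.state)
        self._position = position

    def branch(self, position):
        """New timeline whose history is a copy of the first `position` events."""
        return Timeline(self._events[:position])


timeline = Timeline([Deposit(100), Withdrawal(30), ResetBalance(), Deposit(50)])
timeline.seek(4)  # balance 50
timeline.seek(1)  # reversing ResetBalance fails, so this replays: balance 100

alternate = timeline.branch(2)
alternate.append(Deposit(5))
alternate.seek(3)  # balance 75
```

Because a replay swaps in a fresh State object, callers should read timeline.state each time rather than holding on to an old reference.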

Potential Pitfalls

While I still have fairly limited experience using it, I've already run into a few pitfalls that I attribute to this pattern.

The Future Is Trouble
The Event Source should not contain things that are planned for the future - it should only be things that are absolute (and, ideally, already in effect.) Consider, for example, signing a contract to pay X on Y date: one might assume that it's safe to add an event to record that on Y date X will be paid - but contracts can be revised/broken, and once something is added to the source of truth, it cannot be altered or removed.
Event Order Matters
Somewhat related to the previous point: the order in which events are added affects the order in which they are processed. This comes in two variants. First, using a simple ordered collection (e.g. a Queue) for the event source means that events will be processed in the exact order that they were added. Second, when using a collection that organizes events according to the time they occur (e.g. a SortedDictionary that uses the time as its key), if a prior event adds a future event, then all branches that occur after the prior event will necessarily contain that future event.
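
The two variants can be compared with a small Python sketch. Python has no built-in SortedDictionary, so a list kept sorted with bisect.insort stands in for one here. The event data is invented, and the branch step assumes a branch is made by copying everything recorded so far, which may not match how your own project branches.

```python
import bisect

# (time, name) pairs, listed in the order they were recorded.
# Signing the contract at time 1 also records the payment due at time 10.
recorded = [(1, "sign contract"), (10, "pay contract"), (5, "deposit")]

# Variant 1: insertion-ordered store, processed in the order added.
insertion_ordered = []
for event in recorded:
    insertion_ordered.append(event)

# Variant 2: time-ordered store, processed in order of the time key.
time_ordered = []
for event in recorded:
    bisect.insort(time_ordered, event)

print([name for _, name in insertion_ordered])
# ['sign contract', 'pay contract', 'deposit']
print([name for _, name in time_ordered])
# ['sign contract', 'deposit', 'pay contract']

# A branch copied from the store after time 1 already holds the future payment.
branch = list(time_ordered)
print((10, "pay contract") in branch)  # True
```
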
External Systems Can Be Trouble
Although I haven't yet encountered this myself, others have written about this risk extensively: just because one system can simulate or roll back to any point in time with ease doesn't mean other systems will be able to handle that. Consider, for example, rolling back deposits/withdrawals of money: even if all the money is tracked inside one system (e.g. customer loyalty rewards points), you still can't reverse the real-world consequences of those deposits/withdrawals (e.g. spending the points to get a physical object, or donating them to another person in exchange for favors).

Conclusion

Despite the added complexity and pitfalls, Event Sourcing remains an excellent fit for my timeline-based side project. Development's been slow, but as it comes together, this approach is making it possible to explore the infinite varieties of possible futures in a clean, intuitive way. If you find yourself in a similar situation - needing to simulate not only the present, but also alternate timelines branching off the past - I highly recommend using the Event Sourcing pattern.