Long-running business processes exist in many systems.
Whether the steps are automated, manual, or a combination,
effective handling of these processes is critical.
NServiceBus employs event-driven architectural principles
to bake fault-tolerance and scalability into these processes.
The Saga is a pattern that addresses these challenges
uncovered by the relational database community years ago,
packaged in NServiceBus for ease of use by developers.
Long-Running
One of the common mistakes developers make when designing distributed systems is based on the assumptions that time is constant. If something runs quickly on their machine, they're liable to assume it will run with a similar performance characteristic in production.
Network invocations (like web service calls) are misleading this way. When invoked on the developers local machine, they perform well. In production, across firewalls and datacenters, they don't perform nearly as well.
While a single web service invocation need not be considered "long-running", once we have two or more calls within a given use case, we need to take into account issues of consistency: the first call may be successful but the second call can timeout. Sagas give us the ability to code for these cases in a simple and robust fashion.
Design processes with more than one remote call using sagas.
While it may seem excessive at first, the business implications of your system getting out of sync with the other systems it interacts with can be substantial. It's not just about exceptions that end up in your log files.
Long-Running Means Stateful
Any process that involves multiple network calls (or messages sent and received) has interim state. That state may be kept in memory, persisted to disk, or stored in a distributed cache; it may be as simple as 'Response 1 received, pending response 2', but the state exists.
Using NServiceBus, you can explicitly define the data used for this state by implementing the interface IContainSagaData - all public get/set properties will be persisted by default:
public class MySagaData : IContainSagaData { // the following properties are mandatory public Guid Id { get; set; } public string Originator { get; set; } public string OriginalMessageId { get; set; } // all other properties you want persisted }
By default NServiceBus stores your sagas in RavenDB, the schemaless nature of document databases makes them a perfect fit for saga storage where each saga instance is persisted as a single document. There is also support for relational databases using NHibernate. The NHibernate support is located in the NServiceBus.NHibernate assembly. You can, as always, swap out these technologies - just implement the IPersistSagas interface.
Adding behavior
The important part of a long-running process is its behavior. With NServiceBus, you specify behavior by writing a class which implements ISaga<T> where T is the saga data. There is also a base class for sagas which contains many features necessary when implementing long-running processes. All the examples below will make use of this base class.
Just like regular message handlers the behavior of a saga is implemented via the IHandleMessages<M> interface for the message types to be handled. Here is a saga that processes messages of type Message1 and Message2:
public class MySaga : Saga<MySagaData>, IHandleMessages<Message1>, IHandleMessages<Message2> { public void Handle(Message1 message) { // code to handle Message1 } public void Handle(Message2 message) { // code to handle Message2 } }
Starting and correlating sagas
Since a saga manages the state of a long-running process, the first question that needs to be addressed is under which conditions should a new saga be created. Sometimes it's as simple as the arrival of a given message type. In our previous example, let's say that a new saga should be started every time a message of type Message1 arrives - that's done like this:
public class MySaga : Saga<MySagaData>, IAmStartedByMessages<Message1>, IHandleMessages<Message2> { public void Handle(Message1 message) { // code to handle Message1 } public void Handle(Message2 message) { // code to handle Message2 } }
Notice that we've replaced IHandleMessages<Message1> with IAmStartedByMessages<Message1>. This interface tells NServiceBus that the saga not only handles Message1, but that when that type of message arrives, a new instance of this saga should be created to handle it.
Now the question we have to deal with is how to correlate a Message2 message to the right saga that's already running. Usually, there's some applicative ID in both types of messages that we can use to correlate between them - the only thing we need to do is to store this in the saga data, and tell NServiceBus about the connection. Here's how that would look:
public class MySaga : Saga<MySagaData>, IAmStartedByMessages<Message1>, IHandleMessages<Message2> { public override void ConfigureHowToFindSaga() { ConfigureMapping<Message2>(s => s.SomeID, m => m.SomeID); } public void Handle(Message1 message) { this.Data.SomeID = message.SomeID; // rest of the code to handle Message1 } public void Handle(Message2 message) { // code to handle Message2 } }
What happens underneath the covers is that when Message2 arrives, NServiceBus goes to the saga persistence infrastructure, and asks it to find an object of the type MySagaData that has a property SomeID whose value is the same as the SomeID property of the message.
Uniqueness
To have NServiceBus ensure uniqness across your saga instances it's highly recommended that you adorn your correlation properties with the [Unique] attribute. This tells NServiceBus that there can be only one instance for each property value. This will also increase performance for property lockups in most cases. While there are plans for it NServiceBus doesn't currently support one message being mapped to more than one instance of the same saga.
Read more about the Unique property and concurrency here.
Notifying callers of status
While you always have the option of publishing a message at any time in a saga, sometimes you want to notify the original caller who caused the saga to be started of some interm state that isn't relevant to other subscribers.
If you tried to use Bus.Reply() or Bus.Return() to communicate with the caller, that would only achieve the desired result in the case where the current message came from that client - and not in the case where any other partner sent a message arriving at that saga.
For this reason, you can see that the saga data contains the original client's return address. It also contains the message ID of the original request so that the client will be able to correlate status messages on it's end.
Here's how you'd communicate status in our ongoing example:
public class MySaga : Saga<MySagaData>, IAmStartedByMessages<Message1>, IHandleMessages<Message2> { public override void ConfigureHowToFindSaga() { ConfigureMapping<Message2>(s => s.SomeID, m => m.SomeID); } public void Handle(Message1 message) { this.Data.SomeID = message.SomeID; // rest of the code to handle Message1 } public void Handle(Message2 message) { // code to handle Message2 ReplyToOriginator(new AlmostDoneMessage { SomeID = message.SomeID }); } }
This is one of the methods on the Saga base class that would be very difficult to implement yourself without tying your applicative saga code to low-level parts of the NServiceBus infrastructure.
Timeouts
When working in a message-driven environment, we cannot make assumptions about when the next message will arrive. While the connectionless nature of messaging prevents our system from bleeding expensive resources while waiting, there is usually an upper limit on how long we should wait from a business perspective. At that point, some business-specific action should be taken. This is shown below:
public class MySaga : Saga<MySagaData>, IAmStartedByMessages<Message1>, IHandleMessages<Message2>, IHandleTimeouts<MyCustomTimeout> { public override void ConfigureHowToFindSaga() { ConfigureMapping<Message2>(s => s.SomeID, m => m.SomeID); } public void Handle(Message1 message) { this.Data.SomeID = message.SomeID; RequestUtcTimeout<MyCustomTimeout>(TimeSpan.FromHours(1)); // rest of the code to handle Message1 } public void Timeout(MyCustomTimeout state) { // some business action like: if (!Data.Message2Arrived) ReplyToOriginator(new TiredOfWaitingForMessage2()); } public void Handle(Message2 message) { // code to handle Message2 Data.Message2Arrived = true; ReplyToOriginator(new AlmostDoneMessage { SomeID = message.SomeID }); } }
The RequestUtcTimeout<T> method on the base class tells NServiceBus to send a message to what is called a Timeout Manager(TM) which will durably keep time for us. In NServiceBus each endpoint will host a TM by default so there is no configuration needed to get this up and running.
When time is up, the Timeout Manager sends a message back to the saga causing its Timeout method to be called with the same state message originally passed.
Important: don't assume that other messages haven't arrived in the meantime.
Ending a long-running process
After receiving all the messages needed in a long-running process, or possibly after a timeout (or two, or more) you'll want to clean up the state that was stored for the saga. This is done simply by calling the MarkAsComplete() method.
The infrastructure contacts the Timeout Manager (if an entry for it exists) telling it that timeouts for the given saga ID can be cleared.
If any messages relating to that saga arrive after it has completed, they will be discarded. If you want a copy of these messages to be maintained, that can be handled by the generic audit functionality in NServiceBus as explained here.
Complex saga finding logic
Sometimes there are certain message types handled by a saga which do not have a single simple property that can be mapped to a specific saga instance. In those cases, you'll want finer-grained control of how to find a saga instance. This is done as follows:
public class MySagaFinder : IFindSagas<MySagaData>.Using<Message2> { public MySagaData FindBy(Message2 message) { //your custom finding logic here } }
You can have as many of these classes as you want for a given saga or message type. If a saga can't be found, return null, and if the saga specifies that it is to be started for that message type, NServiceBus will know that a new saga instance is to be created.
Sagas in self-hosted endpoints
When hosting NServiceBus in your own endpoint (as described here), make sure to include .Sagas().RavenSagaPersister() as follows:
NServiceBus.Configure.With()
.Log4Net()
.DefaultBuilder()
.XmlSerializer()
.MsmqTransport()
.UnicastBus()
.Sagas()
.RavenSagaPersister()
.CreateBus()
.Start();
Sagas and automatic subscriptions
In NServiceBus 3.0 and onwards the autosubscription feature applies to sagas as well as your regular message handlers. This is a change compared to earlier versions of NServiceBus.
Sagas and request/response
Sagas often plays the role of coordinator, especially when used in integration scenarios. In essence this mean that the saga decides what to do next and then asks some one else to do it. This allows you to keep your sagas free from interacting with no transactional things like file systems, rest services, etc. The type of communication pattern best suited best for these type of interactions is the request/response pattern since there is really only one party interested in the response and that is the saga itself.
A typical scenario would be a saga controlling the process of billing a customer through visa or mastercard. In this case you would probably have separate endpoints for making the webservice/rest-calls to each payment provider and a saga coordinating retries and fallback rules. Each payment request would be a separate saga instance so how would we know which instance to hydrate and invoke when the response comes back?
The usual way to do this is to correlate on some kind of ID and let the user tell us how to find the correct saga instance using that ID. While this is easily done we decided that this was common enough to warrant native support in NServiceBus for these type of interactions.
In 3.0 we can now handle all this for you without getting in your way. If you do a IBus.Reply in response to a message coming from a saga we’ll detect this and automatically set the correct headers so that we can correlate the reply back to the saga instance that issued the request. You can see all this in action in our Manufacturing sample.
