High Availability for Event Hub Overview

Since version 2.2, Event Hub can run in a high-availability (HA) configuration of two or more horizontally scalable nodes. One node holds the live primary role and handles all communication with clients under normal operation. One of the other nodes runs in active backup mode and becomes operational only if the live primary node disappears for any reason. When an active backup node is promoted to the live role, one of the remaining passive backup nodes, if any exists, takes on the active backup role, ready to become the new live node if necessary. When the live role moves to a new node, subscriber and publisher connections automatically fail over to that node. This allows Event Hub as a system to continue functioning after losing one or more of its nodes.
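
The role hand-over described above can be illustrated with a minimal sketch. It is not Event Hub code: the Role enum, the fail_over function and the node names are hypothetical, and the choice of which passive backup is promoted (here simply the first one found) is an assumption.

```python
from enum import Enum


class Role(Enum):
    LIVE = "live"
    ACTIVE_BACKUP = "active backup"
    PASSIVE_BACKUP = "passive backup"


def fail_over(roles: dict[str, Role]) -> dict[str, Role]:
    """Return the role assignment after the current live node is lost.

    Hypothetical model: the active backup becomes live, and the first
    remaining passive backup (if any) becomes the new active backup.
    """
    # Drop the lost live node from the cluster view.
    remaining = {n: r for n, r in roles.items() if r is not Role.LIVE}

    active = next((n for n, r in remaining.items() if r is Role.ACTIVE_BACKUP), None)
    if active is None:
        return remaining  # no backup was ready, so nothing can be promoted

    remaining[active] = Role.LIVE

    # Assumption: any passive backup may be chosen; we take the first one.
    passive = next((n for n, r in remaining.items() if r is Role.PASSIVE_BACKUP), None)
    if passive is not None:
        remaining[passive] = Role.ACTIVE_BACKUP
    return remaining


cluster = {
    "node-a": Role.LIVE,
    "node-b": Role.ACTIVE_BACKUP,
    "node-c": Role.PASSIVE_BACKUP,
    "node-d": Role.PASSIVE_BACKUP,
}
after = fail_over(cluster)
```

Applying fail_over repeatedly shows how a cluster of this shape could survive the loss of several nodes in sequence, as long as a synchronized active backup exists each time.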

The primary and the active and passive backup nodes do not share data directories on the file system; instead, all data synchronization is done over the network. All persistent data received by the live node is synchronously replicated to the active backup.

The node roles are resolved during application start-up. The node that last held the live role becomes the new live primary node, and the first of the other nodes to connect once the new primary has started is elected active backup. All remaining nodes take on a passive backup role. During start-up, an active backup node must first synchronize existing data from the live primary node before it can replace the primary should it fail. This means that a replicating backup is not fully operational immediately after start, but only after it has finished synchronizing with the primary. How long this takes depends on the amount of data to be synchronized and the speed of the network connection. It also means that an active backup node always waits until the live primary node is fully started before completing its own start-up. Hence, a cold start of the Event Hub application must always include starting the node that last held the live role.
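
The start-up rules above can be summarized in a short sketch. The function resolve_roles and its parameters are hypothetical; it assumes the identity of the last live node is known and that the order in which the other nodes connect to it is given as a list.

```python
def resolve_roles(last_live: str, connect_order: list[str]) -> dict[str, str]:
    """Assign roles at cold start.

    last_live:     name of the node that last held the live role
    connect_order: all started nodes, in the order they come up or connect
    """
    if last_live not in connect_order:
        # Backups wait for the live primary, so start-up cannot complete.
        raise RuntimeError(
            f"cold start requires the last live node {last_live!r} to be started"
        )

    roles = {last_live: "live"}
    for node in connect_order:
        if node == last_live:
            continue
        # The first other node to connect becomes active backup;
        # every later node becomes a passive backup.
        if "active backup" not in roles.values():
            roles[node] = "active backup"
        else:
            roles[node] = "passive backup"
    return roles


roles = resolve_roles("node-2", ["node-1", "node-2", "node-3"])
```

In this example node-1 is started before node-2, yet node-2 still becomes live because it held the live role last; node-1 is then the first other node to connect and becomes the active backup.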