Clustering
FsShelter topologies can run under cluster supervision for automatic master/standby failover — no external coordinator (ZooKeeper, etcd, etc.) required. Add a few configuration options to your existing topology and swap Hosting.run for Supervisor.runCluster.
Only one peer in the cluster — the master — runs the topology at any time. Standby peers gossip among themselves and monitor liveness. If the master goes down, the remaining peers elect a new master and activate the topology there.
Key properties:
- No external dependencies — peers communicate directly over TCP; no coordinator service needed.
- Deterministic election — Rendezvous (HRW) hashing; every peer computes the same result independently.
- Transparent fallthrough — when
CLUSTER_SEEDSis empty,Supervisor.runClusterdelegates toHosting.runWithfor single-process parity. Same topology code, zero cluster overhead. - Same topology code — sync, async, reliable, unreliable — all component types work under cluster supervision without changes.
Quick start
Configure your topology with cluster options and use Supervisor.runCluster instead of Hosting.run:
open FsShelter.Cluster
let shutdown =
myTopology
|> withConf [ CLUSTER_LISTEN "0.0.0.0:6700"
CLUSTER_SEEDS "node1:6700,node2:6700,node3:6700"
CLUSTER_QUORUM 2 ]
|> Supervisor.runCluster (fun _ _ -> ignore)
match shutdown with
| Ok stop -> stop() // graceful shutdown
| Error msg -> eprintfn "Cluster startup failed: %s" msg
runCluster returns Result<unit -> unit, string> — startup failures (bad config, transport bind, peer-id I/O) are returned as Error rather than thrown.
How it works
The supervisor is a pure state machine (Supervisor.step) driven by external events. The impure orchestration (runClusterWith) wires together the TCP transport, periodic heartbeats, membership tracking, and election recompute.
Supervisor states
|
State |
What's happening |
|---|---|
Standby |
Not the elected master. Only gossip and transport machinery are running. |
Activating |
Just won the election. Debouncing for |
Master |
Elected master and the topology is running. |
Stopping |
Shutdown was requested. Terminal state — no further transitions. |
The transition from Activating → Standby (lost mastership before stabilize window expires) is key: the topology was never started, so no StopTopology effect is emitted.
Failure detection (SWIM)
Peers detect failures using a lightweight gossip protocol inspired by SWIM:
- Each peer sends heartbeat pings at
CLUSTER_HEARTBEAT_MSintervals to all known peers, piggy-backing membership digests (peer state + incarnation numbers). - If a peer stops responding, it transitions from Alive → Suspect after
CLUSTER_SUSPECT_TIMEOUT_MSwith no heartbeat received. - A suspected peer can refute by responding with a newer incarnation — transitioning back to Alive.
- If no refutation arrives, the peer transitions from Suspect → Dead and is evicted.
Member states
|
Quorum: Only Alive and Suspect peers count toward quorum. The topology activates only when the cluster has at least CLUSTER_QUORUM live members AND the local peer is the elected master.
Epochs: Membership changes (joins, deaths) bump the cluster epoch. Only epoch changes trigger election recompute — suspect flaps do not.
Leader election (HRW)
FsShelter uses Rendezvous Hashing (Highest Random Weight) for deterministic, decentralized leader election.
How it works
Each peer independently computes a score for every live peer:
score(peer) = SHA-256(topologyName | epoch_BE8 | peer_GUID)
The peer with the highest score wins. Ties are broken by PeerId ordinal comparison.
Key properties
- Deterministic — all peers compute the same result independently; no voting or coordination messages needed.
- Epoch stickiness — folding the epoch into the hash input makes mastership stable: a higher-scored peer rejoining without a membership change cannot steal the role until the live set shifts (which bumps the epoch).
- Minimal disruption — when a peer joins or leaves, on average only 1/N of the election results change.
Quorum requirement
The topology only activates when MemberView.hasQuorum is satisfied. A cluster of 3 peers with CLUSTER_QUORUM 2 can tolerate 1 failure. A cluster of 5 with CLUSTER_QUORUM 3 can tolerate 2.
Stabilization window
After winning an election, the master waits CLUSTER_STABILIZE_MS (default: 2000ms) before activating the topology. This debounce window prevents rapid start/stop when membership is still settling (e.g., peers starting up in quick succession).
If mastership is lost during the stabilization window, the peer returns to Standby without ever starting the topology — avoiding unnecessary initialization and teardown.
Configuration reference
Option |
Default |
Effect |
|---|---|---|
|
(none) |
Comma-separated |
|
(none) |
|
|
(none) |
Minimum live peers required before topology activation. Typically |
|
1000 |
Heartbeat ping interval in milliseconds. |
|
5000 |
Time without heartbeat before marking a peer as Suspect, then Dead. |
|
2000 |
Post-election debounce window in milliseconds. |
|
|
Directory for persistent state ( |
|
1024 |
Per-peer outbound message queue limit. |
|
262144 (256KB) |
Maximum single frame size on the wire. |
Example with all options:
myTopology
|> withConf [ CLUSTER_LISTEN "0.0.0.0:6700"
CLUSTER_SEEDS "node1:6700,node2:6700,node3:6700"
CLUSTER_QUORUM 2
CLUSTER_HEARTBEAT_MS 1500
CLUSTER_SUSPECT_TIMEOUT_MS 7500
CLUSTER_STABILIZE_MS 3000
CLUSTER_STATE_DIR "/var/lib/myapp/cluster"
CLUSTER_SEND_QUEUE_BOUND 2048
CLUSTER_MAX_FRAME_BYTES (512 * 1024) ]
Graceful shutdown
When the shutdown function is called:
- A Leaving message is sent to all known peers — they can immediately mark this peer as Dead and re-elect without waiting for the suspect timeout.
- The heartbeat timer is disposed.
- A
Shutdowninput is posted to the supervisor mailbox. - If the peer is the current master, the topology is stopped (using the standard shutdown sequence: stop spouts → drain window → stop bolts → stop ackers).
- The TCP transport is stopped.
Single-process fallthrough
If CLUSTER_SEEDS is empty (and CLUSTER_LISTEN is not set), Supervisor.runCluster delegates to Hosting.runWith — your topology runs in single-process mode with zero cluster overhead. This means you can use Supervisor.runCluster as your only entry point and control the mode via configuration:
// Development: no cluster config → single-process
myTopology |> Supervisor.runCluster log
// Production: add cluster config → supervised failover
myTopology
|> withConf [ CLUSTER_LISTEN "0.0.0.0:6700"
CLUSTER_SEEDS "node1:6700,node2:6700,node3:6700"
CLUSTER_QUORUM 2 ]
|> Supervisor.runCluster log
Advanced: custom dependencies
For testing or alternate hosting scenarios, Supervisor.runClusterWith accepts a ClusterDeps<'t> record that provides all external capabilities:
type ClusterDeps<'t> =
{ NowMs: unit -> int64 // clock
LoadOrCreatePeerId: string -> Result<PeerId, StoreError> // persistent identity
LoadPeersCache: string -> Result<PeerEndpoint list, StoreError>
SavePeersCache: string -> PeerEndpoint list -> Result<unit, StoreError>
CreateTransport: ... -> PeerTransport // TCP transport
RunTopology: (int -> Log) -> Topology<'t> -> (unit -> unit) } // topology host
Production defaults are provided by Supervisor.defaultDeps, which uses the system clock, on-disk stores, real TCP transport, and Hosting.runWith for topology hosting. Override individual fields for integration testing or custom environments.
See also
- Running Topologies — Entry points, configuration, diagnostics for single-process hosting
- Architecture — Internal runtime structure: tasks, executors, channels, lifecycle
- Message Flow — End-to-end processing scenarios with sequence diagrams
val int64: value: 'T -> int64 (requires member op_Explicit)
--------------------
type int64 = System.Int64
--------------------
type int64<'Measure> = int64
val string: value: 'T -> string
--------------------
type string = System.String
module Result from Microsoft.FSharp.Core
--------------------
type Result<'T,'TError> = | Ok of ResultValue: 'T | Error of ErrorValue: 'TError
val int: value: 'T -> int (requires member op_Explicit)
--------------------
type int = int32
--------------------
type int<'Measure> = int
FsShelter