FsStorm
Overview
FsStorm is a library for implementing Apache Storm components, defining topologies in an F# DSL, and submitting them for execution via F# scripts. The topology and its components can be implemented in a single EXE project and are executed by Storm via its multilang protocol as separate processes, one for each task/instance. The accompanying FsJson library is used to handle the JSON structures passed in and out of Storm.
FsStorm components
FsStorm components are defined as functions that take at least one argument, the last one being the configuration passed in from Storm. In practice, you'll want to pass all your dependencies in, which means at least one more argument: a runner, passed in from your topology. You can also pass as many additional arguments from the topology as needed. Think of the component function as the "main" of your program. Storm will start (a copy of) the same EXE for all components in the topology and will tell each instance which task it is supposed to execute. FsStorm calls the "main" function once per instance of every component; its purpose is to construct either a "next" function (for spouts) or a "consume" function (for bolts) and pass it to a runner. FsStorm implements several runners that either talk to Storm or let you unit-test your components by recording outputs or playing back inputs.
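The shape described above can be sketched without FsStorm itself. The snippet below is a minimal illustration under stated assumptions: the type aliases `Emit`, `Next` and `Runner`, the `numberSpout` component and the `recordingRunner` are hypothetical names invented for this example and are not FsStorm's actual API. It shows a component function that takes its dependencies first and the Storm configuration last, builds a "next" function, and hands it to a runner, here one that records outputs the way a unit test might.

```fsharp
// Hypothetical types for illustration only; FsStorm's real signatures may differ.
type Emit = string list -> unit        // hands one tuple (a list of fields) to the runner
type Next = Emit -> Async<unit>        // a spout's "next" function
type Runner = Next -> Async<unit>      // something that drives "next"

/// Component "main": topology arguments first, then the runner, Storm's config last.
let numberSpout (start: int) (runner: Runner) (cfg: Map<string, string>) : Async<unit> =
    // A ref cell is used because closures cannot capture `let mutable` locals.
    let counter = ref start
    let next : Next =
        fun emit ->
            async {
                emit [ string counter.Value ]
                counter.Value <- counter.Value + 1
            }
    runner next

/// Test runner (assumed design): calls "next" a fixed number of times
/// and records every emitted tuple instead of talking to Storm.
let recordingRunner (times: int) (sink: ResizeArray<string list>) : Runner =
    fun next ->
        async {
            for _i in 1 .. times do
                do! next (fun t -> sink.Add t)
        }

let recorded = ResizeArray<string list>()
numberSpout 10 (recordingRunner 3 recorded) Map.empty |> Async.RunSynchronously
// recorded now holds ["10"], ["11"], ["12"]
printfn "%A" (List.ofSeq recorded)
```

Because the runner is an ordinary argument, the same component function can be handed a runner that talks to Storm in production and a recording runner in tests, with no change to the component.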
FsStorm tuples
Storm components communicate by passing tuples to each other over streams. The tuples are emitted into streams and have a schema defined by the spout topology Output element. The Storm multilang protocol is wrapped and accessible via the included FsJson. In addition to raw JSON access, FsStorm defines several helpers (tuple, namedStream, anchor, etc.) that abstract away the specifics of the underlying multilang protocol.
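To make concrete what such helpers abstract, here is a rough sketch of the kind of message the Storm multilang protocol expects for an emit: a JSON object with a `command` of `emit`, a `tuple` array, an optional `stream` name, and (for bolts) an `anchors` array, followed by a line containing `end`. The function `emitMessage` is a hypothetical helper written for this illustration with `System.Text.Json`; it is not part of FsStorm or FsJson, and it restricts tuple fields to strings for brevity.

```fsharp
open System.IO
open System.Text
open System.Text.Json

/// Hypothetical helper (not FsStorm API): renders a multilang "emit" message.
/// stream  - optional named stream; omitted means the default stream
/// anchors - IDs of the input tuples this emit is anchored to (bolts only)
/// fields  - tuple values, limited to strings in this sketch
let emitMessage (stream: string option) (anchors: string list) (fields: string list) : string =
    use buffer = new MemoryStream()
    use writer = new Utf8JsonWriter(buffer)
    writer.WriteStartObject()
    writer.WriteString("command", "emit")
    stream |> Option.iter (fun (s: string) -> writer.WriteString("stream", s))
    if not anchors.IsEmpty then
        writer.WriteStartArray("anchors")
        anchors |> List.iter (fun (a: string) -> writer.WriteStringValue(a))
        writer.WriteEndArray()
    writer.WriteStartArray("tuple")
    fields |> List.iter (fun (f: string) -> writer.WriteStringValue(f))
    writer.WriteEndArray()
    writer.WriteEndObject()
    writer.Flush()
    // Multilang messages are terminated by a line containing only "end".
    Encoding.UTF8.GetString(buffer.ToArray()) + "\nend"

// An unanchored emit on the default stream, as a spout might send it.
let spoutEmit = emitMessage None [] [ "hello"; "world" ]
// An emit on a named stream, anchored to input tuple "42", as a bolt might send it.
let boltEmit = emitMessage (Some "words") [ "42" ] [ "hello" ]
printfn "%s" spoutEmit
printfn "%s" boltEmit
```

In a real component you would not build these messages by hand; the point is only to show the wire format that sits underneath the helpers.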
Example of a spout
Topology DSL in F#
Submitting the topology using F# scripts
Exporting the topology graph in DOT format (GraphViz) using F# scripts
Samples & documentation
FstSample contains an "unreliable" spout example: emitted tuples do not require an ack and could be lost in case of failure.
FstGuaranteedSample contains a "reliable" spout example: emitted tuples have a unique ID and require an ack.
API Reference contains automatically generated documentation for public types, modules and functions in the library.
WordCount contains a simple example showing a spout with two bolts.
Getting FsStorm
Contributing and copyright
The project is hosted on GitHub, where you can report issues, fork the project and submit pull requests. If you're adding a new public API, please also consider adding samples that can be turned into documentation. You might also want to read the library design notes to understand how it works.
The library is available under the MIT license, which allows modification and redistribution for both commercial and non-commercial purposes. For more information, see the License file in the GitHub repository.