OrleanSpaces - v1.0.0

OrleanSpaces has finally reached a point in its development where I consider it production-ready. I have therefore released v1.0.0 of the package, alongside the analyzer package.

What's changed?

Method and signature changes to the agent interface.
Separation of SpaceOptions into SpaceServerOptions and SpaceClientOptions.
Direct array passing to constructors of SpaceTuple and SpaceTemplate.
Space partitioning.
Configurable agent startup behavior.
OSA004 analyzer and code fixer.

Agent Interface

Because the data is local to the agent, some functionality can now be exposed as synchronous operations rather than asynchronous ones, as was the case before. The changes below are listed for the ISpaceAgent interface, but the same apply to the generic version ISpaceAgent<T, TTuple, TTemplate>.

The ValueTask<int>-returning method CountAsync() has been replaced with a read-only int property called Count.
The ValueTask<SpaceTuple>-returning method PeekAsync(SpaceTemplate template) has been replaced with a synchronous version called Peek(SpaceTemplate template). This is possible because the data is in memory and directly accessible to the agent.
⚠️ Note that the version accepting a callback, PeekAsync(SpaceTemplate template, Func<SpaceTuple, Task> callback), remains asynchronous.
The ValueTask<IEnumerable<SpaceTuple>>-returning method ScanAsync() has been replaced with a synchronous version called Enumerate(SpaceTemplate template = default). If a SpaceTemplate argument is not passed, or it is default(SpaceTemplate), then all tuples in the space are enumerated. Otherwise, only tuples that match the template are enumerated.
The IAsyncEnumerable<SpaceTuple>-returning method PeekAsync() has been replaced with another asynchronous version called EnumerateAsync(SpaceTemplate template = default). This was done for consistency with Enumerate(template). The same behavior applies here too: if a SpaceTemplate argument is not passed, or it is default(SpaceTemplate), then all tuples (as they get written to the space) are enumerated. Otherwise, only tuples that match the template are enumerated.
ReloadAsync() has been added. This method allows (re)loading the space contents (i.e. tuples) into the agent on demand. It was added as part of the configurable agent startup behavior.

Options Separation

SpaceOptions has been removed and replaced with SpaceClientOptions. In addition, SpaceServerOptions has been added. The AddOrleanSpaces extension method on IClientBuilder now accepts an optional Action<SpaceClientOptions> to configure the client options, whereas AddOrleanSpaces on ISiloBuilder now accepts an optional Action<SpaceClientOptions> and an optional Action<SpaceServerOptions>. This allows configuration to be split by responsibility.

If the client options configurator is not provided on the client, by default the generic agent is configured to run.
If the client options configurator is not provided on the server, by default no agent is configured to run.

SpaceServerOptions currently only contains a single property called PartitioningThreshold. As the name implies, this is related to the space partitioning. It defines the maximum number of tuples that should be stored within a partition, per space kind.

SpaceClientOptions remains almost the same as the old SpaceOptions, but we have added two extra properties: LoadSpaceContentsUponStartup and LoadingStrategy. These relate to the configurable agent startup behavior, and we'll discuss them in more detail later on.
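
As a rough illustration of the split, the fragment below configures one option on each side. The property names come from the text above. The variable names, the threshold value, and the surrounding host setup are assumptions made for the sketch, and the required using directives are omitted because the namespaces are not stated here.

```csharp
// Fragment, not a complete program: assumes the OrleanSpaces package is referenced
// and that `clientBuilder` is an IClientBuilder taken from your own host setup.

// Client side: an optional Action<SpaceClientOptions> configures the agent.
clientBuilder.AddOrleanSpaces(options =>
{
    options.LoadSpaceContentsUponStartup = false; // skip loading tuples at startup
});

// Server side: AddOrleanSpaces on ISiloBuilder accepts both kinds of configurator.
// They are shown as plain delegates here; check the method signature for the
// parameter order before passing them in.
Action<SpaceClientOptions> configureClient =
    options => options.LoadSpaceContentsUponStartup = true;
Action<SpaceServerOptions> configureServer =
    options => options.PartitioningThreshold = 1_000; // illustrative value only
```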

Direct Array Passing

SpaceTuple and SpaceTemplate had a quirk: they didn't allow passing an object[] directly into their constructors as a single argument. The specialized tuples and templates did not have this limitation. This release brings that ability to the generic versions.

This is especially useful when clients build an array of elements in a loop and want to pass it to the constructor, or when they receive an array from a source they cannot control. Overall, this feature makes the generic versions more practical to work with.

OSA003, the analyzer that reports unsupported argument types, has also been adjusted to take this into consideration.

Space Partitioning

This is, in my opinion, the biggest and most impactful change, and probably the one that pushed me to label this release as an official v1. Previously, there was a single grain per space kind that stored all tuples in a list. As the space grew, this inevitably led to rapid performance degradation, because of frequent updates to a large list and the fact that each update involves serializing and persisting the entire dataset.

OrleanSpaces now partitions the space into multiple store grains. Partitioning is controlled via the PartitioningThreshold set on SpaceServerOptions. Whenever the threshold is crossed, a new partition is created, and subsequent tuple writes go into this new partition.

While it's true that existing partitions may drop below the threshold when tuples are removed, we decided against an adaptive approach that would rebalance tuples across the existing partitions. The reasoning is that we would gain nothing from redistributing the tuples: searching is done on the client side, and the removal of a single tuple already carries the information about which partition it came from.
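
The rollover rule and the absence of rebalancing can be sketched with a small in-memory model. This is not the library's implementation: PartitionedSpace and its members are hypothetical names, and plain lists stand in for store grains.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model; in OrleanSpaces each partition is a store grain.
public sealed class PartitionedSpace<T>
{
    private readonly int threshold;
    private readonly List<List<T>> partitions = new() { new List<T>() };

    public PartitionedSpace(int partitioningThreshold)
    {
        if (partitioningThreshold < 1)
            throw new ArgumentOutOfRangeException(nameof(partitioningThreshold));
        threshold = partitioningThreshold;
    }

    public int PartitionCount => partitions.Count;

    // Writes always target the newest partition; the returned index
    // identifies the partition that now owns the tuple.
    public int Write(T tuple)
    {
        if (partitions[^1].Count >= threshold)
            partitions.Add(new List<T>()); // threshold reached: roll over

        partitions[^1].Add(tuple);
        return partitions.Count - 1;
    }

    // A removal addresses the owning partition directly; nothing is rebalanced.
    public bool Remove(int partitionIndex, T tuple) =>
        partitions[partitionIndex].Remove(tuple);

    public int CountIn(int partitionIndex) => partitions[partitionIndex].Count;
}
```

With a threshold of 2, writing five tuples leaves three partitions holding 2, 2, and 1 tuples. Removing a tuple from the first partition leaves it under the threshold, yet the next write still lands in the newest partition.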

In addition, when a partition is cleared of all its tuples, the entity that represents the partition (depending on the configured storage provider) can be wiped out completely (depending on the configuration of Orleans itself), because the store grain that holds the partition is cleared and deactivated on such an occasion.

Clearing the whole space involves clearing each partition, which amounts to a distributed transaction. Orleans supports distributed transactions, but it's an "all-or-nothing" approach, which means that every method on a grain interface needs (in some way, shape, or form) to either create or join a transaction.

We decided against using Orleans' transaction mechanism to avoid a performance hit, and instead employed simplified transaction management on the director grain itself. This kicks in only when a clear of the whole space is requested.

There are no complicated mechanisms behind it: the clearing method is idempotent, so it's safe to call it again in case of partial failures. When a partition (represented by a store grain) that has already been cleared is asked to clear again, there are no side effects, because that grain no longer exists. The director grain retries until all partitions are cleared, at which point it marks its internal transaction as "done".
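
The retry-until-cleared idea can be sketched as follows. This is a simplified model under stated assumptions, not the actual grain code: PartitionStore, Director, ClearInProgress, the simulated failures, and the attempt limit are all hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a store grain that holds one partition.
public sealed class PartitionStore
{
    private readonly List<int> tuples;
    private int failuresLeft;

    public PartitionStore(IEnumerable<int> tuples, int simulatedFailures = 0)
    {
        this.tuples = tuples.ToList();
        failuresLeft = simulatedFailures;
    }

    public int Count => tuples.Count;

    // Idempotent: clearing an already empty partition changes nothing.
    public Task ClearAsync()
    {
        if (failuresLeft > 0)
        {
            failuresLeft--;
            return Task.FromException(new TimeoutException("simulated failure"));
        }

        tuples.Clear();
        return Task.CompletedTask;
    }
}

// Hypothetical stand-in for the director grain.
public sealed class Director
{
    private readonly IReadOnlyList<PartitionStore> partitions;

    public Director(IReadOnlyList<PartitionStore> partitions) =>
        this.partitions = partitions;

    public bool ClearInProgress { get; private set; }

    // Repeats whole rounds until one finishes without a failure.
    // Returns the number of rounds that were needed.
    public async Task<int> ClearAllAsync(int maxAttempts = 5)
    {
        ClearInProgress = true; // internal "transaction" opened

        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            bool allCleared = true;
            foreach (var partition in partitions)
            {
                try { await partition.ClearAsync(); }
                catch (Exception) { allCleared = false; } // retried next round
            }

            if (allCleared)
            {
                ClearInProgress = false; // internal "transaction" done
                return attempt;
            }
        }

        throw new InvalidOperationException("Some partitions could not be cleared.");
    }
}
```

Because ClearAsync is idempotent, re-clearing partitions that succeeded in an earlier round is harmless, which is what lets the director simply repeat the whole round instead of tracking per-partition progress.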

Startup Behavior

As mentioned in the options separation section, SpaceClientOptions now includes two new properties. Combined with the new ReloadAsync method, they allow the client to control the behavior of an agent's startup process.

Loading Upon Startup

LoadSpaceContentsUponStartup is a bool flag. If it is set to false, the agent will not load the space contents (i.e. tuples) upon startup. This is useful in cases where the agent only performs writes, or where the application needs fast startup times. By default, this value is set to true. Space contents can always be (re)loaded via the new ReloadAsync method on the agents.

Loading Strategy

Gathering all tuples at once from the director grain is the more efficient way, since it involves n + 1 calls, where n is the number of partitions (i.e. store grains) and 1 is the call from the agent to the director. These n calls are all made in parallel, but we need to keep in mind the potential size of the whole tuple space: when the number of partitions grows large, this can result in contention for threads and may lead to ThreadPool starvation.

An alternative is for the director to expose a way to load the data in batches, where a batch is defined as the contents of a single partition. With this batching approach, the director calls the store grains one by one and streams back the results, which the agent appends to its in-memory dataset. This does make loading the whole tuple space slower, as there are 2*n calls (1 call to the director + 1 call to a partition, for each of the n partitions), but it avoids potential ThreadPool starvation.

LoadingStrategy is an enum with two options: Sequential and Parallel.

Use Sequential loading if fast loading time is not important and the space is heavily partitioned.
Use Parallel loading if fast loading time is important and the space is not heavily partitioned.
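
The difference between the two strategies can be sketched as below. This is a minimal model rather than the library's code: SpaceLoader and both method names are hypothetical, and each partition is represented by a plain delegate that returns its tuples instead of a store grain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-ins; the real director and store grains are Orleans grains.
public static class SpaceLoader
{
    // Parallel: every partition is read at once and the results are merged.
    public static async Task<List<int>> LoadParallelAsync(
        IReadOnlyList<Func<Task<int[]>>> partitionReaders)
    {
        int[][] batches = await Task.WhenAll(partitionReaders.Select(read => read()));
        return batches.SelectMany(batch => batch).ToList();
    }

    // Sequential: partitions are read one by one, and each partition's
    // contents are streamed back to the caller as a single batch.
    public static async IAsyncEnumerable<int[]> LoadSequentialAsync(
        IReadOnlyList<Func<Task<int[]>>> partitionReaders)
    {
        foreach (var read in partitionReaders)
        {
            yield return await read();
        }
    }
}
```

A caller of the sequential variant would consume it with await foreach and append each batch to its in-memory dataset, so only one partition read is in flight at a time. The parallel variant starts all reads up front, which is faster but puts pressure on the thread pool as the number of partitions grows.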

OSA004

Severity: Info
Category: Performance
Code Fix: Available

Instantiating a SpaceTuple or SpaceTemplate whose arguments are all of the same type should be avoided, because there is an appropriate specialized tuple/template for that type which provides significant performance benefits. OSA004 is an analyzer that detects such usages and informs the user, while providing an automatic code fix.

Below we can see examples of such violations:

var tuple1 = new SpaceTuple(1, 2, 3);
SpaceTuple tuple2 = new(1f, 2f, 3f);
SpaceTuple tuple3 = new SpaceTuple(1d, 2d, 3d);

var template1 = new SpaceTemplate('a', null, 'b');
SpaceTemplate template2 = new(1m, null, 2m);
SpaceTemplate template3 = new SpaceTemplate(null, null, DateTime.MaxValue);

The analyzer picks up all instantiation syntaxes (traditional, simplified, and using 'var'). The following shows how the fixer converts them to the appropriate specialized type.

var tuple1 = new IntTuple(1, 2, 3);
FloatTuple tuple2 = new(1f, 2f, 3f);
DoubleTuple tuple3 = new DoubleTuple(1d, 2d, 3d);

var template1 = new CharTemplate('a', null, 'b');
DecimalTemplate template2 = new(1m, null, 2m);
DateTimeTemplate template3 = new DateTimeTemplate(null, null, DateTime.MaxValue);

If you found this article helpful, please share it in your favorite forums 😉.
The solution project is available on GitHub.