ParaSeq

Parallel execution of Iterables

NAME

ParaSeq - Parallel execution of Iterables

SYNOPSIS

use ParaSeq;

DESCRIPTION

ParaSeq provides the functional equivalent of hyper and race, but re-implemented from scratch with all of the experience from the initial implementation of hyper and in 2014, and using features that have since been added to the Raku Programming Language.

As such it exports two subroutines hyperize and racify, to make them plug-in compatible with the hyperize distribution.

IMPROVEMENTS

Automatic batch size adaptation

One of the main issues with the current implemementation of .hyper and .race in the Rakudo core is that the batch size is fixed. Worse, there is no way to dynamically adapt the batch size depending on the load.

Batch sizes that are too big, have a tendency to not use all of the CPUs (because they have a tendency to eat all of the source items too soon, thus removing the chance to start up more threads).

Batch sizes that are too small, have a tendency to have their resource usage drowned out by the overhead of batching and dispatching to threads.

This implementation aims to adapt batch sizes from the originally (implicitely) specified one for better throughput and resource usage.

Unnecessary parallelization

If the degree specified is 1, then there is no point in batching or parallelization. In that case, this implementation will take itself completely out of the flow.

Alternately, if the initial batch size is large enough to exhaust the source, it is clearly too large. Which is interpreted as not making any sense at parallelization either. So it won't.

Note that the default initial batch size is 10, rather than 64 in the current implementation of .hyper and .race, making the chance smaller that parallelization is abandoned too soon.

Infectiousness

The .serial method or .Seq coercer can be typically be used to "unhyper" a hypered sequence. However many other interface methods do the same in the current implementation of .hyper and .race, thereby giving the impression that the flow is still parallelized. When in fact they aren't anymore.

Also, hyperized sequences in the current implementation are considered to be non-lazy, even if the source is lazy.

This implementation aims to make all interface methods pass on the hypered nature and laziness of the sequence.

Loop control statements

Some loop control statements may affect the final result. Specifically the last statement does. In the current implementation of .hyper and .race, this will only affect the batch in which it occurs.

This implementation aims to make last stop any processing of current and not create anymore batches.

Support more interface methods

Currently only the .map and .grep methods are completely supported by the current implementation of .hyper and .race. Other methods, such as .first, will also be supported.

Use of phasers

When an interface method takes a Callable, then that Callable can contain phasers that may need to be called (or not called) depending on the situation. The current implementation of .hyper and .race do not allow phasers at all.

This implementation aims to support phasers in a sensible manner:

ENTER

Called before each iteration.

FIRST

Called on the first iteration in the first batch.

Called at the end of each iteration.

LAST

Called on the last iteration in the last batch. Note that this can be short-circuited with a last control statement.

LEAVE

Called after each iteration.

AUTHOR

Elizabeth Mattijsen [email protected]

Source can be located at: https://github.com/lizmat/ParaSeq . Comments and Pull Requests are welcome.

If you like this module, or what I’m doing more generally, committing to a small sponsorship would mean a great deal to me!

COPYRIGHT AND LICENSE

This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.

ParaSeq

NAME

SYNOPSIS

DESCRIPTION

IMPROVEMENTS

Automatic batch size adaptation

Unnecessary parallelization

Infectiousness

Loop control statements

Support more interface methods

Use of phasers

ENTER

FIRST

NEXT

LAST

LEAVE

AUTHOR

COPYRIGHT AND LICENSE

ParaSeq v0.0.1

Authors

License

Dependencies

Test Dependencies

Provides

Documentation