ParaSeq
NAME
ParaSeq - Parallel execution of Iterables
SYNOPSIS
use ParaSeq;
DESCRIPTION
ParaSeq provides the functional equivalent of hyper and race, but re-implemented from scratch with all of the experience from the initial implementation of hyper and race in 2014, and using features that have since been added to the Raku Programming Language.
As such it exports two subroutines, hyperize and racify, to make them plug-in compatible with the hyperize distribution.
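As a rough sketch of what "plug-in compatible" could look like in practice: the snippet below assumes that hyperize takes the source as its first argument (so it can be called with the .& syntax) and that the result supports .map, as is the case in the hyperize distribution. The exact signature is not spelled out here, so treat the call shape as an assumption.

```raku
use ParaSeq;   # exports hyperize and racify, as described above

# Assumption: hyperize accepts the iterable as its first (and here only)
# argument, mirroring the calling convention of the hyperize distribution.
my @squares = (1..1000).&hyperize.map(* ** 2);

say @squares.elems;
```

Swapping hyperize for racify would, under the same assumption, give up ordering guarantees in exchange for potentially earlier results.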
IMPROVEMENTS
Automatic batch size adaptation
One of the main issues with the current implementation of .hyper and .race in the Rakudo core is that the batch size is fixed. Worse, there is no way to dynamically adapt the batch size depending on the load.
Batch sizes that are too big tend not to use all of the CPUs, because they eat all of the source items too soon and thus remove the chance to start up more threads.
Batch sizes that are too small tend to have their resource usage drowned out by the overhead of batching and dispatching to threads.
This implementation aims to adapt batch sizes from the originally (implicitly) specified one for better throughput and resource usage.
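For comparison, this is how the batch size is fixed up front with the core .hyper method. The batch and degree named arguments are part of core Raku; the values 64 and 4 and the variable name are only illustrative.

```raku
# Core Rakudo: batch size and degree are chosen once and stay fixed
# for the whole run, whatever the load turns out to be.
my @doubled = (1..1000).hyper(batch => 64, degree => 4).map(* * 2);

say @doubled.elems;    # 1000
say @doubled[^3];      # (2 4 6)
```

With a batch of 64 and 1000 source items, the work is split into at most 16 batches regardless of how long each item takes to process.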
Unnecessary parallelization
If the degree specified is 1, then there is no point in batching or parallelization. In that case, this implementation will take itself completely out of the flow.
Alternately, if the initial batch size is large enough to exhaust the source, it is clearly too large. This is interpreted as parallelization not making any sense either, so the sequence will not be parallelized.
Note that the default initial batch size is 10, rather than the 64 used in the current implementation of .hyper and .race, making the chance smaller that parallelization is abandoned too soon.
Infectiousness
The .serial method or .Seq coercer can typically be used to "unhyper" a hypered sequence. However, many other interface methods do the same in the current implementation of .hyper and .race, thereby giving the impression that the flow is still parallelized when in fact it isn't anymore.
Also, hyperized sequences in the current implementation are considered to be non-lazy, even if the source is lazy.
This implementation aims to make all interface methods pass on the hypered nature and laziness of the sequence.
Loop control statements
Some loop control statements may affect the final result. Specifically, the last statement does. In the current implementation of .hyper and .race, this will only affect the batch in which it occurs.
This implementation aims to make last stop any processing of current batches and not create any more batches.
Support more interface methods
Currently only the .map and .grep methods are completely supported by the current implementation of .hyper and .race. Other methods, such as .first, will also be supported.
Use of phasers
When an interface method takes a Callable, then that Callable can contain phasers that may need to be called (or not called) depending on the situation. The current implementation of .hyper and .race does not allow phasers at all.
This implementation aims to support phasers in a sensible manner:
ENTER
Called before each iteration.
FIRST
Called on the first iteration in the first batch.
NEXT
Called at the end of each iteration.
LAST
Called on the last iteration in the last batch. Note that this can be short-circuited with a last control statement.
LEAVE
Called after each iteration.
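To make the intended phaser semantics concrete, here is a small sequential illustration using a plain .map (no parallelization involved). It only counts how often each phaser fires; the %fired hash and the input range are made up for the example. The aim described above is that a hyperized .map would produce the same counts across all batches.

```raku
# Sequential illustration only: count how often each phaser fires
# when the block is run by an ordinary, non-parallel .map.
my %fired;

my @out = (1..5).map: {
    ENTER %fired<ENTER>++;   # before each iteration
    FIRST %fired<FIRST>++;   # first iteration only
    NEXT  %fired<NEXT>++;    # at the end of each iteration
    LAST  %fired<LAST>++;    # last iteration only
    LEAVE %fired<LEAVE>++;   # after each iteration
    $_ * 2
}

say @out;
say %fired;
```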
AUTHOR
Elizabeth Mattijsen <[email protected]>
Source can be located at: https://github.com/lizmat/ParaSeq . Comments and Pull Requests are welcome.
If you like this module, or what I'm doing more generally, committing to a small sponsorship would mean a great deal to me!
COPYRIGHT AND LICENSE
Copyright 2024 Elizabeth Mattijsen
This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.