Traps to avoid

Traps to avoid when getting started with Raku

When learning a programming language, possibly while already being familiar with another one, there are always some things that can surprise you and might cost valuable time in debugging and discovery.

This document aims to show common misconceptions in order to avoid them.

During the making of Raku great pains were taken to get rid of warts in the syntax. When you whack one wart, though, sometimes another pops up. So a lot of time was spent finding the minimum number of warts or trying to put them where they would rarely be seen. Because of this, Raku's warts are in different places than you may expect them to be when coming from another language.

Variables and constants

Constants are computed at compile time

Constants are computed at compile time, so if you use them in modules keep in mind that their values will be frozen due to precompilation of the module itself:

# WRONG (most likely):
unit module Something::Or::Other;
constant $config-file = "config.txt".IO.slurp;

The $config-file will be slurped during precompilation, and changes to the config.txt file won't be reloaded when you start the script again; only when the module is recompiled.

Avoid a constant here and prefer binding a value to a variable: that offers behavior similar to a constant, while still allowing the value to be updated on each run:

# Good; $config-file is read from the 'config.txt' file on each script run:
unit module Something::Or::Other;
my $config-file := "config.txt".IO.slurp;

Assignment of Nil can produce a different value, usually Any

Actually, assignment of Nil to a variable reverts the variable to its default value. For example,

my @a = 4, 8, 15, 16;
@a[2] = Nil;
say @a; # OUTPUT: «[4 8 (Any) 16]␤»

In this case, Any is the default value of an Array element.

You can purposefully assign Nil as a default value:

my %h is default(Nil) = a => Nil;
say %h; # OUTPUT: «Hash %h = {:a(Nil)}␤»

Or bind a value to Nil if that is the result you want:

@a[3] := Nil;
say @a; # OUTPUT: «[4 8 (Any) Nil]␤»

This trap might be hidden in the result of functions, such as matches:

my $result2 = 'abcdef' ~~ / dex /;
say "Result2 is { $result2.^name }"; # OUTPUT: Ā«Result2 is Anyā¤Ā»

A match will be Nil if it finds nothing; however, assigning Nil to $result2 above results in its default value, which is Any, as shown.
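
If you want to keep the Nil itself, bind instead of assigning; a minimal sketch ($result3 is just an illustrative name):

my $result3 := 'abcdef' ~~ / dex /;
say "Result3 is { $result3.^name }"; # OUTPUT: «Result3 is Nil␤»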

Using a block to interpolate anon state vars

The programmer intended for the code to count the number of times the routine is called, but the counter is not increasing:

sub count-it { say "Count is {$++}" }
count-it;
count-it;
# OUTPUT:
# Count is 0
# Count is 0

When it comes to state variables, the block in which the vars are declared gets cloned (and the vars get initialized anew) whenever that block's outer block is re-entered. This lets constructs like the one below behave appropriately; the state variable inside the loop gets initialized anew each time the sub is called:

sub count-it {
    for ^3 {
        state $count = 0;
        say "Count is $count";
        $count++;
    }
}
count-it;
say "ā€¦and againā€¦";
count-it;
# OUTPUT:
# Count is 0
# Count is 1
# Count is 2
# ā€¦and againā€¦
# Count is 0
# Count is 1
# Count is 2

The same layout exists in our buggy program. The { } inside a double-quoted string isn't merely an interpolation to execute a piece of code; it's actually its own block, which, just as in the example above, gets cloned each time the sub is entered, re-initializing our state variable. To get the right count, we need to get rid of that inner block, using a scalar contextualizer to interpolate our piece of code instead:

sub count-it { say "Count is $($++)" }
count-it;
count-it;
# OUTPUT:
# Count is 0
# Count is 1

Alternatively, you can also use the concatenation operator instead:

sub count-it { say "Count is " ~ $++ }

Using set subroutines on Associative when the value is falsy

Using (cont), ∋, ∌, (elem), ∈, or ∉ on classes implementing Associative will return False if the value of the key is falsy:

enum Foo «a b»;
say Foo.enums ∋ 'a';

# OUTPUT:
# False
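
The reason is that Foo.enums maps the enum keys to their values, and the value stored for a is 0, which is falsy; a short illustration (the last line assumes the value-truthiness behavior described above):

enum Foo «a b»;
say Foo.enums<a>;    # OUTPUT: «0␤» (falsy, so the element checks above report False)
say Foo.enums<b>;    # OUTPUT: «1␤» (truthy)
say Foo.enums ∋ 'b'; # OUTPUT: «True␤»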

Instead, use :exists:

enum Foo «a b»;
say Foo.enums<a>:exists;

# OUTPUT:
# True

Blocks

Beware of empty "blocks"

Curly braces are used to declare blocks. However, empty curly braces will declare a hash.

$ = {say 42;} # Block
$ = {;}       # Block
$ = {…}       # Block
$ = { }       # Hash

You can use the second form if you effectively want to declare an empty block:

my &does-nothing = {;};
say does-nothing(33); # OUTPUT: «Nil␤»

Objects

Assigning to attributes

Newcomers often think that, because attributes with accessors are declared as has $.x, they can assign to $.x inside the class. That's not the case.

For example

class Point {
    has $.x;
    has $.y;
    method double {
        $.x *= 2;   # WRONG
        $.y *= 2;   # WRONG
        self;
    }
}

say Point.new(x => 1, y => -2).double.x
# OUTPUT: «Cannot assign to an immutable value␤»

The first line inside the method double is marked with # WRONG because $.x, short for $( self.x ), is a call to a read-only accessor.

The syntax has $.x is short for something like has $!x; method x() { $!x }, so the actual attribute is called $!x, and a read-only accessor method is automatically generated.

Thus the correct way to write the method double is

method double {
    $!x *= 2;
    $!y *= 2;
    self;
}

which operates on the attributes directly.
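
If you actually want the accessors themselves to be writable, a different design is to declare the attributes with the is rw trait; a minimal sketch, keeping in mind that this also lets code outside the class assign to them:

class Point {
    has $.x is rw;
    has $.y is rw;
    method double {
        $.x *= 2;   # fine: the rw accessor returns a writable container
        $.y *= 2;
        self;
    }
}
say Point.new(x => 1, y => -2).double.x; # OUTPUT: «2␤»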

BUILD prevents automatic attribute initialization from constructor arguments

When you define your own BUILD submethod, you must take care of initializing all attributes by yourself. For example

class A {
    has $.x;
    has $.y;
    submethod BUILD {
        $!y = 18;
    }
}

say A.new(x => 42).x;       # OUTPUT: «Any␤»

leaves $!x uninitialized, because the custom BUILD doesn't initialize it.

Note: Consider using TWEAK instead. Rakudo has supported the TWEAK submethod since release 2016.11.

One possible remedy is to explicitly initialize the attribute in BUILD:

submethod BUILD(:$x) {
    $!y = 18;
    $!x := $x;
}

which can be shortened to:

submethod BUILD(:$!x) {
    $!y = 18;
}
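
Following the note above, here is a minimal sketch of the TWEAK-based approach, which leaves the automatic initialization of $!x from the constructor arguments intact:

class A {
    has $.x;
    has $.y;
    submethod TWEAK {
        $!y = 18;
    }
}
say A.new(x => 42).x; # OUTPUT: «42␤»
say A.new(x => 42).y; # OUTPUT: «18␤»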

Whitespace

Whitespace in regexes does not match literally

say 'a b' ~~ /a b/; # OUTPUT: «False␤»

Whitespace in regexes is, by default, considered an optional filler without semantics, just like in the rest of the Raku language.

Ways to match whitespace:

  • \s to match any one whitespace, \s+ to match at least one

  • ' ' (a blank in quotes) to match a single blank

  • \t, \n for specific whitespace (tab, newline)

  • \h, \v for horizontal, vertical whitespace

  • <.ws>, a built-in rule for whitespace that oftentimes does what you actually want it to do

  • with m:s/a b/ or m:sigspace/a b/, the blank in the regexes matches arbitrary whitespace
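
A few of these alternatives in action, as a quick sketch:

say 'a b' ~~ / a \s+ b /;  # OUTPUT: «「a b」␤»
say 'a b' ~~ / a ' ' b /;  # OUTPUT: «「a b」␤»
say 'a b' ~~ m:s/ a b /;   # OUTPUT: «「a b」␤»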

Ambiguities in parsing

While some languages will let you get away with removing as much whitespace between tokens as possible, Raku is less forgiving. The overarching mantra is we discourage code golf, so don't scrimp on whitespace (the more serious underlying reason behind these restrictions is single-pass parsing and ability to parse Raku programs with virtually no backtracking).

The common areas you should watch out for are:

Block vs. Hash slice ambiguity

# WRONG; trying to hash-slice a Bool:
while ($++ > 5){ .say }
# RIGHT:
while ($++ > 5) { .say }

# EVEN BETTER; Raku does not require parentheses there:
while $++ > 5 { .say }

Reduction vs. Array constructor ambiguity

# WRONG; ambiguity with `[<]` metaop:
my @a = [[<foo>],];
# RIGHT; reductions cannot have spaces in them, so put one in:
my @a = [[ <foo>],];

# No ambiguity here, natural spaces between items suffice to resolve it:
my @a = [[<foo bar ber>],];

Less than vs. Word quoting/Associative indexing

# WRONG; trying to index 3 associatively:
say 3<5>4
# RIGHT; prefer some extra whitespace around infix operators:
say 3 < 5 > 4

Exclusive sequences vs. sequences with Ranges

See the section on operator traps for more information about how the ...^ operator can be mistaken for the ... operator with a ^ operator immediately following it. You must use whitespace correctly to indicate which interpretation will be followed.

Captures

Containers versus values in a capture

Beginners might expect a variable in a Capture to supply its current value when that Capture is later used. For example:

my $a = 2; say join ",", ($a, ++$a);  # OUTPUT: Ā«3,3ā¤Ā»

Here the Capture contained the container pointed to by $a and the value of the result of the expression ++$a. Since the Capture must be reified before &say can use it, the ++$a may happen before &say looks inside the container in $a (and before the List is created with the two terms) and so it may already be incremented.

Instead, use an expression that produces a value when you want a value.

my $a = 2; say join ",", (+$a, ++$a); # OUTPUT: Ā«2,3ā¤Ā»

Or even simpler

my $a = 2; say  "$a, {++$a}"; # OUTPUT: Ā«2, 3ā¤Ā»

The same happens in this case:

my @arr;
my ($a, $b) = (1,1);
for ^5 {
    ($a,$b) = ($b, $a+$b);
    @arr.push: ($a, $b);
    say @arr
};

This outputs «[(1 2)]␤[(2 3) (2 3)]␤[(3 5) (3 5) (3 5)]␤…». $a and $b are not reified until say is called; the value they hold at that precise moment is the one printed. To avoid that, decontainerize the values or take them out of the variables in some way before using them.

my @arr;
my ($a, $b) = (1,1);
for ^5 {
    ($a,$b) = ($b, $a+$b);
    @arr.push: ($a.item, $b.item);
    say @arr
};

With item, the container will be evaluated in item context, its value extracted, and the desired outcome achieved.

Cool tricks

Raku includes a Cool class, which provides some of the DWIM behaviors we got used to by coercing arguments when necessary. However, DWIM is never perfect. Especially with Lists, which are Cool, there are many methods that will not do what you probably think they do, including contains, starts-with or index. Please see some examples in the section below.

Strings are not Lists, so beware indexing

In Raku, strings (Strs) are not lists of characters. One cannot iterate over them or index into them as you can with Lists, despite the name of the .index method.

Lists become strings, so beware .index()ing

List inherits from Cool, which provides access to .index. Because of the way .index coerces a List into a Str, this can sometimes appear to be returning the index of an element in the list, but that is not how the behavior is defined.

my @a = <a b c d>;
say @a.index('a');    # OUTPUT: «0␤»
say @a.index('c');    # OUTPUT: «4␤» -- not 2!
say @a.index('b c');  # OUTPUT: «2␤» -- not undefined!
say @a.index(<a b>);  # OUTPUT: «0␤» -- not undefined!

These same caveats apply to .rindex.

Lists become strings, so beware .contains()

Similarly, .contains does not look for elements in the list.

my @menu = <hamburger fries milkshake>;
say @menu.contains('hamburger');            # OUTPUT: «True␤»
say @menu.contains('hot dog');              # OUTPUT: «False␤»
say @menu.contains('milk');                 # OUTPUT: «True␤»!
say @menu.contains('er fr');                # OUTPUT: «True␤»!
say @menu.contains(<es mi>);                # OUTPUT: «True␤»!

If you actually want to check for the presence of an element, use the (cont) operator for single elements, and the superset ((>=), ⊇) and strict superset ((>), ⊃) operators for multiple elements.

my @menu = <hamburger fries milkshake>;
say @menu (cont) 'fries';                   # OUTPUT: «True␤»
say @menu (cont) 'milk';                    # OUTPUT: «False␤»
say @menu (>) <hamburger fries>;            # OUTPUT: «True␤»
say @menu (>) <milkshake fries>;            # OUTPUT: «True␤» (! NB: order doesn't matter)

If you are doing a lot of element testing, you may be better off using a Set.
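
For example, a minimal sketch of the Set-based approach:

my $menu = <hamburger fries milkshake>.Set;
say $menu<fries>;              # OUTPUT: «True␤»
say $menu<milk>;               # OUTPUT: «False␤»
say $menu ⊇ <hamburger fries>; # OUTPUT: «True␤»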

Numeric literals are parsed before coercion

Experienced programmers will probably not be surprised by this, but Numeric literals will be parsed into their numeric value before being coerced into a string, which may create nonintuitive results.

say 0xff.contains(55);      # OUTPUT: Ā«Trueā¤Ā»
say 0xff.contains(0xf);     # OUTPUT: Ā«Falseā¤Ā»
say 12_345.contains("23");  # OUTPUT: Ā«Trueā¤Ā»
say 12_345.contains("2_");  # OUTPUT: Ā«Falseā¤Ā»

Getting a random item from a List

A common task is to retrieve one or more random elements from a collection, but List.rand isn't the way to do that. Cool provides rand, but that first coerces the List into the number of items in the list, and returns a random real number between 0 and that value. To get random elements, see pick and roll.

my @colors = <red orange yellow green blue indigo violet>;
say @colors.rand;       # OUTPUT: Ā«2.21921955680514ā¤Ā»
say @colors.pick;       # OUTPUT: Ā«orangeā¤Ā»
say @colors.roll;       # OUTPUT: Ā«blueā¤Ā»
say @colors.pick(2);    # OUTPUT: Ā«(yellow violet)ā¤Ā»  (cannot repeat)
say @colors.roll(3);    # OUTPUT: Ā«(red green red)ā¤Ā»  (can repeat)

Lists numify to their number of elements in numeric context

You want to check whether a number is divisible by any of a set of numbers:

say 42 %% <11 33 88 55 111 20325>; # OUTPUT: Ā«Trueā¤Ā»

What? None of those numbers divides 42 evenly. However, that list has 6 elements and numifies to 6 in numeric context, and 42 is divisible by 6; that's why the output is True. In this case, you should turn the List into a Junction:

say 42 %% <11 33 88 55 111 20325>.any;
# OUTPUT: «any(False, False, False, False, False, False)␤»

which clearly reveals that 42 is divisible by none of the numbers in the list, each of which is now numified separately.

Arrays

Referencing the last element of an array

In some languages one could reference the last element of an array by asking for the "-1th" element of the array, e.g.:

my @array = qw{victor alice bob charlie eve};
say @array[-1];    # OUTPUT: Ā«eveā¤Ā»

In Raku it is not possible to use negative subscripts; however, the same thing is achieved by using a function, namely *-1. Thus, accessing the last element of an array becomes:

my @array = qw{victor alice bob charlie eve};
say @array[*-1];   # OUTPUT: Ā«eveā¤Ā»

Yet another way is to utilize the array's tail method:

my @array = qw{victor alice bob charlie eve};
say @array.tail;      # OUTPUT: Ā«eveā¤Ā»
say @array.tail(2);   # OUTPUT: Ā«(charlie eve)ā¤Ā»

Typed array parameters

Quite often new users will happen to write something like:

sub foo(Array @a) { ... }

...before they have gotten far enough in the documentation to realize that this is asking for an Array of Arrays. To say that @a should only accept Arrays, use instead:

sub foo(@a where Array) { ... }

It is also common to expect this to work, when it does not:

sub bar(Int @a) { 42.say };
bar([1, 2, 3]);             # expected Positional[Int] but got Array

The problem here is that [1, 2, 3] is not an Array[Int], it is a plain old Array that just happens to have Ints in it. To get it to work, the argument must also be an Array[Int].

my Int @b = 1, 2, 3;
bar(@b);                    # OUTPUT: Ā«42ā¤Ā»
bar(Array[Int].new(1, 2, 3));

This may seem inconvenient, but on the upside it moves the type-check on what is assigned to @b to where the assignment happens, rather than requiring every element to be checked on every call.

Using « » quoting when you don't need it

This trap can be seen in different varieties. Here are some of them:

my $x = ‘hello’;
my $y = ‘foo bar’;

my %h = $x => 42, $y => 99;
say %h«$x»;   # ← WRONG; assumption that $x has no whitespace
say %h«$y»;   # ← WRONG; splits ‘foo bar’ by whitespace
say %h«"$y"»; # ← KINDA OK; it works but there is no good reason to do that
say %h{$y};   # ← RIGHT; this is what should be used

run «touch $x»;        # ← WRONG; assumption that only one file will be created
run «touch $y»;        # ← WRONG; will touch files ‘foo’ and ‘bar’
run «touch "$y"»;      # ← WRONG; better, but has a different issue if $y starts with -
run «touch -- "$y"»;   # ← KINDA OK; it works but there is no good enough reason to do that
run ‘touch’, ‘--’, $y; # ← RIGHT; explicit and *always* correct
run <touch -->, $y;    # ← RIGHT; < > are OK, this is short and correct

Basically, « » quoting is only safe to use if you remember to always quote your variables. The problem is that it inverts the default behavior to the unsafe variant, so just by forgetting some quotes you risk introducing either a bug or maybe even a security hole. To stay on the safe side, refrain from using « ».

Strings

Some problems that might arise when dealing with Strs.

Quotes and interpolation

Interpolation in string literals can be too clever for your own good.

# "HTML tags" interpreted as associative indexing:
"$foo<html></html>" eq
"$foo{'html'}{'/html'}"
# Parentheses interpreted as call with argument:
"$foo(" ~ @args ~ ")" eq
"$foo(' ~ @args ~ ')"

You can avoid those problems by using non-interpolating single quotes and switching to more liberal interpolation with the \qq[] escape sequence:

my $a = 1;
say '\qq[$a]()$b()';
# OUTPUT: «1()$b()␤»

Another alternative is to use the Q:c quoter, and use code blocks {} for all interpolation:

my $a = 1;
say Q:c«{$a}()$b()»;
# OUTPUT: «1()$b()␤»

Beware of variables used within qqx

Variables within qqx[] can introduce a security hole; the variable content can be set to a well-crafted string that executes arbitrary code:

my $world = "there\";rm -rf /path/to/dir\"";
say qqx{echo "hello $world"};
# OUTPUT: «hello there␤»

The above code will also delete /path/to/dir. You can avoid this problem by making sure the variable content does not contain shell metacharacters, or, better, by using run or Proc::Async to execute external commands without going through the shell.
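
For instance, a minimal sketch using run, which bypasses the shell entirely so the variable's content is passed as plain data:

my $world = "there\";rm -rf /path/to/dir\"";
my $proc = run 'echo', "hello $world", :out;
print $proc.out.slurp(:close); # prints the text verbatim; nothing gets executed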

Strings are not iterable

There are methods that Str inherits from Any that work on iterables like lists. Iterators on strings contain one element that is the whole string. To use list-based methods like sort, reverse, you need to convert the string into a list first.

say "cba".sort;              # OUTPUT: Ā«(cba)ā¤Ā»
say "cba".comb.sort.join;    # OUTPUT: Ā«abcā¤Ā»

.chars gets the number of graphemes, not codepoints

In Raku, .chars returns the number of graphemes, or user-visible characters. These graphemes could be made up of, for example, a letter plus an accent. If you need the number of codepoints, you should use .codes. If you need the number of bytes when the string is encoded as UTF-8, you should use .encode.bytes.

say "\c[LATIN SMALL LETTER J WITH CARON, COMBINING DOT BELOW]"; # OUTPUT: Ā«Ē°Ģ£ā¤Ā»
    say 'Ē°Ģ£'.codes;        # OUTPUT: Ā«2ā¤Ā»
    say 'Ē°Ģ£'.chars;        # OUTPUT: Ā«1ā¤Ā»
    say 'Ē°Ģ£'.encode.bytes; # OUTPUT: Ā«4ā¤Ā»

For more information on how strings work in Raku, see the Unicode page.

All text is normalized by default

Raku normalizes all text into Unicode NFC form (Normalization Form Canonical). Filenames are the only text not normalized by default. If you are expecting your strings to maintain a byte-for-byte representation of the original, you need to use the UTF8-C8 encoding when reading from or writing to any filehandles.
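
A minimal sketch of opening a handle with that encoding (the filename is only an example):

my $fh = open 'data.bin', :enc<utf8-c8>;
my $raw = $fh.slurp;
$fh.close;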

Allomorphs generally follow numeric semantics

Str "0" is True, while Numeric is False. So what's the Bool value of allomorph <0>?

In general, allomorphs follow Numeric semantics, so the ones that numerically evaluate to zero are False:

say so   <0>; # OUTPUT: «False␤»
say so <0e0>; # OUTPUT: «False␤»
say so <0.0>; # OUTPUT: «False␤»

To force the comparison to be done on the Stringy part of the allomorph, use the prefix ~ operator or the Str method to coerce the allomorph to Str, or use the chars routine to test whether the allomorph has any length:

say so      ~<0>;     # OUTPUT: «True␤»
say so       <0>.Str; # OUTPUT: «True␤»
say so chars <0>;     # OUTPUT: «True␤»

Case-insensitive comparison of strings

In order to do case-insensitive comparison, you can use .fc (fold-case). The problem is that people tend to use .lc or .uc, which does seem to work within the ASCII range but fails on other characters. This is not just a Raku trap; the same applies to other languages.

say ‘groß’.lc eq ‘GROSS’.lc; # ← WRONG; False
say ‘groß’.uc eq ‘GROSS’.uc; # ← WRONG; True, but that's just luck
say ‘groß’.fc eq ‘GROSS’.fc; # ← RIGHT; True

If you are working with regexes, then there is no need to use .fc; you can use the :i (:ignorecase) adverb instead.

Pairs

Constants on the left-hand side of pair notation

Consider this code:

enum Animals <Dog Cat>;
my %h := :{ Dog => 42 };
say %h{Dog}; # OUTPUT: «(Any)␤»

The :{ … } syntax is used to create object hashes. The intention of whoever wrote that code was to create a hash with Enum objects as keys (and say %h{Dog} attempts to get a value using the Enum object to perform the lookup). However, that's not how pair notation works.

For example, in Dog => 42 the key will be a Str. That is, it doesn't matter if there is a constant, or an enumeration with the same name. The pair notation will always use the left-hand side as a string literal, as long as it looks like an identifier.

To avoid this, use (Dog) => 42 or ::Dog => 42.

Scalar values within Pair

When dealing with Scalar values, the Pair holds the container of the value. This means that changes to the Scalar value made outside the Pair are visible through it:

my $v = 'value A';
my $pair = Pair.new( 'a', $v );
$pair.say;  # OUTPUT: a => value A

$v = 'value B';
$pair.say; # OUTPUT: a => value B

Use the method freeze to force the removal of the Scalar container from the Pair. For more details see the documentation about Pair.
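
A quick sketch of freeze in action:

my $v = 'value A';
my $pair = Pair.new( 'a', $v );
$pair.freeze;
$v = 'value B';
$pair.say; # OUTPUT: «a => value A␤»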

Sets, bags and mixes

Sets, bags and mixes do not have a fixed order

When iterating over these kinds of objects, the order is not defined.

my $set = <a b c>.Set;
.say for $set.list; # OUTPUT: Ā«a => Trueā¤c => Trueā¤b => Trueā¤Ā»
# OUTPUT: Ā«a => Trueā¤c => Trueā¤b => Trueā¤Ā»
# OUTPUT: Ā«c => Trueā¤b => Trueā¤a => Trueā¤Ā»

Every iteration might (and will) yield a different order, so you cannot rely on a particular sequence of the elements of a set. If order does not matter, just use them that way. If it does, use sort:

my $set = <a b c>.Set;
.say for $set.list.sort;  # OUTPUT: «a => True␤b => True␤c => True␤»

In general, sets, bags and mixes are unordered, so you should not depend on them having a particular order.

Operators

Some operators commonly shared among other languages were repurposed in Raku for other, more common, things:

Junctions

The ^, |, and & operators are not bitwise operators; they create Junctions. The corresponding bitwise operators in Raku are +^, +|, +& for integers and ?^, ?|, ?& for Bools.
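
A short illustration of the difference:

say 2 | 4;        # OUTPUT: «any(2, 4)␤» (a Junction, not a bitwise OR)
say 2 +| 4;       # OUTPUT: «6␤» (integer bitwise OR)
say True ?^ True; # OUTPUT: «False␤» (Boolean XOR)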

Exclusive sequence operator

Lavish use of whitespace helps readability, but keep in mind that infix operators cannot have any whitespace in them. One such operator is the sequence operator that excludes its right end point: ...^ (or its Unicode equivalent …^).

say 1... ^5; # OUTPUT: «(1 0 1 2 3 4)␤»
say 1...^5;  # OUTPUT: «(1 2 3 4)␤»

If you place whitespace between the ellipsis (…) and the caret (^), it's no longer a single infix operator, but the infix inclusive sequence operator (…) followed by the prefix Range operator (^). Iterables are valid endpoints for the sequence operator, so the result you'll get might not be what you expected.

String ranges/Sequences

In some languages, using strings as range end points considers the entire string when figuring out what the next string should be, loosely treating the strings as numbers in a large base. Here's the Perl version:

say join ", ", "az".."bc";
# OUTPUT: «az, ba, bb, bc␤»

Such a range in Raku will produce a different result, where each letter will be ranged to a corresponding letter in the end point, producing more complex sequences:

say join ", ", "az".."bc";
#`{ OUTPUT: «
    az, ay, ax, aw, av, au, at, as, ar, aq, ap, ao, an, am, al, ak, aj, ai, ah,
    ag, af, ae, ad, ac, bz, by, bx, bw, bv, bu, bt, bs, br, bq, bp, bo, bn, bm,
    bl, bk, bj, bi, bh, bg, bf, be, bd, bc
␤»}
say join ", ", "r2".."t3";
# OUTPUT: «r2, r3, s2, s3, t2, t3␤»

To achieve simpler behavior, similar to the Perl example above, use a sequence operator that calls .succ method on the starting string:

say join ", ", ("az", *.succ ... "bc");
# OUTPUT: «az, ba, bb, bc␤»

Topicalizing operators

The smartmatch operator ~~ and andthen set the topic $_ to their left-hand-side. In conjunction with implicit method calls on the topic this can lead to surprising results.

my &method = { note $_; $_ };
$_ = 'object';
say .&method;
# OUTPUT: Ā«objectā¤objectā¤Ā»
say 'topic' ~~ .&method;
# OUTPUT: Ā«topicā¤Trueā¤Ā»

In many cases flipping the method call to the LHS will work.

my &method = { note $_; $_ };
$_ = 'object';
say .&method;
# OUTPUT: Ā«objectā¤objectā¤Ā»
say .&method ~~ 'topic';
# OUTPUT: Ā«objectā¤Falseā¤Ā»

Fat arrow and constants

The fat arrow operator => will turn words on its left-hand side to Str without checking the scope for constants or \-sigiled variables. Use explicit scoping to get what you mean.

constant V = 'x';
my %h = V => 'oi‽', ::V => 42;
say %h.raku
# OUTPUT: «{:V("oi‽"), :x(42)}␤»

Infix operator assignment

Infix operators, both built in and user defined, can be combined with the assignment operator as this addition example demonstrates:

my $x = 10;
$x += 20;
say $x;     # OUTPUT: «30␤»

For any given infix operator op, L op= R is equivalent to L = L op R (where L and R are the left and right arguments, respectively). This means that the following code may not behave as expected:

my @a = 1, 2, 3;
@a += 10;
say @a;  # OUTPUT: «[13]␤»

Coming from a language like C++, this might seem odd. It is important to bear in mind that += isn't defined as a method on the left-hand argument (here the @a array) but is simply shorthand for:

my @a = 1, 2, 3;
@a = @a + 10;
say @a;  # OUTPUT: «[13]␤»

Here @a is assigned the result of adding @a (which has three elements) and 10; 13 is therefore placed in @a.

Use the hyper form of the assignment operators instead:

my @a = 1, 2, 3;
@a »+=» 10;
say @a;  # OUTPUT: «[11 12 13]␤»

Method calls do not chain

An exception to the clarification L = L op R above occurs when infix operator assignment is used in conjunction with the method call operator a la L .= R. In this case, only the first method in any chain is applied in the assignment.

my $s = "abcd";
    say $s .= uc.lc; # OUTPUT: abcd
    say $s;          # OUTPUT: ABCD

By separating the chain into individual infix operator assignments, you can achieve the desired effect:

my $s = "abcd";
    say $s .= uc .= lc; # OUTPUT: abcd
    say $s;             # OUTPUT: abcd

Regexes

$x vs <$x>, and $(code) vs <{code}>

Raku offers several constructs to generate regexes at runtime through interpolation (see their detailed description here). When a regex generated this way contains only literals, the above constructs behave (pairwise) identically, as if they are equivalent alternatives. As soon as the generated regex contains metacharacters, however, they behave differently, which may come as a confusing surprise.

The first two constructs that may easily be confused with each other are $variable and <$variable>:

my $variable = 'camelia';
say ‘I ♥ camelia’ ~~ /  $variable  /;   # OUTPUT: 「camelia」
say ‘I ♥ camelia’ ~~ / <$variable> /;   # OUTPUT: 「camelia」

Here they act the same because the value of $variable consists of literals. But when the variable is changed to comprise regex metacharacters the outputs become different:

my $variable = '#camelia';
say ‘I ♥ #camelia’ ~~ /  $variable  /;   # OUTPUT: «「#camelia」␤»
say ‘I ♥ #camelia’ ~~ / <$variable> /;   # !! Error: malformed regex

What happens here is that the string #camelia contains the metacharacter #. In the context of a regex, this character should be quoted to match literally; without quoting, the # is parsed as the start of a comment that runs until the end of the line, which in turn causes the regex not to be terminated, and thus to be malformed.

Two other constructs that must similarly be distinguished from one another are $(code) and <{code}>. Like before, as long as the (stringified) return value of code comprises only literals, there is no distinction between the two:

my $variable = 'ailemac';
say ‘I ♥ camelia’ ~~ / $($variable.flip)   /;   # OUTPUT: «「camelia」␤»
say ‘I ♥ camelia’ ~~ / <{$variable.flip}>  /;   # OUTPUT: «「camelia」␤»

But when the return value is changed to comprise regex metacharacters, the outputs diverge:

my $variable = 'ailema.';
say ‘I ♥ camelia’ ~~ / $($variable.flip)   /;   # OUTPUT: Nil
say ‘I ♥ camelia’ ~~ / <{$variable.flip}>  /;   # OUTPUT: «「camelia」␤»

In this case the return value of the code is the string .amelia, which contains the metacharacter .. The above attempt by $(code) to match the dot literally fails; the attempt by <{code}> to match the dot as a regex wildcard succeeds. Hence the different outputs.

| vs ||: which branch will win

To match one of several possible alternatives, || or | is used. However, the two are quite different.

When there are multiple matching alternations, for those separated by ||, the first matching alternation wins; for those separated by |, the winner is decided by the longest token matching (LTM) strategy. See also: documentation on || and documentation on |.

For simple regexes, just using || instead of | will give you familiar semantics, but if you are writing grammars, then it's useful to learn about LTM and declarative prefixes and to prefer |. Avoid mixing the two in one regex; when you have to, add parentheses and make sure you know how the LTM strategy works, so the code does what you want.

The trap typically arises when you try to mix both | and || in the same regex:

say 42 ~~ / [  0 || 42 ] | 4/; # OUTPUT: «「4」␤»
say 42 ~~ / [ 42 ||  0 ] | 4/; # OUTPUT: «「42」␤»

The code above may seem like it is producing a wrong result, but the implementation is actually right.

$/ changes each time a regular expression is matched

Each time a regular expression is matched against something, the special variable $/ holding the resulting Match object is changed according to the result of the match (which could also be Nil).

The $/ is changed without any regard to the scope the regular expression is matched within.

For further information and examples please see the related section in the Regular Expressions documentation.

<foo> vs. < foo>: named rules vs. quoted lists

Regexes can contain quoted lists; longest token matching is performed on the list's elements as if a | alternation had been specified (see here for further information).

Within a regex, the following are lists with a single item, 'foo':

say 'foo' ~~ /< foo >/;  # OUTPUT: «「foo」␤»
say 'foo' ~~ /< foo>/;   # OUTPUT: «「foo」␤»

but this is a call to the named rule foo:

say 'foo' ~~ /<foo>/;
# OUTPUT: «No such method 'foo' for invocant of type 'Match'␤ in block <unit> at <unknown file> line 1␤»

Be wary of the difference; if you intend to use a quoted list, ensure that whitespace follows the initial <.

Non-capturing, non-global matching in list context

Unlike Perl, non-capturing and non-global matching in list context doesn't produce any values:

if  'x' ~~ /./ { say 'yes' }  # OUTPUT: Ā«yesā¤Ā»
    for 'x' ~~ /./ { say 'yes' }  # NO OUTPUT

This is because its 'list' slot (inherited from Capture class) doesn't get populated with the original Match object:

say ('x' ~~ /./).list  # OUTPUT: Ā«()ā¤Ā»

To achieve the desired result, use global matching, capturing parentheses or a list with a trailing comma:

for 'x' ~~ m:g/./ { say 'yes' }  # OUTPUT: Ā«yesā¤Ā»
    for 'x' ~~ /(.)/  { say 'yes' }  # OUTPUT: Ā«yesā¤Ā»
    for ('x' ~~ /./,) { say 'yes' }  # OUTPUT: Ā«yesā¤Ā»

Common precedence mistakes

Adverbs and precedence

Adverbs do have a precedence that may not follow the order of operators that is displayed on your screen. If two operators of equal precedence are followed by an adverb it will pick the first operator it finds in the abstract syntax tree. Use parentheses to help Raku understand what you mean or use operators with looser precedence.

my %x = a => 42;
say !%x<b>:exists;            # dies with X::AdHoc
say %x<b>:!exists;            # this works
say !(%x<b>:exists);          # works too
say not %x<b>:exists;         # works as well
say True unless %x<b>:exists; # avoid negation altogether

Ranges and precedence

The loose precedence of .. can lead to some errors. It is usually best to parenthesize ranges when you want to operate on the entire range.

1..3.say;    # OUTPUT: Ā«3ā¤Ā» (and warns about useless use of "..")
(1..3).say;  # OUTPUT: Ā«1..3ā¤Ā»

Loose Boolean operators

The precedence of and, or, etc. is looser than routine calls. This can have surprising results for calls to routines that would be operators or statements in other languages like return, last and many others.

sub f {
    return True and False;
    # this is actually
    # (return True) and False;
}
say f; # OUTPUT: «True␤»
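
Parenthesizing makes the intended grouping explicit; a quick sketch (using an illustrative sub g):

sub g {
    return (True and False);
}
say g; # OUTPUT: «False␤»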

Exponentiation operator and prefix minus

say -1²;   # OUTPUT: «-1␤»
say -1**2; # OUTPUT: «-1␤»

When performing a regular mathematical calculation, the power takes precedence over the minus; so -1² can be written as -(1²). Raku matches these rules of mathematics, and the precedence of the ** operator is tighter than that of the prefix -. If you wish to raise a negative number to a power, use parentheses:

say (-1)²;   # OUTPUT: «1␤»
say (-1)**2; # OUTPUT: «1␤»

Method operator calls and prefix minus

Prefix minus binds more loosely than dotty method-op calls, so the prefix minus will be applied to the return value from the method. To ensure the minus gets passed as part of the argument, enclose the expression in parentheses.

say  -1.abs;  # OUTPUT: «-1␤»
say (-1).abs; # OUTPUT: «1␤»

Subroutine and method calls

Subroutine and method calls can be made using one of two forms:

foo(...); # function call form, where ... represent the required arguments
foo ...;  # list op form, where ... represent the required arguments

The function call form can cause problems for the unwary when whitespace is added after the function or method name and before the opening parenthesis.

First we consider functions with zero or one parameter:

sub foo() { say 'no arg' }
sub bar($a) { say "one arg: $a" }

Then execute each with and without a space after the name (assume $a = 1 and $b = 2 have been declared):

foo();    # okay: no arg
foo ();   # FAIL: Too many positionals passed; expected 0 arguments but got 1
bar($a);  # okay: one arg: 1
bar ($a); # okay: one arg: 1

Now declare a function of two parameters:

sub foo($a, $b) { say "two args: $a, $b" }

Execute it with and without the space after the name:

foo($a, $b);  # okay: two args: 1, 2
foo ($a, $b); # FAIL: Too few positionals passed; expected 2 arguments but got 1

The lesson is: "be careful with spaces following sub and method names when using the function call format." As a general rule, good practice might be to avoid the space after a function name when using the function call format.

Note that there are clever ways to eliminate the error with the function call format and the space, but that is bordering on hackery and will not be mentioned here. For more information, consult Functions.

Finally, note that, currently, when declaring the functions whitespace may be used between a function or method name and the parentheses surrounding the parameter list without problems.

Named parameters

Many built-in subroutines and method calls accept named parameters and your own code may accept them as well, but be sure the arguments you pass when calling your routines are actually named parameters:

sub foo($a, :$b) { ... }
foo(1, 'b' => 2); # FAIL: Too many positionals passed; expected 1 argument but got 2

What happened? That second argument is not a named argument, but a Pair passed as a positional argument. If you want a named argument, it has to look like a name to Raku:

foo(1, b => 2); # okay
foo(1, :b(2));  # okay
foo(1, :b<it>); # okay

my $b = 2;
foo(1, :b($b)); # okay, but redundant
foo(1, :$b);    # okay

# Or even...
my %arg = 'b' => 2;
foo(1, |%arg);  # okay too

That last one may be confusing: the | prefix on a Hash is a special compiler construct indicating that you want to use the contents of the variable as arguments, which for hashes means treating the entries as named arguments.

If you really do want to pass them as pairs you should use a List or Capture instead:

my $list = ('b' => 2),; # this is a List containing a single Pair
foo(|$list, :$b);       # okay: we passed the pair 'b' => 2 to the first argument
foo(1, |$list);         # FAIL: Too many positionals passed; expected 1 argument but got 2
foo(1, |$list.Capture); # OK: .Capture call converts all Pair objects to named args in a Capture
my $cap = \('b' => 2); # a Capture with a single positional value
foo(|$cap, :$b); # okay: we passed the pair 'b' => 2 to the first argument
foo(1, |$cap);   # FAIL: Too many positionals passed; expected 1 argument but got 2

A Capture is usually the best option for this as it works exactly like the usual capturing of routine arguments during a regular call.

The nice thing about the distinction here is that it gives the developer the option of passing pairs as either named or positional arguments, which can be handy in various instances.

Argument count limit

While it is typically unnoticeable, there is a backend-dependent argument count limit. Any code that does flattening of arbitrarily sized arrays into arguments won't work if there are too many elements.

my @a = 1 xx 9999;
my @b;
@b.push: |@a;
say @b.elems # OUTPUT: Ā«9999ā¤Ā»
my @a = 1 xx 999999;
my @b;
@b.push: |@a; # OUTPUT: Ā«Too many arguments in flattening array.ā¤  in block <unit> at <tmp> line 1ā¤ā¤Ā»

Avoid this trap by rewriting the code so that there is no flattening. In the example above, you can replace push with append. This way, no flattening is required because the array can be passed as is.

my @a = 1 xx 999999;
my @b;
@b.append: @a;
say @b.elems # OUTPUT: Ā«999999ā¤Ā»

Phasers and implicit return

sub returns-ret () {
    CATCH {
        default {}
    }
    "ret";
}

sub doesn't-return-ret () {
    "ret";
    CATCH {
        default {}
    }
}

say returns-ret;        # OUTPUT: Ā«retā¤Ā»
say doesn't-return-ret;
# BAD: outputs Ā«NilĀ» and a warning Ā«Useless use of constant string "ret" in sink context (line 13)Ā»

The code for returns-ret and doesn't-return-ret might look almost identical, since in principle it should not matter where the CATCH block goes. However, a block is an object and the last object produced in a sub is what gets returned, so doesn't-return-ret returns Nil; moreover, since "ret" is now in sink context, it issues a warning. If you want to place phasers last for conventional reasons, use the explicit form of return.

sub explicitly-return-ret () {
    return "ret";
    CATCH {
        default {}
    }
}

LEAVE needs explicit return from a sub to run

As the documentation for the LEAVE phaser indicates, LEAVE runs when a block is exited, "... except when the program exits abruptly". That is, unlike END, it's going to be invoked only if the block actually returns in an orderly way. This is why:

sub a() { LEAVE say "left"; exit 1 }; # No output, it will simply exit

will not run the LEAVE code, since technically it's not returning. On the other hand, END is a program execution phaser and will run no matter what:

sub a() {
    END say "Begone"; exit 1
}; a; # OUTPUT: Ā«Begoneā¤Ā»

Input and output

Closing open filehandles and pipes

Unlike some other languages, Raku does not use reference counting, so filehandles are NOT closed when they go out of scope. You have to close them explicitly, either by using the close routine or by using the :close argument that several of IO::Handle's methods accept. See IO::Handle.close for details.

The same rules apply to IO::Handle's subclass IO::Pipe, which is what you operate on when reading from a Proc you get with routines run and shell.

The caveat applies to IO::CatHandle type as well, though not as severely. See IO::CatHandle.close for details.
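
A minimal sketch of both options (the filename is only an example):

my $fh = open 'data.txt';
say $fh.get;  # read one line
$fh.close;    # close the handle explicitly

# or let the method close the handle when it is done:
say open('data.txt').lines(:close).elems;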

IO::Path stringification

Partly for historical reasons and partly by design, an IO::Path object stringifies without considering its CWD attribute, which means if you chdir and then stringify an IO::Path, or stringify an IO::Path with custom $!CWD attribute, the resultant string won't reference the original filesystem object:

with 'foo'.IO {
    .Str.say;       # OUTPUT: Ā«fooā¤Ā»
    .relative.say;  # OUTPUT: Ā«fooā¤Ā»

    chdir "/tmp";
    .Str.say;       # OUTPUT: Ā«fooā¤Ā»
    .relative.say   # OUTPUT: Ā«../home/camelia/fooā¤Ā»
}

# Deletes ./foo, not /bar/foo
unlink IO::Path.new("foo", :CWD</bar>).Str

The easy way to avoid this issue is to not stringify an IO::Path object at all. Core routines that work with paths can take an IO::Path object, so you don't need to stringify the paths.

If you do have a case where you need a stringified version of an IO::Path, use absolute or relative methods to stringify it into an absolute or relative path, respectively.

If you are facing this issue because you use chdir in your code, consider rewriting it in a way that does not involve changing the current directory. For example, you can pass cwd named argument to run without having to use chdir around it.

Splitting the input data into lines

There is a difference between using .lines on IO::Handle and on a Str. The trap arises if you start assuming that both split data the same way.

say $_.raku for $*IN.lines # .lines called on IO::Handle
# OUTPUT:
# "foox"
# "fooy\rbar"
# "fooz"

As you can see in the example above, there was a line which contained \r (the "carriage return" control character). However, the input is split strictly by \n, so the \r was kept as part of the string.

On the other hand, Str.lines attempts to be "smart" about processing data from different operating systems. Therefore, it will split by all possible variations of a newline.

say $_.raku for $*IN.slurp(:bin).decode.lines # .lines called on a Str
# OUTPUT:
# "foox"
# "fooy"
# "bar"
# "fooz"

The rule is quite simple: use IO::Handle.lines when working with programmatically generated output, and Str.lines when working with user-written texts.

Use $data.split("\n") in cases where you need the behavior of IO::Handle.lines but the original IO::Handle is not available.

Note that if you really want to slurp the data first, then you will have to use .IO.slurp(:bin).decode.split("\n"). Notice how we use :bin to prevent the decoding, only to call .decode later anyway. All of that is needed because .slurp assumes that you are working with text and therefore attempts to be smart about newlines.

If you are using Proc::Async, then there is currently no easy way to make it split data the right way. You can try reading the whole output and then using Str.split (not viable if you are dealing with large data) or writing your own logic to split the incoming data the way you need. Same applies if your data is null-separated.

Proc::Async and print

When using Proc::Async you should not assume that .print (or any other similar method) is synchronous. The biggest issue with this trap is that you will likely not notice the problem by running the code once, so it may cause a hard-to-detect intermittent failure.

Here is an example that demonstrates the issue:

loop {
    my $proc = Proc::Async.new: :w, ‘head’, ‘-n’, ‘1’;
    my $got-something;
    react {
        whenever $proc.stdout.lines { $got-something = True }
        whenever $proc.start        { die ‘FAIL!’ unless $got-something }

        $proc.print: “one\ntwo\nthree\nfour”;
        $proc.close-stdin;
    }
    say $++;
}

And the output it may produce:

0
1
2
3
An operation first awaited:
  in block <unit> at print.raku line 4

Died with the exception:
    FAIL!
      in block  at print.raku line 6

Resolving this is easy because .print returns a promise that you can await on. The solution is even more beautiful if you are working in a react block:

whenever $proc.print: “one\ntwo\nthree\nfour” {
    $proc.close-stdin;
}

Using .stdout without .lines

The .stdout method of Proc::Async returns a supply that emits chunks of data, not lines. The trap is that sometimes people assume it gives lines right away.

my $proc = Proc::Async.new(‘cat’, ‘/usr/share/dict/words’);
react {
    whenever $proc.stdout.head(1) { .say } # ← WRONG (most likely)
    whenever $proc.start { }
}

The output is clearly not just 1 line:

A
A's
AMD
AMD's
AOL
AOL's
Aachen
Aachen's
Aaliyah
Aaliyah's
Aaron
Aaron's
Abbas
Abbas's
Abbasid
Abbasid's
Abbott
Abbott's
Abby

If you want to work with lines, then use $proc.stdout.lines. If you're after the whole output, then something like this should do the trick: whenever $proc.stdout { $out ~= $_ }.

Exception handling

Sunk Proc

Some methods return a Proc object. If it represents a failed process, Proc itself won't be exception-like, but sinking it will cause an X::Proc::Unsuccessful exception to be thrown. That means this construct will throw, despite the try in place:

try run("raku", "-e", "exit 42");
say "still alive";
# OUTPUT: Ā«The spawned process exited unsuccessfully (exit code: 42)ā¤Ā»

This is because try receives a Proc and returns it, at which point it sinks and throws. Explicitly sinking it inside the try avoids the issue and ensures the exception is thrown inside the try:

try sink run("raku", "-e", "exit 42");
say "still alive";
# OUTPUT: Ā«still aliveā¤Ā»

If you're not interested in catching any exceptions, then use an anonymous variable to keep the returned Proc in; this way it'll never sink:

$ = run("raku", "-e", "exit 42");
say "still alive";
# OUTPUT: Ā«still aliveā¤Ā»

Using shortcuts

The ^ twigil

Using the ^ twigil can save a fair amount of time and space when writing out small blocks of code. As an example:

for 1..8 -> $a, $b { say $a + $b; }

can be shortened to just

for 1..8 { say $^a + $^b; }

The trouble arises when a person wants to use more complex names for the variables, instead of just one letter. The ^ twigil is able to have the positional variables be out of order and named whatever you want, but assigns values based on the variable's Unicode ordering. In the above example, we can have $^a and $^b switch places, and those variables will keep their positional values. This is because the Unicode character 'a' comes before the character 'b'. For example:

# In order
sub f1 { say "$^first $^second"; }
f1 "Hello", "there";    # OUTPUT: Ā«Hello thereā¤Ā»
# Out of order
sub f2 { say "$^second $^first"; }
f2 "Hello", "there";    # OUTPUT: Ā«there Helloā¤Ā»

Because these variables are allowed to be called anything, this can cause some problems if you are not accustomed to how Raku handles them.

# BAD NAMING: alphabetically `four` comes first and gets value `1` in it:
for 1..4 { say "$^one $^two $^three $^four"; }    # OUTPUT: Ā«2 4 3 1ā¤Ā»

# GOOD NAMING: variables' naming makes it clear how they sort alphabetically:
for 1..4 { say "$^a $^b $^c $^d"; }               # OUTPUT: Ā«1 2 3 4ā¤Ā»

Using » and map interchangeably

While » may look like a shorter way to write map, they differ in some key aspects.

First, the » includes a hint to the compiler that it may autothread the execution, thus if you're using it to call a routine that produces side effects, those side effects may be produced out of order (the result of the operator is kept in order, however). Also if the routine being invoked accesses a resource, there's the possibility of a race condition, as multiple invocations may happen simultaneously, from different threads.

<a b c d>».say # OUTPUT: «d␤b␤c␤a␤»

Second, » checks the nodality of the routine being invoked and based on that will use either deepmap or nodemap to map over the list, which can be different from how a map call would map over it:

say ((1, 2, 3), [^4], '5')».Numeric;       # OUTPUT: «((1 2 3) [0 1 2 3] 5)␤»
say ((1, 2, 3), [^4], '5').map: *.Numeric; # OUTPUT: «(3 4 5)␤»

The bottom line is that map and » are not interchangeable, but using one instead of the other is OK as long as you understand the differences.

Word splitting in « »

Keep in mind that « » performs word splitting similarly to how shells do it, so many shell pitfalls apply here as well (especially when using it in combination with run):

my $file = ‘--my arbitrary filename’;
run ‘touch’, ‘--’, $file;  # RIGHT
run <touch -->, $file;     # RIGHT
run «touch -- "$file"»;    # RIGHT but WRONG if you forget quotes
run «touch -- $file»;      # WRONG; touches ‘--my’, ‘arbitrary’ and ‘filename’
run ‘touch’, $file;        # WRONG; error from `touch`
run «touch "$file"»;       # WRONG; error from `touch`

Note that -- is required for many programs to disambiguate between command-line arguments and filenames that begin with hyphens.

Scope

Using a once block

The once block is a block of code that will only run once when its parent block is run. As an example:

my $var = 0;
for 1..10 {
    once { $var++; }
}
say "Variable = $var";    # OUTPUT: Ā«Variable = 1ā¤Ā»

This functionality also applies to other code blocks like sub and while, not just for loops. Problems arise though, when trying to nest once blocks inside of other code blocks:

my $var = 0;
for 1..10 {
    do { once { $var++; } }
}
say "Variable = $var";    # OUTPUT: Ā«Variable = 10ā¤Ā»

In the above example, the once block was nested inside of a code block which was inside of a for loop code block. This causes the once block to run multiple times, because the once block uses state variables to determine whether it has run previously. This means that if the parent code block goes out of scope, then the state variable the once block uses to keep track of if it has run previously, goes out of scope as well. This is why once blocks and state variables can cause some unwanted behavior when buried within more than one code block.

If you want to have something that will emulate the functionality of a once block, but still work when buried a few code blocks deep, we can manually build the functionality of a once block. Using the above example, we can change it so that it will only run once, even when inside the do block by changing the scope of the state variable.

my $var = 0;
for 1..10 {
    state $run-code = True;
    do { if ($run-code) { $run-code = False; $var++; } }
}
say "Variable = $var";    # OUTPUT: Ā«Variable = 1ā¤Ā»

In this example, we essentially manually build a once block by making a state variable called $run-code at the highest level that will be run more than once, then checking to see if $run-code is True using a regular if. If the variable $run-code is True, then make the variable False and continue with the code that should only be completed once.

The main difference between using a state variable, as in the example above, and using a regular once block is the scope the state variable lives in. The state variable created by a once block has the same scope as where you put the block (imagine that the word 'once' is replaced with a state variable and an if that checks it). The example above using state variables works because the variable is at the highest scope that will be repeated, whereas the example that has a once block inside a do creates the variable within the do block, which is not the highest scope that is repeated.

Using a once block inside a class method will cause the once state to carry across all instances of that class. For example:

class A {
    method sayit() { once say 'hi' }
}
my $a = A.new;
$a.sayit;      # OUTPUT: «hi␤»
my $b = A.new;
$b.sayit;      # nothing
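
If you want per-instance behavior instead, one sketch is to track the state in a private attribute (class B here is purely illustrative):

class B {
    has Bool $!greeted = False;
    method sayit() {
        unless $!greeted {
            say 'hi';
            $!greeted = True;
        }
    }
}
my $c = B.new;
$c.sayit;      # OUTPUT: «hi␤»
$c.sayit;      # nothing
B.new.sayit;   # OUTPUT: «hi␤» (each instance has its own flag)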

LEAVE phaser and exit

Using the LEAVE phaser to perform graceful resource termination is a common pattern, but it does not cover the case when the program is stopped with exit.

The following nondeterministic example should demonstrate the complications of this trap:

my $x = say ‘Opened some resource’;
LEAVE say ‘Closing the resource gracefully’ with $x;

exit 42 if rand < ⅓; # ① 「exit」 is bad
die ‘Dying because of unhandled exception’ if rand < ½; # ② 「die」 is ok
# fallthru ③

There are three possible results:

①
Opened some resource

②
Opened some resource
Closing the resource gracefully
Dying because of unhandled exception
  in block <unit> at print.raku line 5

③
Opened some resource
Closing the resource gracefully

A call to exit is part of normal operation for many programs, so beware unintentional combination of LEAVE phasers and exit calls.

LEAVE phaser may run sooner than you think

Parameter binding is executed when we're already "inside" the routine's block, which means the LEAVE phaser will run when we leave that block, even if we leave it because parameter binding failed due to wrong arguments being given:

sub foo(Int) {
    my $x = 42;
    LEAVE say $x.Int; # ← WRONG; assumes that $x is set
}
say foo rand; # OUTPUT: Ā«No such method 'Int' for invocant of type 'Any'ā¤Ā»

A simple way to avoid this issue is to declare your sub or method a multi, so the candidate is eliminated during dispatch and the code never gets to binding anything inside the sub, thus never entering the routine's body:

multi foo(Int) {
    my $x = 42;
    LEAVE say $x.Int;
}
say foo rand; # OUTPUT: Ā«Cannot resolve caller foo(Num); none of these signatures match: (Int)ā¤Ā»

Another alternative is placing the LEAVE into another block (assuming it's appropriate for it to be executed when that block is left, rather than the routine's body):

sub foo(Int) {
    my $x = 42;
    { LEAVE say $x.Int; }
}
say foo rand; # OUTPUT: Ā«Type check failed in binding to parameter '<anon>'; expected Int but got Num (0.7289418947969465e0)ā¤Ā»

You can also ensure LEAVE can be executed even if the routine is left due to failed argument binding. In our example, we check $x is defined before doing anything with it.

sub foo(Int) {
    my $x = 42;
    LEAVE $x andthen .Int.say;
}
say foo rand; # OUTPUT: Ā«Type check failed in binding to parameter '<anon>'; expected Int but got Num (0.8517160389079508e0)ā¤Ā»

Grammars

Using regexes within grammar's actions

# Define a grammar
grammar will-fail {
    token TOP {^ <word> $}
    token word { \w+ }
}

# Define an action class
class will-fail-actions {
    method TOP ($/) {      # <- note the $/ in the signature, which is readonly
        my $foo = ~$/;
        say $foo ~~ /foo/; # <- the regex tries to assign the result to $/ and will fail
    }
}

# Try to parse something...
will-fail.parse('word', :actions(will-fail-actions));
CATCH { default { put .^name, ': ', .Str } };
# OUTPUT: Ā«X::AdHoc: Cannot assign to a readonly variable or a valueā¤Ā»

This will fail with Cannot assign to a readonly variable ($/) or a value in method TOP. The problem here is that regular expressions also affect $/. Since it is in TOP's signature, it is a read-only variable, which is what produces the error. You can safely either use another variable name in the signature or add is copy, this way:

method TOP ($/ is copy) { my $foo = ~$/; my $v = $foo ~~ /foo/;  }

Using certain names for rules/token/regexes

Grammars are actually a type of class.

grammar G {};
    say G.^mro; # OUTPUT: Ā«((G) (Grammar) (Match) (Capture) (Cool) (Any) (Mu))ā¤Ā»

^mro prints the class hierarchy of this empty grammar, showing all the superclasses. And these superclasses have their very own methods. Defining a method in that grammar might clash with the ones inhabiting the class hierarchy:

grammar g {
    token TOP { <item> };
    token item { 'defined' }
};
say g.parse('defined');
# OUTPUT: Ā«Too many positionals passed; expected 1 argument but got 2ā¤  in regex item at /tmp/grammar-clash.raku line 3ā¤  in regex TOP at /tmp/grammar-clash.raku line 2ā¤  in block <unit> at /tmp/grammar-clash.raku line 5Ā»

item seems innocuous enough, but it is a sub defined in class Mu. The message is a bit cryptic and totally unrelated to that fact, but that is why this is listed as a trap. In general, all subs defined in any part of the hierarchy are going to cause problems; some methods will too. For instance, CREATE, take and defined (which are defined in Mu). In general, multi methods and simple methods will not have any problem, but it might not be a good practice to use them as rule names.

Also avoid the names of object-construction submethods, such as TWEAK, BUILD and BUILD-ALL, for rule/token/regex names: these will throw another kind of exception if you do that: Cannot find method 'match': no method cache and no .^find_method, once again only loosely related to what is actually going on.

Unfortunate generalization

:exists with more than one key

Let's say you have a hash and you want to use :exists on more than one element:

my %h = a => 1, b => 2;
say ‘a exists’ if %h<a>:exists;   # ← OK; True
say ‘y exists’ if %h<y>:exists;   # ← OK; False
say ‘Huh‽’     if %h<x y>:exists; # ← WRONG; returns a 2-item list

Did you mean “if any of them exists”, or did you mean that all of them should exist? Use an any or all Junction to clarify:

my %h = a => 1, b => 2;
say ‘x or y’     if any %h<x y>:exists;   # ← RIGHT (any); False
say ‘a, x or y’  if any %h<a x y>:exists; # ← RIGHT (any); True
say ‘a, x and y’ if all %h<a x y>:exists; # ← RIGHT (all); False
say ‘a and b’    if all %h<a b>:exists;   # ← RIGHT (all); True

The reason why it is always True (without using a junction) is that it returns a list with Bool values for each requested lookup. Non-empty lists always give True when you Boolify them, so the check always succeeds no matter what keys you give it.
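
To see what is being boolified, print the slice directly; this quick sketch shows the list of Bools and its truthiness:

my %h = a => 1, b => 2;
say (%h<x y>:exists);      # OUTPUT: «(False False)␤»
say (%h<x y>:exists).Bool; # OUTPUT: «True␤»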

Using [ā€¦] metaoperator with a list of lists

Every now and then, someone gets the idea that they can use [Z] to create the transpose of a list-of-lists:

my @matrix = <X Y>, <a b>, <1 2>;
my @transpose = [Z] @matrix; # ← WRONG; but so far so good ↙
say @transpose;              # OUTPUT: «[(X a 1) (Y b 2)]␤»

And everything works fine, until you get an input @matrix with exactly one row (child list):

my @matrix = <X Y>,;
my @transpose = [Z] @matrix; # ← WRONG; ↙
say @transpose;              # OUTPUT: «[(X Y)]␤» – not the expected transpose [(X) (Y)]

This happens partly because of the single argument rule: with only one row, the reduction ends up returning the lone row unchanged rather than zipping it. There are other cases where this kind of generalization may not work.
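
One possible workaround, sketched here, is to special-case the single-row input so that the reduction never sees a lone list:

my @matrix = <X Y>,;
my @transpose = @matrix.elems == 1
    ?? @matrix[0].map({ ($_,) })  # a single row: wrap each cell in its own list
    !! ([Z] @matrix);
say @transpose; # OUTPUT: «[(X) (Y)]␤»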

Using [~] for concatenating a list of blobs

The ~ infix operator can be used to concatenate Strs or Blobs. However, reducing an empty list with it will always produce an empty Str. This is because, given a list with no elements, the reduction metaoperator returns the identity element for the operator, and the identity element for ~ is the empty string, regardless of the type of elements the list was declared to hold.

my Blob @chunks;
say ([~] @chunks).raku; # OUTPUT: «""␤»

This might cause a problem if you attempt to use the result while assuming that it is a Blob:

my Blob @chunks;
say ([~] @chunks).decode;
# OUTPUT: «No such method 'decode' for invocant of type 'Str'. Did you mean 'encode'?␤…»

There are many ways to cover that case. You can avoid the [ ] metaoperator altogether:

my @chunks;
# ā€¦
say Blob.new: |«@chunks; # OUTPUT: «Blob:0x<>␤»

Alternatively, you can initialize the array with an empty Blob:

my @chunks = Blob.new;
# ā€¦
say [~] @chunks; # OUTPUT: «Blob:0x<>␤»

Or you can use the || operator to fall back to an empty Blob when the list is empty:

my @chunks;
# ā€¦
say [~] @chunks || Blob.new; # OUTPUT: «Blob:0x<>␤»

Please note that a similar issue may arise when reducing lists with other operators.

Maps

Beware of nesting Maps in sink context

Maps apply an expression to every element of a List and return a Seq:

say <þor oðin loki>.map: *.codes; # OUTPUT: «(3 4 4)␤»

Maps are often used as a compact substitute for a loop, performing some kind of action in the map code block:

<þor oðin loki>.map: *.codes.say; # OUTPUT: «3␤4␤4␤»

The problem might arise when maps are nested and in a sink context.

<foo bar ber>.map: { $^a.comb.map: { $^b.say }}; # OUTPUT: «»

You might expect the innermost map to bubble the result up to the outermost map, but it simply does nothing. Maps return Seqs, and in sink context the innermost map will iterate and discard the produced values, which is why it yields nothing.

Simply using say at the beginning of the statement will save the result from sink context:

say <foo bar ber>.map: *.comb.map: *.say;
    # OUTPUT: «f␤o␤o␤b␤a␤r␤b␤e␤r␤((True True True) (True True True) (True True True))␤»

However, it does not work as intended; the first f␤o␤o␤b␤a␤r␤b␤e␤r␤ is the output of the innermost say, but say then returns a Bool, True in this case. Those Trues are what the outermost say prints, one for every letter. A much better option would be to flatten the outermost sequence:

<foo bar ber>.map({ $^a.comb.map: { $^b.say }}).flat;
    # OUTPUT: «f␤o␤o␤b␤a␤r␤b␤e␤r␤»

Of course, reserving say for the final result will also produce the intended output, as it saves the two nested sequences from sink context:

say <foo bar ber>.map: { $^þ.comb }; # OUTPUT: «((f o o) (b a r) (b e r))␤»

Smartmatching

The smartmatch operator ~~ shortcuts to calling the ACCEPTS method on its right-hand side with the left-hand side as the argument; in other words, it is the right-hand side that decides whether it accepts the left-hand side. This may cause some confusion.
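
As a minimal sketch of what that means, the two lines below do roughly the same thing:

say 42 ~~ Int;       # OUTPUT: «True␤»
say Int.ACCEPTS(42); # OUTPUT: «True␤»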

Smartmatch and WhateverCode

Using WhateverCode in the left hand side of a smartmatch does not work as expected, or at all:

my @a = <1 2 3>;
say @a.grep( *.Int ~~ 2 );
# OUTPUT: «Cannot use Bool as Matcher with '.grep'.  Did you mean to
# use $_ inside a block?␤␤␤»

The error message does not make a lot of sense at first. It does, however, if you put it in terms of the ACCEPTS method: that code is equivalent to 2.ACCEPTS( *.Int ), but *.Int, being a Block, cannot be coerced to Numeric, so the smartmatch simply evaluates to a Bool, which is what .grep then refuses to use as a matcher.

Solution: don't use WhateverCode in the left hand side of a smartmatch:

my @a = <1 2 3>;
say @a.grep( 2 ~~ *.Int ); # OUTPUT: «(2)␤»
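
As the error message itself suggests, another way out, sketched here, is to give .grep an explicit block that uses $_; the block is a Callable matcher, so the smartmatch is evaluated once per element:

my @a = <1 2 3>;
say @a.grep({ $_.Int ~~ 2 }); # OUTPUT: «(2)␤»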

Libraries

Filesystem repositories

Adding large directories to the library search path can negatively impact module load time, even when the module being loaded doesn't reside in any of the added paths. This is because the compiler needs to traverse and checksum all relevant files in the tree before loading the requested module, unless the directory contains a META6.json file.

There are three common ways to add directories to the search path:

  • Providing a switch to Raku on the command line: $ raku -Idir

  • Setting an environment variable: $ env RAKULIB=dir raku

  • Using the lib pragma (i.e. use lib 'dir')

Performance penalties are evident when any of these methods is used to add an especially large or deeply nested directory. For example, adding /usr/lib to the search path on a typical Unix machine has a perceptible impact on module load time:

    $ time raku -e 'use Test'
    real    0m0.511s
    user    0m0.554s
    sys     0m0.118s

    # no penalty, since we don't load anything:
    $ time raku -I/usr/lib -e ''
    real    0m0.247s
    user    0m0.254s
    sys     0m0.066s

    $ time raku -I/usr/lib -e 'use Test'
    real    0m6.344s
    user    0m6.232s
    sys     0m0.356s

    $ time raku -e 'use lib "/usr/lib"; use Test'
    real    0m6.555s
    user    0m6.445s
    sys     0m0.326s

    $ time env RAKULIB=/usr/lib raku -e 'use Test'
    real    0m6.479s
    user    0m6.368s
    sys     0m0.383s

Note that the increased runtime occurs even when requesting modules that were already installed elsewhere (Test is provided as part of Rakudo but /usr/lib was nevertheless scanned before loading it).

The best way to avoid this trap is to properly install the modules that reside in large trees and leave those directories out of the search path altogether. Alternatively, adding a META6.json file to the directory will prevent the time-consuming traversal.
