Hashes and maps

Working with associative arrays/dictionaries/hashes

The associative role and associative classes

The Associative role underlies hashes and maps, as well as other classes such as MixHash. It defines the two types that will be used in associative classes; by default, you can use anything (literally, since any class that subclasses Any can be used) as a key, although it will be coerced to a string, and any object as value. You can access these types using the of and keyof methods.

By default, any object declared with the % sigil will get the Associative role, and will by default behave like a hash, but this role will only provide the two methods above, as well as the default Hash behavior.

say (%).^name ; # OUTPUT: «Hash␤»

Inversely, you cannot use the % sigil if the Associative role is not mixed in, but since this role does not have any associated properties, you will have to redefine the behavior of the hash subscript operator. In order to do that, there are several functions you will have to override:

class Logger does Associative[Cool,DateTime] {
    has %.store;

    method log( Cool $event ) {
        %.store{ DateTime.new( now ) } = $event;
    }

    multi method AT-KEY ( ::?CLASS:D: $key) {
        my @keys = %.store.keys.grep( /$key/ );
        %.store{ @keys };
    }

    multi method EXISTS-KEY (::?CLASS:D: $key) {
        %.store.keys.grep( /$key/ )??True!!False;
    }

    multi method DELETE-KEY (::?CLASS:D: $key) {
        X::Assignment::RO.new.throw;
    }

    multi method ASSIGN-KEY (::?CLASS:D: $key, $new) {
        X::Assignment::RO.new.throw;
    }

    multi method BIND-KEY (::?CLASS:D: $key, \new){
        X::Assignment::RO.new.throw;
    }
}
say Logger.of;                   # OUTPUT: «(Cool)␤»
my %logger := Logger.new;
say %logger.of;                  # OUTPUT: «(Cool)␤»

%logger.log( "Stuff" );
%logger.log( "More stuff");

say %logger<2018-05-26>;         # OUTPUT: «(More stuff Stuff)␤»
say %logger<2018-04-22>:exists;  # OUTPUT: «False␤»

In this case, we are defining a logger with Associative semantics that will be able to use dates (or a part of them) as keys. Since we are parameterizing Associative to those particular classes, of will return the value type we have used, Cool in this case (we can log lists or strings only). Mixing the Associative role gives it the right to use the % sigil; binding is needed in the definition since %-sigiled variables get by default the Hash type.

This log is going to be append-only, which is why we escape the associative array metaphor to use a log method to add new events to the log. Once they have been added, however, we can retrieve them by date or check if they exist. For the first we have to override the AT-KEY multi method, for the latter EXIST-KEY. In the last two statements, we show how the subscript operation invokes AT-KEY, while the :exists adverb invokes EXISTS-KEY.

We override DELETE-KEY, ASSIGN-KEY and BIND-KEY, but only to throw an exception. Attempting to assign, delete, or bind a value to a key will result in a Cannot modify an immutable Str (value) exception thrown.

Making classes associative provides a very convenient way of using and working with them using hashes; an example can be seen in Cro, which uses it extensively for the convenience of using hashes to define structured requests and express its response.

Mutable hashes and immutable maps

A Hash is a mutable mapping from keys to values (called dictionary, hash table or map in other programming languages). The values are all scalar containers, which means you can assign to them. Maps are, on the other hand, immutable. Once a key has been paired with a value, this pairing cannot be changed.

Maps and hashes are usually stored in variables with the percent % sigil, which is used to indicate they are Associative.

Hash and map elements are accessed by key via the { } postcircumfix operator:

say %*ENV{'HOME', 'PATH'}.raku;
    # OUTPUT: «("/home/camelia", "/usr/bin:/sbin:/bin")␤»

The general Subscript rules apply providing shortcuts for lists of literal strings, with and without interpolation.

my %h = oranges => 'round', bananas => 'bendy';
    say %h<oranges bananas>;
    # OUTPUT: «(round bendy)␤»
my $fruit = 'bananas';
    say %h«oranges "$fruit"»;
    # OUTPUT: «(round bendy)␤»

You can add new pairs simply by assigning to an unused key:

my %h;
    %h{'new key'} = 'new value';

Hash assignment

Assigning a list of elements to a hash variable first empties the variable, and then iterates the elements of the right-hand side (which must contain an even number of elements). Note that hash keys are always coerced to be strings even if they are unquoted, but keys with spaces in their names must be quoted. For example, using the common Pair syntax:

my %h = 1 => 'a', b => 2, '1 2' => 3;
    say %h.keys.sort.raku;       # OUTPUT: «("1", "1 2", "b").Seq␤»
    say %h.values.sort.raku      # OUTPUT: «(2, 3, "a").Seq␤»

The stringification of the keys can lead to surprising results. For example, the Raku module CSV::Parser in one mode returns a CSV data line as a hash of {UInt => 'value'} pairs which can look like this fragment of a 20-field line:

my %csv = '2' => 'a', '10' => 'b';
    say %csv.keys.sort.raku;           # OUTPUT: «("10", "2").Seq␤»

The sort result is not what one would normally want. We can use the power of Raku to coerce the string values into integers for the sort. There are several ways that can be done but this method may present the least "line noise" to novices or non-Raku viewers:

say %csv.keys.map(+*).sort.raku;   # OUTPUT: «(2, 10).Seq␤»

In a hash constructor list the Pair syntax doesn't have to be used, or it may be intermixed with ordinary list values as long as the list has an even number of elements in total as well as between Pairs. If an element is a Pair type (e.g., 'a => 1'), its key is taken as a new hash key, and its value as the new hash value for that key. Otherwise the value is coerced to Str and used as a hash key, while the next element of the list is taken as the corresponding value.

my %h = 'a', 'b', c => 'd', 'e', 'f';

Same as

my %h = a => 'b', c => 'd', e => 'f';

or

my %h = <a b c d e f>;

or even

my %h = %( a => 'b', c => 'd', e => 'f' );

If you have an odd number of elements using most constructors you will see an error:

my %h = <a b c>; # OUTPUT: «hash initializer expected...␤»

There are two other valid ways of constructing a hash, but the user should be wary:

my %h = [ a => 'b', c => 'd', e => 'f' ]; # This format is NOT recommended.
                                              # It cannot be a constant and there
                                              # will be problems with nested hashes

or

my $h = { a => 'b', c => 'd', e => 'f'};

Please note that curly braces are used only in the case that we are not assigning it to a %-sigiled variable; in case we use it for a %-sigiled variable we will get an error of Potential difficulties:␤ Useless use of hash composer on right side of hash assignment; did you mean := instead?. As this error indicates, however, we can use curly braces as long as we use also binding:

my %h := { a => 'b', c => 'd', e => 'f'};
    say %h; # OUTPUT: «{a => b, c => d, e => f}␤»

Nested hashes can also be defined using the same syntax:

my %h =  e => f => 'g';
     say %h<e><f>; # OUTPUT: «g␤»

However, what you are defining here is a key pointing to a Pair, which is fine if that is what you want and your nested hash has a single key. But %h<e> will point to a Pair which will have these consequences:

my %h =  e => f => 'g';
%h<e><q> = 'k';
# OUTPUT: «Pair␤Cannot modify an immutable Str (Nil)␤  in block <unit>»

This, however, will effectively define a nested hash:

my %h =  e => { f => 'g' };
     say %h<e>.^name;  # OUTPUT: «Hash␤»
     say %h<e><f>;     # OUTPUT: «g␤»

If a Pair is encountered where a value is expected, it is used as a hash value:

my %h = 'a', 'b' => 'c';
    say %h<a>.^name;            # OUTPUT: «Pair␤»
    say %h<a>.key;              # OUTPUT: «b␤»

If the same key appears more than once, the value associated with its last occurrence is stored in the hash:

my %h = a => 1, a => 2;
    say %h<a>;                  # OUTPUT: «2␤»

To assign a hash to a variable which does not have the % sigil, you may use the %() hash constructor:

my $h = %( a => 1, b => 2 );
    say $h.^name;               # OUTPUT: «Hash␤»
    say $h<a>;                  # OUTPUT: «1␤»

If one or more values reference the topic variable, $_, the right-hand side of the assignment will be interpreted as a Block, not a Hash:

my @people = [
    %( id => "1A", firstName => "Andy", lastName => "Adams" ),
    %( id => "2B", firstName => "Beth", lastName => "Burke" ),
    # ...
];

sub lookup-user (Hash $h) { #`(Do something...) $h }

my @names = map {
    # While this creates a hash:
    my  $query = { name => "$person<firstName> $person<lastName>" };
    say $query.^name;      # OUTPUT: «Hash␤»

    # Doing this will create a Block. Oh no!
    my  $query2 = { name => "$_<firstName> $_<lastName>" };
    say $query2.^name;       # OUTPUT: «Block␤»
    say $query2<name>;       # fails

    CATCH { default { put .^name, ': ', .Str } };
    # OUTPUT: «X::AdHoc: Type Block does not support associative indexing.␤»
    lookup-user($query);
    # Type check failed in binding $h; expected Hash but got Block
}, @people;

This would have been avoided if you had used the %() hash constructor. Only use curly braces for creating Blocks.

Hash slices

You can assign to multiple keys at the same time with a slice.

my %h; %h<a b c> = 2 xx *; %h.raku.say;  # OUTPUT: «{:a(2), :b(2), :c(2)}␤»
    my %h; %h<a b c> = ^3;     %h.raku.say;  # OUTPUT: «{:a(0), :b(1), :c(2)}␤»

Non-string keys (object hash)

By default keys in { } are forced to strings. To compose a hash with non-string keys, use a colon prefix:

my $when = :{ (now) => "Instant", (DateTime.now) => "DateTime" };

Note that with objects as keys, you often cannot use the <...> construct for key lookup, as it creates only strings and allomorphs. Use the {...} instead:

:{  0  => 42 }<0>.say;   # Int    as key, IntStr in lookup; OUTPUT: «(Any)␤»
    :{  0  => 42 }{0}.say;   # Int    as key, Int    in lookup; OUTPUT: «42␤»
    :{ '0' => 42 }<0>.say;   # Str    as key, IntStr in lookup; OUTPUT: «(Any)␤»
    :{ '0' => 42 }{'0'}.say; # Str    as key, Str    in lookup; OUTPUT: «42␤»
    :{ <0> => 42 }<0>.say;   # IntStr as key, IntStr in lookup; OUTPUT: «42␤»

Note: Rakudo implementation currently erroneously applies the same rules for :{ } as it does for { } and can construct a Block in certain circumstances. To avoid that, you can instantiate a parameterized Hash directly. Parameterization of %-sigiled variables is also supported:

my Num %foo1      = "0" => 0e0; # Str keys and Num values
    my     %foo2{Int} =  0  => "x"; # Int keys and Any values
    my Num %foo3{Int} =  0  => 0e0; # Int keys and Num values
    Hash[Num,Int].new: 0, 0e0;      # Int keys and Num values

Now if you want to define a hash to preserve the objects you are using as keys as the exact objects you are providing to the hash to use as keys, then object hashes are what you are looking for.

my %intervals{Instant};
my $first-instant = now;
%intervals{ $first-instant } = "Our first milestone.";
sleep 1;
my $second-instant = now;
%intervals{ $second-instant } = "Logging this Instant for spurious raisins.";
for %intervals.sort -> (:$key, :$value) {
    state $last-instant //= $key;
    say "We noted '$value' at $key, with an interval of {$key - $last-instant}";
    $last-instant = $key;
}

This example uses an object hash that only accepts keys of type Instant to implement a rudimentary, yet type-safe, logging mechanism. We utilize a named state variable for keeping track of the previous Instant so that we can provide an interval.

The whole point of object hashes is to keep keys as objects-in-themselves. Currently object hashes utilize the WHICH method of an object, which returns a unique identifier for every mutable object. This is the keystone upon which the object identity operator (===) rests. Order and containers really matter here as the order of .keys is undefined and one anonymous list is never === to another.

my %intervals{Instant};
my $first-instant = now;
%intervals{ $first-instant } = "Our first milestone.";
sleep 1;
my $second-instant = now;
%intervals{ $second-instant } = "Logging this Instant for spurious raisins.";
say ($first-instant, $second-instant) ~~ %intervals.keys;       # OUTPUT: «False␤»
say ($first-instant, $second-instant) ~~ %intervals.keys.sort;  # OUTPUT: «True␤»
say ($first-instant, $second-instant) === %intervals.keys.sort; # OUTPUT: «False␤»
say $first-instant === %intervals.keys.sort[0];                 # OUTPUT: «True␤»

Since Instant defines its own comparison methods, in our example a sort according to cmp will always provide the earliest instant object as the first element in the List it returns.

If you would like to accept any object whatsoever in your hash, you can use Any!

my %h{Any};
    %h{(now)} = "This is an Instant";
    %h{(DateTime.now)} = "This is a DateTime, which is not an Instant";
    %h{"completely different"} = "Monty Python references are neither DateTimes nor Instants";

There is a more concise syntax which uses binding.

my %h := :{ (now) => "Instant", (DateTime.now) => "DateTime" };

The binding is necessary because an object hash is about very solid, specific objects, which is something that binding is great at keeping track of but about which assignment doesn't concern itself much.

Since 6.d was released, Junctions can also be used as hash keys. The result will also be a Junction of the same type used as key.

my %hash = %( a => 1, b => 2, c=> 3);
    say %hash{"a"|"c"};   # OUTPUT: «any(1, 3)␤»
    say %hash{"b"^"c"};   # OUTPUT: «one(2, 3)␤»
    say %hash{"a" & "c"}; # OUTPUT: «all(1, 3)␤»

If a Junction of any kind is used to define a key, it will have the same effect of defining elements of the Junction as separate keys:

my %hash = %( "a"|"b" => 1, c => 2 );
say %hash{"b"|"c"};       # OUTPUT: «any(1, 2)␤»

Constraint value types

Place a type object in-between the declarator and the name to constrain the type of all values of a Hash.

my Int %h;
    put %h<Goku>   = 900;
try {
        %h<Vegeta> = "string";
        CATCH { when X::TypeCheck::Binding { .message.put } }
    }
# OUTPUT:
    # 9001
    # Type check failed in assignment to %h; expected Int but got Str ("string")

You can do the same by a more readable syntax.

my %h of Int; # the same as my Int %h

If you want to constraint the type of all keys of a Hash, add {Type} following the name of variable.

my %h{Int};

Even put these two constraints together.

my %h{Int} of Int;
    put %h{21} = 42;
try {
        %h{0} = "String";
        CATCH { when X::TypeCheck::Binding { .message.put } }
    }
try {
        %h<string> = 42;
        CATCH { when X::TypeCheck::Binding { .message.put } }
    }
try {
        %h<string> = "String";
        CATCH { when X::TypeCheck::Binding { .message.put } }
    }
# OUTPUT:
    # 42
    # Type check failed in binding to parameter 'assignval'; expected Int but got Str ("String")
    # Type check failed in binding to parameter 'key'; expected Int but got Str ("string")
    # Type check failed in binding to parameter 'key'; expected Int but got Str ("string")

Looping over hash keys and values

A common idiom for processing the elements in a hash is to loop over the keys and values, for instance,

my %vowels = 'a' => 1, 'e' => 2, 'i' => 3, 'o' => 4, 'u' => 5;
    for %vowels.kv -> $vowel, $index {
      "$vowel: $index".say;
    }

gives output similar to this:

a: 1
e: 2
o: 4
u: 5
i: 3

where we have used the kv method to extract the keys and their respective values from the hash, so that we can pass these values into the loop.

Note that the order of the keys and values printed cannot be relied upon; the elements of a hash are not always stored the same way in memory for different runs of the same program. In fact, since version 2018.05, the order is guaranteed to be different in every invocation. Sometimes one wishes to process the elements sorted on, e.g., the keys of the hash. If one wishes to print the list of vowels in alphabetical order then one would write

my %vowels = 'a' => 1, 'e' => 2, 'i' => 3, 'o' => 4, 'u' => 5;
    for %vowels.sort(*.key)>>.kv -> ($vowel, $index) {
      "$vowel: $index".say;
    }

which prints

a: 1
e: 2
i: 3
o: 4
u: 5

in alphabetical order as desired. To achieve this result, we sorted the hash of vowels by key (%vowels.sort(*.key)) which we then ask for its keys and values by applying the .kv method to each element via the unary >> hyperoperator resulting in a List of key/value lists. To extract the key/value the variables thus need to be wrapped in parentheses.

An alternative solution is to flatten the resulting list. Then the key/value pairs can be accessed in the same way as with plain .kv:

my %vowels = 'a' => 1, 'e' => 2, 'i' => 3, 'o' => 4, 'u' => 5;
    for %vowels.sort(*.key)>>.kv.flat -> $vowel, $index {
      "$vowel: $index".say;
    }

You can also loop over a Hash using destructuring.

In place editing of values

There may be times when you would like to modify the values of a hash while iterating over them.

my %answers = illuminatus => 23, hitchhikers => 42;
    # OUTPUT: «hitchhikers => 42, illuminatus => 23»
    for %answers.values -> $v { $v += 10 }; # Fails
    CATCH { default { put .^name, ': ', .Str } };
    # OUTPUT: «X::AdHoc: Cannot assign to a readonly variable or a value␤»

This is traditionally accomplished by sending both the key and the value as follows.

my %answers = illuminatus => 23, hitchhikers => 42;
    for %answers.kv -> $k,$v { %answers{$k} = $v + 10 };

However, it is possible to leverage the signature of the block in order to specify that you would like read-write access to the values.

my %answers = illuminatus => 23, hitchhikers => 42;
    for %answers.values -> $v is rw { $v += 10 };

It is not possible directly to do in-place editing of hash keys, even in the case of object hashes; however, a key can be deleted and a new key/value pair added to achieve the same results. For example, given this hash:

my %h = a => 1, b => 2;
for %h.keys.sort -> $k {
    # use sort to ease output comparisons
    print "$k => {%h{$k}}; ";
}
say ''; # OUTPUT: «a => 1; b => 2; ␤»

replace key 'b' with 'bb' but retain 'b's value as the new key's value:

for %h.keys -> $k {
    if $k eq 'b' {
        %h<bb> = %h{$k}:delete;
    }
}
for %h.keys.sort -> $k {
    print "$k => {%h{$k}}; ";
}
say ''; # OUTPUT: «a => 1; bb => 2; ␤»

See Also

Containers

A low-level explanation of Raku containers

Contexts and contextualizers

What are contexts and how to switch into them

Control flow

Statements used to control the flow of execution

Enumeration

An example using the enum type

Exceptions

Using exceptions in Raku

Functions

Functions and functional programming in Raku

Grammars

Parsing and interpreting text

Input/Output the definitive guide

Correctly use Raku IO

Lists, sequences, and arrays

Positional data constructs

Metaobject protocol (MOP)

Introspection and the Raku object system

Native calling interface

Call into dynamic libraries that follow the C calling convention

Raku native types

Using the types the compiler and hardware make available to you

Newline handling in Raku

How the different newline characters are handled, and how to change the behavior

Numerics

Numeric types available in Raku

Object orientation

Object orientation in Raku

Operators

Common Raku infixes, prefixes, postfixes, and more!

Packages

Organizing and referencing namespaced program elements

Performance

Measuring and improving runtime or compile-time performance

Phasers

Program execution phases and corresponding phaser blocks

Pragmas

Special modules that define certain aspects of the behavior of the code

Quoting constructs

Writing strings and word lists, in Raku

Regexes

Pattern matching against strings

Sets, bags, and mixes

Unordered collections of unique and weighted objects in Raku

Signature literals

A guide to signatures in Raku

Statement prefixes

Prefixes that alter the behavior of a statement or a set of them

Data structures

How Raku deals with data structures and what we can expect from them

Subscripts

Accessing data structure elements by index or key

Syntax

General rules of Raku syntax

System interaction

Working with the underlying operating system and running applications

Date and time functions

Processing date and time in Raku

Traits

Compile-time specification of behavior made easy

Unicode versus ASCII symbols

Unicode symbols and their ASCII equivalents

Unicode

Unicode support in Raku

Variables

Variables in Raku

Independent routines

Routines not defined within any class or role.

The Camelia image is copyright 2009 by Larry Wall. "Raku" is trademark of the Yet Another Society. All rights reserved.