Sets, bags, and mixes

Unordered collections of unique and weighted objects in Raku

Introduction

The six collection classes are Set, SetHash, Bag, BagHash, Mix and MixHash. They all share similar semantics.

In a nutshell, these classes hold unordered collections of objects, much like an object hash. All of them implement the QuantHash role, so they are also referred to as QuantHashes.

Set and SetHash also implement the Setty role, Bag and BagHash implement the Baggy role, Mix and MixHash implement the Mixy role (which itself implements the Baggy role).

Sets only record whether an object is present in the collection, bags can hold several objects of the same kind, and mixes also allow fractional (and negative) weights. The regular versions are immutable; the -Hash versions are mutable.

Let's elaborate on that. If you want to collect objects in a container but you do not care about the order of these objects, Raku provides these unordered collection types. Being unordered, these containers can be more efficient than Lists or Arrays for looking up elements or dealing with repeated items.

On the other hand, if you want to get the contained objects (elements) without duplicates and you only care whether an element is in the collection or not, you can use a Set or SetHash.

If you want to get rid of duplicates but still preserve order, take a look at the unique routine for List.
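As a small illustration of that difference, the sketch below contrasts unique with a Set. The word list is made up for the example.

```raku
my @words = <pear apple pear fig apple>;

# unique drops repeats but keeps the order of first appearance.
say @words.unique;          # (pear apple fig)

# A Set also drops repeats, but its iteration order is unspecified,
# so sort the keys whenever a stable order is needed.
say @words.Set.keys.sort;   # (apple fig pear)
```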

If you want to keep track of the number of times each object appeared, you can use a Bag or BagHash. In these Baggy containers each element has a weight (an unsigned integer) indicating the number of times the same object has been included in the collection.

The types Mix and MixHash are similar to Bag and BagHash, but they also allow fractional and negative weights.

Set, Bag, and Mix are immutable types. Use the mutable variants SetHash, BagHash, and MixHash if you want to add or remove elements after the container has been constructed.
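The following sketch shows one way to build each kind of container and to change the mutable variants afterwards. All variable names and sample data are invented for illustration.

```raku
# Set: each element is either present or absent; repeats collapse.
my $fruits = set <apple banana apple cherry>;
say $fruits.elems;          # 3

# Bag: each element carries an integer count.
my $votes = bag <yes no yes yes>;
say $votes<yes>;            # 3
say $votes.total;           # 4

# Mix: weights may be fractional or negative.
# The .Mix coercer reads each Pair as element => weight.
my $recipe = (butter => 0.25, sugar => 0.1, salt => -1).Mix;
say $recipe<butter>;        # 0.25
say $recipe<salt>;          # -1

# The -Hash variants can be changed after construction.
my $basket = SetHash.new(<apple banana>);
$basket<cherry> = True;     # add an element
$basket<apple>  = False;    # remove an element
say $basket.keys.sort;      # (banana cherry)

my $tally = BagHash.new(<a b a>);
$tally<a>++;                # a now has weight 3
$tally<b>--;                # weight drops to 0, so b is removed
say $tally<a>;              # 3
say $tally<b>:exists;       # False
```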

All six types share two behaviors. First, identical objects refer to the same element, where identity is determined using the WHICH method (that is, the same way the === operator checks identity). For value types like Str, this means having the same value; for reference types like Array, it means referring to the same object instance.
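A short sketch of how identity affects membership. The Point class is hypothetical and stands in for any reference type.

```raku
# Value types: equal values are the same element.
say set('a', 'a', 'b').elems;      # 2

# An Int and a Str are different values, even if they look alike.
say set(1, '1').elems;             # 2

# Reference types: two separate instances are two elements,
# even when their attributes are equal.
class Point { has $.x }            # hypothetical class, for illustration only
my $p = Point.new(x => 1);
my $q = Point.new(x => 1);
say $p === $q;                     # False
say set($p, $q).elems;             # 2
say set($p, $p).elems;             # 1
```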

Secondly, they provide a Hash-like interface where the actual elements of the collection (which can be objects of any type) are the 'keys', and the associated weights are the 'values':

type of $a       value of $a{$b} if $b is an element    value of $a{$b} if $b is not an element
Set / SetHash    True                                   False
Bag / BagHash    a positive integer                     0
Mix / MixHash    a non-zero real number                 0
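The table can be checked with a few lookups. This is a minimal sketch with invented sample data.

```raku
my $s = set <a b>;
my $b = bag <a a b>;
my $m = (a => 0.5, b => -1).Mix;

# Lookups for elements that are present
say $s<a>;   # True
say $b<a>;   # 2
say $m<a>;   # 0.5

# Lookups for elements that are absent
say $s<z>;   # False
say $b<z>;   # 0
say $m<z>;   # 0

# As with a Hash, .kv yields keys (elements) and values (weights).
# The iteration order is not specified.
for $b.kv -> $element, $weight {
    say "$element has weight $weight";
}
```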

Operators with set semantics

There are several infix operators devoted to performing common operations using QuantHash semantics. Since that is a mouthful, these operators are usually referred to as "set operators".

This does not mean that the parameters of these operators must always be Set, or even a more generic QuantHash. It just means that the logic that is applied to the operators is the logic of Set Theory.

These infixes can be written using the Unicode character that represents the function (like ∈ or ∪), or with an equivalent ASCII version (like (elem) or (|)).

Explicitly using Set (or Bag or Mix) objects with these infixes is therefore unnecessary. All set operators work with all possible arguments, including (since 6.d) those that are not explicitly set-like. If necessary, a coercion takes place internally, but in many cases it is not needed.

However, if a Bag or Mix is one of the parameters to these set operators, then the semantics will be upgraded to that type (where Mix supersedes Bag if both types happen to be used).
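A minimal sketch of these rules, using the union operator and made-up data. The same pattern should apply to the other operators that return a QuantHash.

```raku
# Two plain lists: set semantics apply and the result is a Set.
my $u = <a b> (|) <b c>;
say $u.^name;                  # Set

# A Bag on either side upgrades the operation to bag semantics.
my $v = bag(<a a b>) (|) <b c>;
say $v.^name;                  # Bag
say $v<a>;                     # 2

# If a Mix is involved, mix semantics take over.
my $w = bag(<a a b>) (|) (a => 0.5, c => 1.5).Mix;
say $w.^name;                  # Mix
```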

Set operators that return Bool

infix (elem), infix ∈

Returns True if $a is an element of $b, else False. More information, Wikipedia definition.

infix ∉

Returns True if $a is not an element of $b, else False. More information, Wikipedia definition.

infix (cont), infix ∋

Returns True if $a contains $b as an element, else False. More information, Wikipedia definition.

infix ∌

Returns True if $a does not contain $b, else False. More information, Wikipedia definition.
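The membership operators above can be tried with a short sketch. The $primes set is sample data.

```raku
my $primes = set 2, 3, 5, 7;

say 3 (elem) $primes;     # True
say 4 ∈ $primes;          # False
say 4 ∉ $primes;          # True

say $primes (cont) 5;     # True
say $primes ∌ 5;          # False

# The right-hand side does not have to be a Set.
say 'b' (elem) <a b c>;   # True
```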

infix (<=), infix ⊆

Returns True if $a is a subset of or equal to $b, else False. More information, Wikipedia definition.

infix ⊈

Returns True if $a is neither a subset of nor equal to $b, else False. More information, Wikipedia definition.

infix (<), infix ⊂

Returns True if $a is a strict subset of $b, else False. More information, Wikipedia definition.

infix ⊄

Returns True if $a is not a strict subset of $b, else False. More information, Wikipedia definition.

infix (>=), infix ⊇

Returns True if $a is a superset of or equal to $b, else False. More information (/language/operators#infix_(>=),_infix_⊇), Wikipedia definition.

infix ⊉

Returns True if $a is neither a superset of nor equal to $b, else False. More information, Wikipedia definition.

infix (>), infix ⊃

Returns True if $a is a strict superset of $b, else False. More information (/language/operators#infix_(>),_infix_⊃), Wikipedia definition.

infix ⊅

Returns True if $a is not a strict superset of $b, else False. More information, Wikipedia definition.
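The difference between the plain and the strict comparisons shows up when a set is compared with itself. A small sketch with sample sets:

```raku
my $small = set 1, 2;
my $big   = set 1, 2, 3;

say $small (<=) $big;     # True
say $small (<)  $big;     # True
say $big   (<=) $big;     # True:  a set is a subset of itself
say $big   (<)  $big;     # False: but not a strict subset of itself

say $big   (>=) $small;   # True
say $big   (>)  $small;   # True

say $small ⊈ $big;        # False
say $big   ⊅ $big;        # True
```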

infix (==), infix ≡

Returns True if $a and $b are identical, else False. More information, Wikipedia definition.

Available as of the 2020.07 Rakudo compiler release. Users of older versions of Rakudo can install the Set::Equality module for the same functionality.

infix ≢

Returns True if $a and $b are not identical, else False. More information, Wikipedia definition.

Available as of the 2020.07 Rakudo compiler release. Users of older versions of Rakudo can install the Set::Equality module for the same functionality.
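A brief sketch of the equality operators, assuming Rakudo 2020.07 or later (or the Set::Equality module on older releases). Because set semantics apply, the order of the list items does not matter.

```raku
say (1, 2, 3) (==) (3, 1, 2);   # True
say (1, 2, 3) (==) (1, 2);      # False

say (1, 2, 3) ≡ (3, 1, 2);      # True
say (1, 2, 3) ≢ (1, 2);         # True
```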

Set operators that return a QuantHash

infix (|), infix ∪

Returns the union of all its arguments. More information, Wikipedia definition.

infix (&), infix ∩

Returns the intersection of all of its arguments. More information, Wikipedia definition.

infix (-), infix ∖

Returns the set difference of all its arguments. More information, Wikipedia definition.

infix (^), infix ⊖

Returns the symmetric set difference of all its arguments. More information, Wikipedia definition.
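The four operators above can be compared side by side. In this sketch the keys are sorted before printing, because the iteration order of a Set is not specified; the variable names are illustrative.

```raku
my $left  = set <a b c>;
my $right = set <b c d>;

say ($left (|) $right).keys.sort;   # (a b c d)
say ($left (&) $right).keys.sort;   # (b c)
say ($left (-) $right).keys.sort;   # (a)
say ($left (^) $right).keys.sort;   # (a d)

# The Unicode forms are equivalent.
say ($left ∪ $right).keys.sort;     # (a b c d)
say ($left ∩ $right).keys.sort;     # (b c)
```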

Set operators that return a Baggy

infix (.), infix ⊍

Returns the Baggy multiplication of its arguments. More information.

infix (+), infix ⊎

Returns the Baggy addition of its arguments. More information.
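A small sketch of both Baggy operators: addition sums the weights of each element, and multiplication multiplies them. The bags are sample data.

```raku
my $x = bag <a a b>;       # a: 2, b: 1
my $y = bag <a b b b>;     # a: 1, b: 3

my $sum = $x (+) $y;
say $sum<a>;               # 3
say $sum<b>;               # 4

my $product = $x (.) $y;
say $product<a>;           # 2
say $product<b>;           # 3

# Plain lists work too; the result is still a Bag.
say (<a a> (+) <a b>).^name;   # Bag
```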

term ∅

The empty set. More information, Wikipedia definition.
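A few checks that show how the empty set behaves with the operators described earlier:

```raku
say ∅.elems;                       # 0
say ∅ ~~ Set;                      # True
say 1 ∈ ∅;                         # False
say ∅ ⊆ set(1, 2);                 # True: the empty set is a subset of every set
say (set(1, 2) ∩ set(3)).elems;    # 0: disjoint sets have an empty intersection
```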
