Unicode versus ASCII symbols

Unicode symbols and their ASCII equivalents

The following Unicode symbols can be used in Raku without needing to load any additional modules. Some of them have equivalents which can be typed with ASCII-only characters.

Reference is made below to various properties of Unicode codepoints. The definitive list can be found here: https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt.

Alphabetic characters

Any codepoint that has the Ll (Letter, lowercase), Lu (Letter, uppercase), Lt (Letter, titlecase), Lm (Letter, modifier), or the Lo (Letter, other) property can be used just like any other alphabetic character from the ASCII range.

my $Δ = 1;
$Δ++;
say $Δ;
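As a rough illustration, the sketch below asks Raku for the general category of a few letters with `.uniprop` and then uses non-ASCII letters in identifiers. The sample characters and the names `$été` and `größer` are arbitrary choices made for this example, not part of the original page.

```raku
# .uniprop with no argument reports the General_Category of the first codepoint
for 'Δ', 'é', 'ǅ', 'ʰ', 'あ' -> $char {
    say "$char: {$char.uniprop}";   # Lu, Ll, Lt, Lm and Lo respectively
}

# Letters from any of these categories work in identifiers
my $été = 'summer';
sub größer($x, $y) { $x max $y }

say $été;           # summer
say größer(2, 3);   # 3
```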

Numeric characters

Any codepoint that has the Nd (Number, decimal digit) property can be used as a digit in any number. For example:

my $var = １９; # U+FF11 U+FF19
say $var + 2;   # OUTPUT: «21␤»
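The same idea can be checked with digits from another script. This is a small sketch; the Devanagari digits are simply an example chosen here, and any other Nd codepoints should behave the same way.

```raku
say '１'.uniprop;   # Nd (FULLWIDTH DIGIT ONE)
say '９'.unival;    # 9
say １９ == 19;     # True
say ४२ + 1;         # 43 (DEVANAGARI DIGIT FOUR followed by DEVANAGARI DIGIT TWO)
```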

Numeric values

Any codepoint that has the No (Number, other) or Nl (Number, letter) property can be used standalone as a numeric value, such as ½ and ⅓. (These aren't decimal digit characters, so can't be combined.) For example:

my $var = ⅒ + 2 + Ⅻ; # here ⅒ is No and Rat and Ⅻ is Nl and Int
say $var;              # OUTPUT: «14.1␤»
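To see which value and type a given numeric character carries, `.uniprop` and `.unival` can be combined, as in this short sketch. The characters inspected are just samples picked for the example.

```raku
for '½', '⅒', 'Ⅻ' -> $char {
    # general category, numeric value, and the type of that value
    say "$char {$char.uniprop} {$char.unival} {$char.unival.^name}";
}
# ½ No 0.5 Rat
# ⅒ No 0.1 Rat
# Ⅻ Nl 12 Int

say ½ + ½;   # 1
```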

Whitespace characters

Besides spaces and tabs, you can use any other Unicode whitespace character that has the Zs (Separator, space), Zl (Separator, line), or Zp (Separator, paragraph) property.

See Wikipedia's Whitespace section for detailed tables of the Unicode codepoints with (or associated with) whitespace characteristics. This is an important section for Raku authors of digital typography modules for print or web use.

Other acceptable single codepoints

This list contains the single codepoints (and their ASCII equivalents, where they exist) that have a special meaning in Raku.

Symbol | Codepoint | ASCII | Remarks
« | U+00AB | << | as part of «» or .« or regex left word boundary
» | U+00BB | >> | as part of «» or .» or regex right word boundary
× | U+00D7 | *
÷ | U+00F7 | /
≤ | U+2264 | <=
≥ | U+2265 | >=
≠ | U+2260 | !=
− | U+2212 | -
∘ | U+2218 | o
≅ | U+2245 | =~=
π | U+03C0 | pi | 3.14159_26535_89793_238e0
τ | U+03C4 | tau | 6.28318_53071_79586_476e0
𝑒 | U+1D452 | e | 2.71828_18284_59045_235e0
∞ | U+221E | Inf
… | U+2026 | ...
‘ | U+2018 | ' | as part of ‘’ or ’‘
’ | U+2019 | ' | as part of ‘’ or ‚’ or ’‘
‚ | U+201A | ' | as part of ‚‘ or ‚’
“ | U+201C | " | as part of “” or ”“
” | U+201D | " | as part of “” or ”“ or ””
„ | U+201E | " | as part of „“ or „”
｢ | U+FF62 | Q// | as part of ｢｣ (Note: Q// variant cannot be used bare in regexes)
｣ | U+FF63 | Q// | as part of ｢｣ (Note: Q// variant cannot be used bare in regexes)
⁺ | U+207A | + | (must use explicit number) as part of exponentiation
⁻ | U+207B | - | (must use explicit number) as part of exponentiation
¯ | U+00AF | - | (must use explicit number) as part of exponentiation (macron is an alternative way of writing a minus)
⁰ | U+2070 | **0 | can be combined with ⁰..⁹
¹ | U+00B9 | **1 | can be combined with ⁰..⁹
² | U+00B2 | **2 | can be combined with ⁰..⁹
³ | U+00B3 | **3 | can be combined with ⁰..⁹
⁴ | U+2074 | **4 | can be combined with ⁰..⁹
⁵ | U+2075 | **5 | can be combined with ⁰..⁹
⁶ | U+2076 | **6 | can be combined with ⁰..⁹
⁷ | U+2077 | **7 | can be combined with ⁰..⁹
⁸ | U+2078 | **8 | can be combined with ⁰..⁹
⁹ | U+2079 | **9 | can be combined with ⁰..⁹
∅ | U+2205 | set() | (empty set)
∈ | U+2208 | (elem)
∉ | U+2209 | !(elem)
∋ | U+220B | (cont)
∌ | U+220C | !(cont)
≡ | U+2261 | (==)
≢ | U+2262 | !(==)
⊆ | U+2286 | (<=)
⊈ | U+2288 | !(<=)
⊂ | U+2282 | (<)
⊄ | U+2284 | !(<)
⊇ | U+2287 | (>=)
⊉ | U+2289 | !(>=)
⊃ | U+2283 | (>)
⊅ | U+2285 | !(>)
∪ | U+222A | (|)
∩ | U+2229 | (&)
∖ | U+2216 | (-)
⊖ | U+2296 | (^)
⊍ | U+228D | (.)
⊎ | U+228E | (+)
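A few rows of the table can be exercised side by side, as in the sketch below, which pairs each Unicode form with its ASCII spelling. The operands are arbitrary values chosen for illustration.

```raku
say 6 × 7 == 6 * 7;             # True
say 6 ÷ 3 == 6 / 3;             # True
say 5 − 3 == 5 - 3;             # True (U+2212 versus ASCII hyphen-minus)
say (3 ≤ 4) == (3 <= 4);        # True
say π == pi;                    # True
say 2³ == 2 ** 3;               # True
say 2⁻¹ == 2 ** -1;             # True (superscript sign followed by a superscript digit)
say 2 ∈ (1, 2, 3);              # True, same as 2 (elem) (1, 2, 3)
say (1, 2) ⊆ (1, 2, 3);         # True, same as (1, 2) (<=) (1, 2, 3)
say ((1, 2) ∪ (2, 3)).elems;    # 3, same as ((1, 2) (|) (2, 3)).elems
say (1, 2, 3) »+» 1;            # (2 3 4), same as (1, 2, 3) >>+>> 1
say ‘curly’ eq 'curly';         # True
say ｢a\n｣.chars;                # 3, because the backslash is not interpreted
```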

Atomic operators

The atomic operators have U+269B ⚛ ATOM SYMBOL incorporated into them. Their ASCII equivalents are ordinary subroutines, not operators:

my atomicint $x = 42;
$x⚛++;                # Unicode version
atomic-fetch-inc($x); # ASCII version

The ASCII alternatives are as follows:

Symbol | ASCII | Remarks
⚛= | atomic-assign
⚛ | atomic-fetch | this is the prefix:<⚛> operator
⚛+= | atomic-add-fetch
⚛-= | atomic-sub-fetch
⚛−= | atomic-sub-fetch | this operator uses U+2212 minus sign
++⚛ | atomic-inc-fetch
⚛++ | atomic-fetch-inc
--⚛ | atomic-dec-fetch
⚛-- | atomic-fetch-dec
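The following sketch shows both spellings doing the same job under concurrency: four tasks each increment two shared counters a thousand times, one through the operator and one through the subroutine. The variable names and the task and iteration counts are made up for this example.

```raku
my atomicint $unicode = 0;
my atomicint $ascii   = 0;

# Four concurrent tasks, each performing 1000 atomic increments per counter
await (^4).map: {
    start {
        for ^1000 {
            $unicode⚛++;                # Unicode operator
            atomic-fetch-inc($ascii);   # ASCII subroutine
        }
    }
};

say ⚛$unicode;              # 4000
say atomic-fetch($ascii);   # 4000
```

Because every increment is atomic, no updates are lost, so both counters end at the full 4 × 1000.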

Multiple codepoints

This list contains multiple-codepoint operators that require special composition for their ASCII equivalents. Note the codepoints are shown space-separated but should be entered as adjacent codepoints when used.

Symbol | Codepoints | ASCII | Since | Remarks
»=» | U+00BB = U+00BB | >>[=]>> | v6.c | uses ASCII '='
«=« | U+00AB = U+00AB | <<[=]<< | v6.c | uses ASCII '='
«=» | U+00AB = U+00BB | <<[=]>> | v6.c | uses ASCII '='
»=« | U+00BB = U+00AB | >>[=]<< | v6.c | uses ASCII '='
