Unicode versus ASCII symbols
The following Unicode symbols can be used in Raku without needing to load any additional modules. Some of them have equivalents which can be typed with ASCII-only characters.
The sections below refer to various properties of Unicode codepoints. The definitive list can be found here: https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt.
Alphabetic characters
Any codepoint that has the Ll (Letter, lowercase), Lu (Letter, uppercase), Lt (Letter, titlecase), Lm (Letter, modifier), or Lo (Letter, other) property can be used just like any other alphabetic character from the ASCII range.
my $Δ = 1;
$Δ++;
say $Δ;
Numeric characters
Any codepoint that has the Nd (Number, decimal digit) property can be used as a digit in any number. For example:
my $var = １９; # U+FF11 U+FF19
say $var + 2; # OUTPUT: «21␤»
Numeric values
Any codepoint that has the No (Number, other) or Nl (Number, letter) property can be used standalone as a numeric value, such as ½ and ⅓. (These aren't decimal digit characters, so they can't be combined.) For example:
my $var = ⅒ + 2 + Ⅻ; # here ⅒ is No and Rat and Ⅻ is Nl and Int
say $var; # OUTPUT: «14.1␤»
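The property names used in the sections above can be checked from within Raku itself. The following is a minimal sketch: it calls `uniprop` with no argument to get a character's general category, and `unival` to get its numeric value. The sample characters are chosen only for illustration.

```raku
# General category of a letter, a decimal digit, and two numeric-value characters
say 'Δ'.uniprop;   # OUTPUT: «Lu␤»
say '１'.uniprop;   # OUTPUT: «Nd␤»
say '½'.uniprop;   # OUTPUT: «No␤»
say 'Ⅻ'.uniprop;   # OUTPUT: «Nl␤»

# Numeric value carried by the No and Nl characters
say '½'.unival;    # OUTPUT: «0.5␤»
say 'Ⅻ'.unival;    # OUTPUT: «12␤»
```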
Whitespace characters
Besides spaces and tabs, you can use any other Unicode whitespace character that has the Zs (Separator, space), Zl (Separator, line), or Zp (Separator, paragraph) property.
See Wikipedia's Whitespace section for detailed tables of the Unicode codepoints with (or associated with) whitespace characteristics. This is an important section for Raku authors of digital typography modules for print or web use.
Other acceptable single codepoints
This list contains the single codepoints [and their ASCII equivalents] that have a special meaning in Raku.
Symbol | Codepoint | ASCII | Remarks |
---|---|---|---|
« | U+00AB | << | as part of «» or .« or regex left word boundary |
» | U+00BB | >> | as part of «» or .» or regex right word boundary |
× | U+00D7 | * | |
÷ | U+00F7 | / | |
≤ | U+2264 | <= | |
≥ | U+2265 | >= | |
≠ | U+2260 | != | |
− | U+2212 | - | |
∘ | U+2218 | o | |
≅ | U+2245 | =~= | |
π | U+03C0 | pi | 3.14159_26535_89793_238e0 |
τ | U+03C4 | tau | 6.28318_53071_79586_476e0 |
𝑒 | U+1D452 | e | 2.71828_18284_59045_235e0 |
∞ | U+221E | Inf | |
… | U+2026 | ... | |
‘ | U+2018 | ' | as part of ‘’ or ’‘ |
’ | U+2019 | ' | as part of ‘’ or ‚’ or ’‘ |
‚ | U+201A | ' | as part of ‚‘ or ‚’ |
“ | U+201C | " | as part of “” or ”“ |
” | U+201D | " | as part of “” or ”“ or ”” |
„ | U+201E | " | as part of „“ or „” |
｢ | U+FF62 | Q// | as part of ｢｣ (Note: Q// variant cannot be used bare in regexes) |
｣ | U+FF63 | Q// | as part of ｢｣ (Note: Q// variant cannot be used bare in regexes) |
⁺ | U+207A | + | (must use explicit number) as part of exponentiation |
⁻ | U+207B | - | (must use explicit number) as part of exponentiation |
¯ | U+00AF | - | (must use explicit number) as part of exponentiation (macron is an alternative way of writing a minus) |
⁰ | U+2070 | **0 | can be combined with ⁰..⁹ |
¹ | U+00B9 | **1 | can be combined with ⁰..⁹ |
² | U+00B2 | **2 | can be combined with ⁰..⁹ |
³ | U+00B3 | **3 | can be combined with ⁰..⁹ |
⁴ | U+2074 | **4 | can be combined with ⁰..⁹ |
⁵ | U+2075 | **5 | can be combined with ⁰..⁹ |
⁶ | U+2076 | **6 | can be combined with ⁰..⁹ |
⁷ | U+2077 | **7 | can be combined with ⁰..⁹ |
⁸ | U+2078 | **8 | can be combined with ⁰..⁹ |
⁹ | U+2079 | **9 | can be combined with ⁰..⁹ |
∅ | U+2205 | set() | (empty set) |
∈ | U+2208 | (elem) | |
∉ | U+2209 | !(elem) | |
∋ | U+220B | (cont) | |
∌ | U+220C | !(cont) | |
≡ | U+2261 | (==) | |
≢ | U+2262 | !(==) | |
⊆ | U+2286 | (<=) | |
⊈ | U+2288 | !(<=) | |
⊂ | U+2282 | (<) | |
⊄ | U+2284 | !(<) | |
⊇ | U+2287 | (>=) | |
⊉ | U+2289 | !(>=) | |
⊃ | U+2283 | (>) | |
⊅ | U+2285 | !(>) | |
∪ | U+222A | (|) | |
∩ | U+2229 | (&) | |
∖ | U+2216 | (-) | |
⊖ | U+2296 | (^) | |
⊍ | U+228D | (.) | |
⊎ | U+228E | (+) | |
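As a brief illustration of the table above, the following sketch evaluates a few expressions with the Unicode symbol and with its ASCII spelling, and checks that both give the same result. The variable names are made up for this example; only symbols listed in the table are used.

```raku
# Arithmetic: × (U+00D7) versus *
my $unicode-product = 6 × 7;
my $ascii-product   = 6 * 7;
say $unicode-product == $ascii-product;   # OUTPUT: «True␤»

# Comparison: ≤ (U+2264) versus <=
say (3 ≤ 4) == (3 <= 4);                  # OUTPUT: «True␤»

# Constants: π (U+03C0) versus pi
say π == pi;                              # OUTPUT: «True␤»

# Superscript exponent: ³ (U+00B3) versus **3
my $unicode-power = 2³;
my $ascii-power   = 2 ** 3;
say $unicode-power;                       # OUTPUT: «8␤»

# Set membership: ∈ (U+2208) versus (elem)
my $unicode-member = 2 ∈ (1, 2, 3);
my $ascii-member   = 2 (elem) (1, 2, 3);
say $unicode-member;                      # OUTPUT: «True␤»
```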
Atomic operators
The atomic operators have U+269B ⚛ ATOM SYMBOL incorporated into them. Their ASCII equivalents are ordinary subroutines, not operators:
my atomicint $x = 42;
$x⚛++; # Unicode version
atomic-fetch-inc($x); # ASCII version
The ASCII alternatives are as follows:
Symbol | ASCII | Remarks |
---|---|---|
⚛= | atomic-assign | |
⚛ | atomic-fetch | this is the prefix:<⚛> operator |
⚛+= | atomic-add-fetch | |
⚛-= | atomic-sub-fetch | |
⚛−= | atomic-sub-fetch | this operator uses U+2212 minus sign |
++⚛ | atomic-inc-fetch | |
⚛++ | atomic-fetch-inc | |
--⚛ | atomic-dec-fetch | |
⚛-- | atomic-fetch-dec | |
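The pairing in the table can be sketched with a single counter that is updated alternately through an operator and through the matching subroutine. This assumes a Rakudo release that supports `atomicint`; the variable name `$counter` is only illustrative.

```raku
my atomicint $counter = 0;

$counter ⚛+= 5;                  # Unicode operator: counter is now 5
atomic-add-fetch($counter, 5);   # ASCII subroutine: counter is now 10

$counter⚛++;                     # Unicode operator: counter is now 11
atomic-fetch-inc($counter);      # ASCII subroutine: counter is now 12

say ⚛$counter;                   # OUTPUT: «12␤»
say atomic-fetch($counter);      # OUTPUT: «12␤»
```

Run sequentially like this, the atomic forms behave the same as plain `+=` and `++`; they matter when several threads update the same variable.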
Multiple codepoints
This list contains multiple-codepoint operators that require special composition for their ASCII equivalents. Note the codepoints are shown space-separated but should be entered as adjacent codepoints when used.
Symbol | Codepoints | ASCII | Since | Remarks |
---|---|---|---|---|
»=» | U+00BB = U+00BB | >>[=]>> | v6.c | uses ASCII '=' |
«=« | U+00AB = U+00AB | <<[=]<< | v6.c | uses ASCII '=' |
«=» | U+00AB = U+00BB | <<[=]>> | v6.c | uses ASCII '=' |
»=« | U+00BB = U+00AB | >>[=]<< | v6.c | uses ASCII '=' |