class Grammar
class Grammar is Match {}
Every type declared with grammar, and not explicitly stating its superclass, becomes a subclass of Grammar.
grammar Identifier {
token TOP { <initial> <rest>* }
token initial { <+myletter +[_]> }
token rest { <+myletter +mynumber +[_]> }
token myletter { <[A..Za..z]> }
token mynumber { <[0..9]> }
}
say Identifier.isa(Grammar); # OUTPUT: «True»
my $match = Identifier.parse('W4anD0eR96');
say ~$match; # OUTPUT: «W4anD0eR96»
More documentation on grammars is available.
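A grammar that does state a superclass inherits from that class instead, and it is still a Grammar when the superclass is itself a grammar. The following is a minimal sketch of that case; the names Word and UpperWord are hypothetical and chosen only for illustration.

```raku
# Hypothetical base grammar: no superclass stated, so it inherits from Grammar.
grammar Word {
    token TOP    { <letter>+ }
    token letter { <[A..Za..z]> }
}

# Explicit superclass: inherits TOP from Word and overrides letter.
grammar UpperWord is Word {
    token letter { <[A..Z]> }
}

say UpperWord.isa(Word);              # OUTPUT: «True»
say UpperWord.isa(Grammar);           # OUTPUT: «True»
say UpperWord.parse('HELLO').Str;     # OUTPUT: «HELLO»
say UpperWord.parse('Hello').defined; # OUTPUT: «False»
```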
Methods
method parse
method parse($target, :$rule = 'TOP', Capture() :$args = \(), Mu :$actions = Mu, *%opt)
Parses the $target, which will be coerced to Str if it isn't one, using $rule as the starting rule. Additional $args will be passed to the starting rule if provided.
grammar RepeatChar {
token start($character) { $character+ }
}
say RepeatChar.parse('aaaaaa', :rule('start'), :args(\('a')));
say RepeatChar.parse('bbbbbb', :rule('start'), :args(\('b')));
# OUTPUT:
# 「aaaaaa」
# 「bbbbbb」
If the actions named argument is provided, it will be used as an actions object: for each successful regex match, a method of the same name, if it exists, is called on the actions object, with the match object passed as the sole positional argument.
my $actions = class { method TOP($/) { say "7" } };
grammar { token TOP { a { say "42" } b } }.parse('ab', :$actions);
# OUTPUT: «427»
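Actions methods are often used to build a result while parsing, rather than just to print. The sketch below assumes the standard make routine and the made method of Match for attaching and retrieving a value; the Sum grammar and SumActions class are hypothetical names used only for illustration.

```raku
grammar Sum {
    token TOP    { <number> '+' <number> }
    token number { \d+ }
}

class SumActions {
    # Called after each successful match of token number.
    method number($/) { make +$/ }
    # Called after TOP succeeds; the two <number> captures form a list.
    method TOP($/)    { make $<number>[0].made + $<number>[1].made }
}

say Sum.parse('3+4', :actions(SumActions)).made; # OUTPUT: «7»
```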
Additional named arguments are used as options for matching; for example, you can specify :pos(4) to start parsing from the fifth character (:pos is zero-based). All matching adverbs are allowed, but not all of them take effect. A regex can have several types of adverbs, some of which apply at compile time, like :s and :i. You cannot pass those to .parse, because the regexes have already been compiled. You can, however, pass adverbs that affect runtime behavior, such as :pos and :continue.
say RepeatChar.parse('bbbbbb', :rule('start'), :args(\('b')), :pos(4)).Str;
# OUTPUT: «bb»
Method parse only succeeds if the cursor has arrived at the end of the target string when the match is over. Use method subparse if you want to be able to stop in the middle.
The top regex in the grammar will be allowed to backtrack.
Returns a Match on success, and Nil on failure.
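Because a failed parse yields Nil, the result can be tested for definedness before use. The following is a small sketch of that pattern using with and else; the Digits grammar is a hypothetical example.

```raku
grammar Digits {
    token TOP { \d+ }
}

for '2024', '20x4' -> $input {
    # with tests definedness: a Match is defined, Nil is not.
    with Digits.parse($input) -> $match {
        say "parsed: $match";
    }
    else {
        say "failed: $input";
    }
}
# OUTPUT:
# parsed: 2024
# failed: 20x4
```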
method subparse
method subparse($target, :$rule = 'TOP', Capture() :$args = \(), Mu :$actions = Mu, *%opt)
Does exactly the same as method parse, except that the cursor doesn't have to reach the end of the string to succeed. That is, it doesn't have to match the whole string.
Note that unlike method parse, subparse always returns a Match, which will be a failed match (and thus falsy) if the grammar failed to match.
grammar RepeatChar {
token start($character) { $character+ }
}
say RepeatChar.subparse('bbbabb', :rule('start'), :args(\('b')));
say RepeatChar.parse( 'bbbabb', :rule('start'), :args(\('b')));
say RepeatChar.subparse('bbbabb', :rule('start'), :args(\('a')));
say RepeatChar.subparse('bbbabb', :rule('start'), :args(\('a')), :pos(3));
# OUTPUT:
# 「bbb」
# Nil
# #<failed match>
# 「a」
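One use of subparse is to consume a prefix of the input and then continue from where the match stopped. The sketch below reads that position from the to method of the returned Match; the LeadingDigits grammar is a hypothetical example.

```raku
grammar LeadingDigits {
    token TOP { \d+ }
}

my $input = '123abc';
my $match = LeadingDigits.subparse($input);

# A failed match is falsy, so only inspect it when it succeeded.
if $match {
    say $match.Str;               # OUTPUT: «123»
    say $match.to;                # OUTPUT: «3»
    say $input.substr($match.to); # OUTPUT: «abc»
}
```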
method parsefile
method parsefile(Str(Cool) $filename, :$enc, *%opts)
Reads the file $filename, decoding it with the encoding $enc, and parses it. All named arguments are passed on to method parse.
grammar Identifiers {
token TOP { [<identifier><.ws>]+ }
token identifier { <initial> <rest>* }
token initial { <+myletter +[_]> }
token rest { <+myletter +mynumber +[_]> }
token myletter { <[A..Za..z]> }
token mynumber { <[0..9]> }
}
say Identifiers.parsefile('users.txt', :enc('UTF-8'))
.Str.trim.subst(/\n/, ',', :g);
# users.txt :
# TimToady
# lizmat
# jnthn
# moritz
# zoffixznet
# MasterDuke17
# OUTPUT: «TimToady,lizmat,jnthn,moritz,zoffixznet,MasterDuke17»
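The example above depends on an existing users.txt. As a self-contained variant, the sketch below first writes a scratch file and then parses it; the WordLines grammar and the file name parsefile-demo.txt are hypothetical, and the file is placed in the directory given by $*TMPDIR.

```raku
grammar WordLines {
    token TOP { [ \w+ \n ]+ }
}

# Hypothetical scratch file; any writable path would do.
my $path = $*TMPDIR.add('parsefile-demo.txt');
$path.spurt("alpha\nbeta\n", :enc('utf8'));

my $match = WordLines.parsefile($path.Str, :enc('utf8'));
say $match.defined;         # OUTPUT: «True»
say $match.Str.lines.elems; # OUTPUT: «2»

# Remove the scratch file again.
$path.unlink;
```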