Text::SubParsers
Raku package for extracting and processing interpretable sub-strings in texts.
Installation
From Zef ecosystem:
zef install Text::SubParsers
From GitHub:
zef install https://github.com/antononcube/Raku-Text-SubParsers.git
Usage examples
Date extractions
Here we extract dates from a text:
use Text::SubParsers;
my $res = "Openheimer's birthday is April 22, 1905 or April 2, 1905, as far as I know.";
Text::SubParsers::Core.new('DateTime').subparse($res).raku;
# $["Openheimer's birthday is ", DateTime.new(1905,4,22,0,0,0), " or ", DateTime.new(1905,4,2,0,0,0), ", as far as I know."]
Compare with the result of the parse method over the same text:
Text::SubParsers::Core.new('DateTime').parse($res);
#ERROR: Cannot interpret the given input with the given spec 'DateTime'.
# (Any)
Here are the results of both subparse and parse on a string that is a valid date specification:
Text::SubParsers::Core.new('DateTime').subparse('April 22, 1905');
# 1905-04-22T00:00:00Z
Text::SubParsers::Core.new('DateTime').parse('April 22, 1905');
# 1905-04-22T00:00:00Z
Sub-parsing with user supplied subs
Instead of using Text::SubParsers::Core.new the functions get-sub-parser and get-parser
can be used.
Here is an example of using:
Invocation of get-sub-parser
(Sub-)parsing with a user supplied function (sub)
sub known-cities(Str $x) {
    $x ∈ ['Seattle', 'Chicago', 'New York', 'Sao Paulo', 'Miami', 'Los Angeles'] ?? $x.uc !! Nil
}
get-sub-parser(&known-cities).subparse("
1. New York City, NY - 8,804,190
2. Los Angeles, CA - 3,976,322
3. Chicago, IL - 2,746,388
4. Houston, TX - 2,304,580
5. Philadelphia, PA - 1,608,162
6. San Antonio, TX - 1,5
")
# [
# 1. NEW YORK City, NY - 8,804,190
# 2. LOS ANGELES , CA - 3,976,322
# 3. CHICAGO , IL - 2,746,388
# 4. Houston, TX - 2,304,580
# 5. Philadelphia, PA - 1,608,162
# 6. San Antonio, TX - 1,5
# ]
Here is the "full form" of the last result:
_.raku
# $["\n1. ", "NEW YORK", " City, NY - 8,804,190\n2. ", "LOS ANGELES", ", CA - 3,976,322\n3. ", "CHICAGO", ", IL - 2,746,388\n4. Houston, TX - 2,304,580\n5. Philadelphia, PA - 1,608,162\n6. San Antonio, TX - 1,5\n"]
Sub-parsing with WhateverCode
With the parser spec WhateverCode an attempt is made to extract dates, JSON expressions, numbers, and Booleans (in that order).
Here is an example:
get-sub-parser(WhateverCode).subparse('
Is it true that the JSON expression {"date": "2023-03-08", "rationalNumber": "11/3"} contains the date 2023-03-08 and the rational number 11/3?
').raku
# $["\nIs it", Bool::True, "that the JSON expression", {:date("2023-03-08"), :rationalNumber("11/3")}, "contains the date", DateTime.new(2023,3,8,0,0,0), "and the rational number", <11/3>, "?\n"]
Processing LLM outputs
A primary motivation for creating this package is post-processing the outputs of Large Language Models (LLMs), [AA1, AAp1, AAp2, AAp3].
Here is an example of creating an LLM-function and invoking it over a string:
use LLM::Functions;
my &fs = llm-function(
    {"What is the average speed of $_ ?"},
    llm-evaluator => llm-configuration(
        'PaLM',
        prompts => 'You are knowledgeable engineer and you give concise, numeric answers.'));
say &fs('car in USA highway');
# 79.5 mph
Here is the corresponding interpretation using sub-parsers:
get-sub-parser('Numeric').subparse(_.trim).raku;
# $[79.5, "mph"]
Here is a more involved example in which:
An LLM is asked to produce a certain set of events in JSON format
The JSON fragment of the result is parsed
The obtained list of hashes is transformed into a Mermaid-JS timeline diagram
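The last of these steps can be tried without an LLM call. The following is a minimal sketch, assuming a hard-coded list of hashes in place of the parsed LLM result; the variable names @records and $mermaid are hypothetical, and the two records are copied from the sample output of this example. It only illustrates how the Mermaid timeline source is assembled as a header line, a title line, and one tab-indented "date : event" line per record.

```raku
# Hypothetical stand-in for the parsed LLM result: two records copied
# from the sample output of this example, so no LLM call is needed.
my @records =
    { date => '1914-07-28', event => 'Austria-Hungary declares war on Serbia' },
    { date => '1916-07-01', event => 'Battle of the Somme' };

# Mermaid timeline source: the 'timeline' keyword, a title line,
# then one "<date> : <event>" line per record.
my @timeline = 'timeline', 'title WW1 events';
for @records -> $record {
    @timeline.push: "{$record<date>} : {$record<event>}";
}

# Every line after the first is indented with a tab.
my $mermaid = @timeline.join("\n\t");
say $mermaid;
```

The full pipeline, in which the records come from the LLM and the JSON sub-parser, follows.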
my &ft = llm-function(
    {"What are the $^a most significant events of $^b? Give the answer with date-event pairs in JSON format."},
    form => get-sub-parser('JSON'),
    llm-evaluator => llm-configuration('PaLM', max-tokens => 500));
my @ftRes = |&ft(9, 'WWI');
@ftRes = @ftRes.grep({ $_ !~~ Str });
# [{date => 1914-07-28, event => Austria-Hungary declares war on Serbia} {date => 1914-07-29, event => Germany declares war on Russia} {date => 1914-07-30, event => France declares war on Germany} {date => 1914-08-01, event => Great Britain declares war on Germany} {date => 1914-08-04, event => Japan declares war on Germany} {date => 1914-11-09, event => First Battle of Ypres} {date => 1915-05-07, event => Second Battle of Ypres} {date => 1916-07-01, event => Battle of the Somme} {date => 1917-03-08, event => United States declares war on Germany}]
my @timeline = ['timeline', 'title WW1 events'];
for @ftRes -> $record {
    @timeline.append( "{$record<date>} : {$record<event>}");
}
@timeline.join("\n\t")
References
Articles
[AA1] Anton Antonov, "LLM::Functions", (2023), RakuForPrediction at WordPress.
Packages
[AAp1] Anton Antonov, LLM::Functions Raku package, (2023), GitHub/antononcube.
[AAp2] Anton Antonov, WWW::OpenAI Raku package, (2023), GitHub/antononcube.
[AAp3] Anton Antonov, WWW::PaLM Raku package, (2023), GitHub/antononcube.