Text::SubParsers

Raku package for extracting and processing interpretable substrings in texts.

Installation

From the Zef ecosystem:

zef install Text::SubParsers

From GitHub:

zef install https://github.com/antononcube/Raku-Text-SubParsers.git

Usage examples

Date extractions

Here we extract dates from a text:

use Text::SubParsers;
my $res = "Oppenheimer's birthday is April 22, 1905 or April 2, 1905, as far as I know.";

Text::SubParsers::Core.new('DateTime').subparse($res).raku;

Compare with the result of the parse method over the same text:

Text::SubParsers::Core.new('DateTime').parse($res);

Here are the results of both subparse and parse on a string that is a valid date specification:

Text::SubParsers::Core.new('DateTime').subparse('April 22, 1905');
Text::SubParsers::Core.new('DateTime').parse('April 22, 1905');
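The difference between the two methods can be illustrated without the package. The following is a minimal sketch in core Raku, not the package's implementation: sketch-subparse and sketch-parse are hypothetical helpers, and a plain integer regex stands in for the date grammar. Sub-parsing keeps the surrounding text and interprets only the matching substrings, while parsing requires the whole string to match.

```raku
# Hypothetical helpers for illustration; NOT part of Text::SubParsers.

# Sub-parsing: keep the non-matching text as strings and replace each
# matching substring with its interpreted value.
sub sketch-subparse(Str $text, Regex $rx, &interpret) {
    my @res;
    my $pos = 0;
    for $text.match($rx, :g) -> $m {
        @res.push($text.substr($pos, $m.from - $pos)) if $m.from > $pos;
        @res.push(interpret(~$m));
        $pos = $m.to;
    }
    @res.push($text.substr($pos)) if $pos < $text.chars;
    return @res;
}

# Parsing: the whole string has to match, otherwise the result is Nil.
sub sketch-parse(Str $text, Regex $rx, &interpret) {
    $text ~~ /^ <$rx> $/ ?? interpret($text) !! Nil
}

# Integers play the role of the "interpretable" substrings here.
my @parts = sketch-subparse('Ordered 12 boxes and 3 crates.', /\d+/, { .Int });
say @parts.raku;

my $whole   = sketch-parse('12', /\d+/, { .Int });
my $partial = sketch-parse('Ordered 12 boxes', /\d+/, { .Int });
say $whole.raku;
say $partial.defined;
```

In this sketch, a string that is not entirely interpretable yields an undefined result from sketch-parse, while sketch-subparse still returns the interpreted pieces mixed with the remaining text.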

Sub-parsing with user supplied subs

Instead of using Text::SubParsers::Core.new, the functions get-sub-parser and get-parser can be used.

Here is an example that shows:

  • Invocation of get-sub-parser

  • (Sub-)parsing with a user supplied function (sub)

sub known-cities(Str $x) { 
    $x ∈ ['Seattle', 'Chicago', 'New York', 'Sao Paulo', 'Miami', 'Los Angeles'] ?? $x.uc !! Nil
}

get-sub-parser(&known-cities).subparse("
1. New York City, NY - 8,804,190
2. Los Angeles, CA - 3,976,322
3. Chicago, IL - 2,746,388
4. Houston, TX - 2,304,580
5. Philadelphia, PA - 1,608,162
6. San Antonio, TX - 1,5
")

Here is the "full form" of the last result:

_.raku

Sub-parsing with WhateverCode

With the parser spec WhateverCode, an attempt is made to extract dates, JSON expressions, numbers, and Booleans (in that order). Here is an example:

get-sub-parser(WhateverCode).subparse('
Is it true that the JSON expression {"date": "2023-03-08", "rationalNumber": "11/3"} contains the date 2023-03-08 and the rational number 11/3?
').raku
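The "in that order" behaviour can be pictured as a chain of interpreters in which the first one that succeeds wins. The sketch below is written in core Raku under that assumption and is not the package's code: as-number, as-boolean, and first-interpretation are hypothetical names, and only numbers and Booleans are covered (dates and JSON are left out to keep it short).

```raku
# Hypothetical interpreters for illustration; NOT the package's own.
# Each returns an interpreted value, or Nil when the token does not fit.
sub as-number(Str $s) {
    $s ~~ /^ '-'? \d+ ['.' \d+]? $/ ?? +$s !! Nil
}
sub as-boolean(Str $s) {
    $s.lc eq 'true' ?? True !! $s.lc eq 'false' ?? False !! Nil
}

# Try the interpreters in the given order; the first defined result wins.
sub first-interpretation(Str $token, *@interpreters) {
    for @interpreters -> &f {
        my $val = f($token);
        return $val if $val.defined;
    }
    return $token;   # nothing applied: keep the original string
}

my @vals = ('42', 'true', '3.14', 'maybe').map(
    -> $t { first-interpretation($t, &as-number, &as-boolean) });
say @vals.raku;
```

The order matters when more than one interpreter could accept the same token, since an earlier interpreter takes precedence over a later one.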

Processing LLM outputs

A primary motivation for creating this package is post-processing the outputs of Large Language Models (LLMs), [AA1, AAp1, AAp2, AAp3].

Here is an example of creating an LLM function and invoking it over a string:

use LLM::Functions;

my &fs = llm-function(
        {"What is the average speed of $_ ?"},
        llm-evaluator => llm-configuration(
                'PaLM',
                prompts => 'You are a knowledgeable engineer and you give concise, numeric answers.'));

say &fs('car in USA highway');

Here is the corresponding interpretation using sub-parsers:

get-sub-parser('Numeric').subparse(_.trim).raku;

Here is a more involved example in which:

  1. An LLM is asked to produce a certain set of events in JSON format

  2. The JSON fragment of the result is parsed

  3. The obtained list of hashes is transformed into a Mermaid-JS timeline diagram

my &ft = llm-function(
        {"What are the $^a most significant events of $^b? Give the answer with date-event pairs in JSON format."},
        form => get-sub-parser('JSON'),
        llm-evaluator => llm-configuration('PaLM', max-tokens => 500));

my @ftRes = |&ft(9, 'WWI');
@ftRes = @ftRes.grep({ $_ !~~ Str });
my @timeline = ['timeline', 'title WW1 events'];
for @ftRes -> $record {
    @timeline.append( "{$record<date>} : {$record<event>}");
}
@timeline.join("\n\t")
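The last step can be tried without calling an LLM. The following sketch uses two hand-written records as a stand-in for the parsed LLM output (the records and the variable names are assumptions made for illustration, not results produced by the code above) and builds the same kind of Mermaid-JS timeline text.

```raku
# Hand-written sample records standing in for the parsed LLM output.
my @records =
    %(date => '1914-06-28', event => 'Assassination of Archduke Franz Ferdinand'),
    %(date => '1918-11-11', event => 'Armistice signed');

# Header lines of the timeline spec, followed by one "date : event" line per record.
my @lines = 'timeline', 'title WW1 events';
@lines.append: @records.map(-> %r { "%r<date> : %r<event>" });

# Join with newline plus tab, so every line after "timeline" is indented.
my $diagram = @lines.join("\n\t");
say $diagram;
```

Filtering out plain strings first, as done with grep above, matters because a sub-parser result can mix interpreted values with leftover text, and only the hashes carry date and event keys.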

References

Articles

[AA1] Anton Antonov, "LLM::Functions", (2023), RakuForPrediction at WordPress.

Packages

[AAp1] Anton Antonov, LLM::Functions Raku package, (2023), GitHub/antononcube.

[AAp2] Anton Antonov, WWW::OpenAI Raku package, (2023), GitHub/antononcube.

[AAp3] Anton Antonov, WWW::PaLM Raku package, (2023), GitHub/antononcube.

Text::SubParsers v0.1.0

Text::SubParsers is for extracting and processing interpretable substrings in texts.

Authors

  • Anton Antonov

License

Artistic-2.0

Dependencies

  • DateTime::Grammar:ver<0.1.2+>
  • JSON::Fast:ver<0.19+>

Test Dependencies

Provides

  • Text::SubParsers
  • Text::SubParsers::Core
  • Text::SubParsers::Functions