Incremental grammar enhancement
Introduction
Procedure outline
1. Come up with sentences from a certain Domain Specific Language (DSL).
2. Request a Large Language Model (LLM), for example ChatGPT or PaLM, to generate a corresponding grammar in Backus-Naur Form (BNF).
3. Using the obtained BNF string, create a corresponding Raku object that can be used to generate new random sentences. One of:
   - Raku class for "FunctionalParsers"
   - Raku grammar
4. With the Raku object, generate a set of random sentences.
5. Request the LLM to come up with, say, 5-10 variations of each sentence.
6. Request BNF for the new, enhanced set of sentences.
7. Is the obtained grammar large or comprehensive enough?
   - If not, go to step 2.
   - If yes, finish.
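The control flow of the procedure can be sketched as a loop. This is a minimal illustration only: `request-grammar`, `request-variations`, and `enhance` are hypothetical names, the two "request" subs are offline stand-ins for the LLM calls, and the stopping rule (a minimum vocabulary size plus a cap on rounds) is an assumption, since the outline leaves "large or comprehensive enough" open.

```raku
# Hypothetical stand-in for steps 2 and 6. A real run would ask an LLM for BNF;
# here the "grammar" is only the sorted list of distinct words seen so far.
sub request-grammar(@sentences) {
    @sentences.join(' ').words.unique.sort.List
}

# Hypothetical stand-in for steps 3-5. A real run would generate random
# sentences and ask an LLM for variations; here one verb is swapped per round.
sub request-variations(@sentences, Int $round) {
    my @verbs = <adore dislike enjoy>;
    my $verb  = @verbs[$round % @verbs.elems];
    @sentences.map({ .subst(/ hate | love /, $verb) }).List
}

# The outline as a loop. The stopping rule is an assumption for illustration.
sub enhance(@start, Int :$min-words = 8, Int :$max-rounds = 5) {
    my @sentences = @start;
    my @grammar   = request-grammar(@sentences);
    my $round     = 0;
    while @grammar.elems < $min-words && $round < $max-rounds {
        @sentences.append: request-variations(@sentences, $round);
        @sentences = @sentences.unique;
        @grammar   = request-grammar(@sentences);
        $round++;
    }
    return %( grammar => @grammar, sentences => @sentences, rounds => $round );
}

my %result = enhance(['I hate R', 'I love WL', 'We hate WL', 'I love Julia']);
say 'rounds: ', %result<rounds>;
say 'words:  ', %result<grammar>.join(' ');
```

The cap on rounds is there because an LLM-driven loop has no guarantee of reaching the target size; without it the loop could run (and incur API calls) indefinitely.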
Setup
Here are the packages we are going to use:
use Grammar::TokenProcessing;
use EBNF::Grammar;
use FunctionalParsers;
use WWW::OpenAI;
use WWW::PaLM;
# (Any)
Several iterations
my @startSentences = [
'I hate R', 'I love WL', 'We hate WL', 'I love R',
'I love Julia', 'I hate R', 'We hate R', 'I hate WL'
];
# [I hate R I love WL We hate WL I love R I love Julia I hate R We hate R I hate WL]
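# Step 2: ask PaLM for a BNF grammar of the starting sentences.
# Despite its name, $variations1 holds the generated BNF text.
my $request1 = "Generate BNF grammar for the sentences: {@startSentences.join(', ')}";
my $variations1 = palm-generate-text($request1, format => 'values', temperature => 0.15, max-output-tokens => 600);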
$variations1
# <sentence> ::= <subject> <verb> <object>
# <subject> ::= I | we
# <verb> ::= hate | love
# <object> ::= R | WL | Julia
# Keep only the lines that parse as EBNF rules; any other text in the LLM reply is dropped.
my $variations2 = $variations1.lines.grep({ EBNF::Grammar::Relaxed.parse($_, rule => 'rule') }).join("\n");
# <sentence> ::= <subject> <verb> <object>
# <subject> ::= I | we
# <verb> ::= hate | love
# <object> ::= R | WL | Julia
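In the run above the filtered text is identical to the LLM reply, so the line filter presumably serves as a safeguard for replies that wrap the grammar in commentary. The sketch below shows the same idea with core Raku only. The regex `$rule-like` is a simplified, hypothetical stand-in for `EBNF::Grammar::Relaxed` (it only checks for the shape `<name> ::= ...`), and the reply text is made up for illustration.

```raku
# Simplified stand-in for an EBNF rule recognizer: a line counts as a rule if
# it looks like "<name> ::= something". This is an assumption for illustration,
# not the logic of EBNF::Grammar::Relaxed.
my $rule-like = rx/ ^ \s* '<' <[\w\-]>+ '>' \s* '::=' \s* \S /;

# Hypothetical LLM reply with commentary around the grammar.
my $reply = q:to/END/;
    Here is a BNF grammar for your sentences:
    <sentence> ::= <subject> <verb> <object>
    <subject> ::= I | we
    I hope this helps!
    END

# Keep only the rule-like lines and join them back into one BNF string.
my $clean = $reply.lines.grep($rule-like).join("\n");
say $clean;
```

Only the two rule lines survive the filter; the greeting and the closing remark are discarded before the text reaches the grammar interpreter.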
# Step 3: translate the BNF into Raku grammar code (returned as a string).
my $grCode = ebnf-interpret($variations2, style => 'inverted', name => 'First');
say $grCode;
# grammar First {
# regex sentence { <subject> <verb> <object> }
# regex subject { 'I' | 'we' }
# regex verb { 'hate' | 'love' }
# regex object { 'R' | 'WL' | 'Julia' }
# }
# With :eval the generated code is evaluated, giving the grammar itself instead of its source.
my $gr = ebnf-interpret($variations2, style => 'inverted', name => 'First'):eval;
# (First)
my $grTopRule = "<{grammar-top-rule($grCode)}>";
say $grTopRule;
# <sentence>
# Step 4: generate 12 random sentences, starting from the top rule.
my @genSentences = random-sentence-generation($gr, $grTopRule) xx 12;
.say for @genSentences;
# we love R
# we hate R
# we love R
# we hate R
# we hate Julia
# we love Julia
# I love WL
# I love Julia
# we hate R
# I hate WL
# I love Julia
# I love R
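To make the generation step less of a black box, here is a minimal sketch of random sentence generation by recursive expansion of rules. It is not how `random-sentence-generation` from "Grammar::TokenProcessing" is implemented. The `%rules` table and the `expand` sub are hypothetical, and the grammar is copied by hand from the BNF obtained above.

```raku
# Hypothetical in-memory form of the grammar: each rule name maps to its
# alternatives, and each alternative is a space-separated string of symbols.
my %rules =
    '<sentence>' => ['<subject> <verb> <object>'],
    '<subject>'  => ['I', 'we'],
    '<verb>'     => ['hate', 'love'],
    '<object>'   => ['R', 'WL', 'Julia'];

# Expand a symbol: terminals are returned as they are; for a rule name one
# alternative is picked at random and each of its symbols is expanded in turn.
sub expand(Str $symbol --> Str) {
    return $symbol unless %rules{$symbol}:exists;
    %rules{$symbol}.pick.words.map(&expand).join(' ')
}

say expand('<sentence>') for ^5;
```

Every generated sentence has the form subject, verb, object, which matches the samples above: the grammar can produce combinations such as "we love Julia" that were not among the starting sentences.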