Incremental grammar enhancement

Introduction

Procedure outline

  1. Come up with sentences from a certain Domain Specific Language (DSL).

  2. Request a certain Large Language Model (LLM) -- for example, ChatGPT or PaLM -- to generate a corresponding grammar in Backus-Naur Form (BNF).

  3. Using the obtained BNF string, create a corresponding Raku object that can be used to generate new random sentences. One of:

    • Raku class for "FunctionalParsers"

    • Raku grammar

  4. With the Raku object, generate a set of random sentences.

  5. Request the LLM to come up with, say, 5-10 variations of each sentence.

  6. Request a BNF grammar for the new, enhanced set of sentences.

  7. Is the obtained grammar large or comprehensive enough?

    • If not, go to step 2.

    • If yes, finish.

Setup

Here are the packages we are going to use:

use Grammar::TokenProcessing;
use EBNF::Grammar;
use FunctionalParsers;
use WWW::OpenAI;
use WWW::PaLM;
# (Any)

Several iterations

my @startSentences = [
  'I hate R', 'I love WL', 'We hate WL', 'I love R',
  'I love Julia', 'I hate R', 'We hate R', 'I hate WL'
];
# [I hate R I love WL We hate WL I love R I love Julia I hate R We hate R I hate WL]
# Step 2: ask the LLM for a BNF grammar that covers the starting sentences.
my $request1 = "Generate BNF grammar for the sentences: {@startSentences.join(', ')}";
my $variations1 = palm-generate-text($request1, format => 'values', temperature => 0.15, max-output-tokens => 600);
$variations1
# <sentence> ::= <subject> <verb> <object>
# <subject> ::= I | we
# <verb> ::= hate | love
# <object> ::= R | WL | Julia
# Keep only the lines that parse as grammar rules; this drops any extra text in the LLM response.
my $variations2 = $variations1.lines.grep({ EBNF::Grammar::Relaxed.parse($_, rule => 'rule') }).join("\n");
# <sentence> ::= <subject> <verb> <object>
# <subject> ::= I | we
# <verb> ::= hate | love
# <object> ::= R | WL | Julia
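LLM responses often wrap the grammar in explanatory text, which is why the lines are filtered before interpretation. The following is a minimal sketch of that filtering idea in plain Raku. It uses a rough regex instead of `EBNF::Grammar::Relaxed`, and both the sample response text and the "looks like a rule" pattern are assumptions made for illustration only.

```raku
# Hypothetical LLM response: two grammar rules surrounded by chatter.
my $llm-output = q:to/END/;
Here is the grammar:
<sentence> ::= <subject> <verb> <object>
<subject> ::= I | we
Hope this helps!
END

# Rough approximation of "this line is a BNF rule":
# a name in angle brackets, then '::=', then at least one non-space character.
my @rules = $llm-output.lines.grep({
    $_ ~~ / ^ \s* '<' <[\w\-]>+ '>' \s* '::=' \s* \S /
});

.say for @rules;
# <sentence> ::= <subject> <verb> <object>
# <subject> ::= I | we
```

Parsing each line with the actual grammar, as done above, is stricter than this pattern because it also checks the right-hand side of each rule.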
# Step 3: translate the BNF into Raku grammar code and inspect it.
my $grCode = ebnf-interpret($variations2, style => 'inverted', name => 'First');
say $grCode;
# grammar First {
# 	regex sentence { <subject> <verb> <object> }
# 	regex subject { 'I' | 'we' }
# 	regex verb { 'hate' | 'love' }
# 	regex object { 'R' | 'WL' | 'Julia' }
# }
# Evaluate the generated code to obtain the grammar object itself.
my $gr = ebnf-interpret($variations2, style => 'inverted', name => 'First'):eval;
# (First)
# Find the top rule of the grammar; it is the starting point for sentence generation.
my $grTopRule = "<{grammar-top-rule($grCode)}>";
say $grTopRule;
# <sentence>
# Step 4: generate 12 random sentences with the grammar.
my @genSentences = random-sentence-generation($gr, $grTopRule) xx 12;

.say for @genSentences;
# we love R
# we hate R
# we love R
# we hate R
# we hate Julia
# we love Julia
# I love WL
# I love Julia
# we hate R
# I hate WL
# I love Julia
# I love R
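To make the generation step less of a black box, here is a minimal, package-free sketch of random sentence generation by recursive rule expansion. It is not how `random-sentence-generation` is implemented. The `%rules` hash and the `expand` routine are hypothetical names introduced for this illustration, and the rules are copied by hand from the grammar obtained above.

```raku
# Each rule name maps to its alternatives; an alternative is a
# space-separated sequence of rule references (<name>) and literal words.
my %rules =
    sentence => ['<subject> <verb> <object>'],
    subject  => ['I', 'we'],
    verb     => ['hate', 'love'],
    object   => ['R', 'WL', 'Julia'];

# Expand a symbol: a rule reference is replaced by a randomly picked
# alternative, expanded word by word; a literal is returned unchanged.
sub expand(Str $symbol) {
    if $symbol ~~ / ^ '<' (\w+) '>' $ / {
        return %rules{~$0}.pick.words.map(&expand).join(' ');
    }
    return $symbol;
}

# The left-hand side of xx is re-evaluated on every repetition,
# so each sentence is generated independently.
my @sentences = expand('<sentence>') xx 5;
.say for @sentences;
```

Because every alternative is picked uniformly at random, repeated sentences are expected with such a small grammar, as in the output above.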

References

Articles

Packages, repositories

EBNF::Grammar v0.1.0

EBNF grammar and interpreters.

Authors

  • Anton Antonov

License

Artistic-2.0

Provides

  • EBNF::Actions::Raku::FunctionalParsers
  • EBNF::Actions::Raku::Grammar
  • EBNF::Actions::WL::FunctionalParsers
  • EBNF::Grammar
  • EBNF::Grammar::Standardish