DSL::English::ClassificationWorkflows

Building classification workflows with natural language commands.

DSL::English::ClassificationWorkflows (Raku package)

In brief

This Raku (Perl 6) package provides grammar classes and action classes for parsing and interpreting natural language Domain Specific Language (DSL) commands that specify classification workflows.

The interpreters (actions) target different programming languages: Python, R, Raku, Wolfram Language (WL), and others. They can also target different natural languages.

Currently, pipelines are generated for the software monad "ClCon", implemented in WL [AAp1], and for WL's built-in commands.

Remark: "ClCon" stands for "Classification using a Context".

Remark: "WL" stands for "Wolfram Language". "Mathematica" and "WL" are used as synonyms.

Installation

Zef ecosystem:

zef install DSL::English::ClassificationWorkflows

GitHub:

zef install https://github.com/antononcube/Raku-DSL-English-ClassificationWorkflows.git

Examples

Programming languages

Here is a simple invocation:

use DSL::English::ClassificationWorkflows;

ToClassificationWorkflowCode('make a logistic regression classifier', 'WL::ClCon');
# ClConMakeClassifier[ "LogisticRegression" ]

Here is a more complicated pipeline specification used to generate the code for two WL classification systems:

my $command = q:to/END/;
use dfTitanic;
split the data with ratio 0.73;
create a logistic regression classifier;
show precision and recall;
show top confusions, misclassified examples, least certain examples;
show ROC plots;
END

say $_.key, "\n", $_.value, "\n"  for ($_ => ToClassificationWorkflowCode($command, $_ ) for <WL::ClCon WL::System>);
# WL::ClCon
# ClConUnit[ dfTitanic ] \[DoubleLongRightArrow]
# ClConSplitData[ "TrainingFraction" -> 0.73 ] \[DoubleLongRightArrow]
# ClConMakeClassifier[ "LogisticRegression" ] \[DoubleLongRightArrow]
# ClConClassifierMeasurements[ {"Precision", "Recall"} ] \[DoubleLongRightArrow] ClConEchoValue[] \[DoubleLongRightArrow]
# ClConClassifierMeasurements[ {"TopConfusions", "MisclassifiedExamples", "LeastCertainExamples"} ] \[DoubleLongRightArrow] ClConEchoValue[] \[DoubleLongRightArrow]
# ClConROCPlot[]
#
# WL::System
# data = ClConToNormalClassifierData @ dfTitanic;
# {dataTraining, dataTesting} = TakeDrop[ RandomSample[data], Floor[ 0.73 * Length[data] ] ];
# dataValidation = Automatic;
# clObj = Classify[ dataTraining, Method -> "LogisticRegression", ValidationSet -> dataValidation ];
# Echo @ ClassifierMeasurements[clObj, dataTesting, {"Precision", "Recall"} ];
# Echo @ ClassifierMeasurements[clObj, dataTesting, {"TopConfusions", "MisclassifiedExamples", "LeastCertainExamples"} ];
# Echo @ ROCPlot[]

Natural languages

The same workflow specification can be rendered in different natural languages:

say $_.key, "\n", $_.value, "\n"  for ($_ => ToClassificationWorkflowCode($command, $_ ) for <Bulgarian English Russian>);
# Bulgarian
# използвай данните: dfTitanic
# раздели данните с тренировъчна част 0.73
# тренирай класификатор с метод: logistic regression
# покажи диаграма с приемателните операционни характеристики (ROC)
#
# English
# use the data: dfTitanic
# split data with training part 0.73
# train classifier with method: logistic regression
# show Receiver Operating Characteristics (ROC) diagram
#
# Russian
# использовать данные: dfTitanic
# разделить данные с тренировочная часть 0.73
# обучить классификатор методом: logistic regression
# показать диаграмму рабочих характеристик приемника (RХП)

Command line interface

The package provides a Command Line Interface (CLI) for its functionality:

> ToClassificationWorkflowCode --help
# Usage:
#   ToClassificationWorkflowCode <command> [--target=<Str>] [--language=<Str>] [--format=<Str>] -- Translates natural language commands into (machine learning) classification workflow programming code.
#   ToClassificationWorkflowCode <target> <command> [--language=<Str>] [--format=<Str>] -- Both target and command as arguments.
#
#     <command>           A string with one or many commands (separated by ';').
#     --target=<Str>      Target (programming language with optional library spec.) [default: 'WL-ClCon']
#     --language=<Str>    The natural language to translate from. [default: 'English']
#     --format=<Str>      The format of the output, one of 'automatic', 'code', 'hash', or 'raku'. [default: 'automatic']
#     <target>            Programming language.
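
As a rough illustration of the usage message above, the following shell sketch passes a multi-command specification to the CLI. It assumes the script is on the PATH after installation with zef. The variable name `spec` is purely illustrative, and the option values are simply the defaults listed in the help text. Named options are placed before the positional argument as a conservative choice.

```shell
#!/bin/sh
# Sketch only: assumes ToClassificationWorkflowCode is installed and on PATH.

# Several workflow commands in one string, separated by ';'
# (phrases taken from the pipeline example earlier in this document).
spec='use dfTitanic; split the data with ratio 0.73; create a logistic regression classifier; show ROC plots'

if command -v ToClassificationWorkflowCode >/dev/null 2>&1; then
    # --target and --language are set to their documented defaults.
    ToClassificationWorkflowCode --target='WL-ClCon' --language='English' "$spec"
else
    # Fall back to a hint instead of failing when the CLI is missing.
    echo "ToClassificationWorkflowCode not found; install the package with zef first." >&2
fi
```

According to the second usage line, the target can also be given as the first positional argument, followed by the command string.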

Versions

The original version of this Raku package was developed and hosted at [AAp2].

A dedicated GitHub repository was created to make installation with Raku's zef more direct, as shown above.

References

[AAp1] Anton Antonov, "Monadic contextual classification Mathematica package", (2017-2022), MathematicaForPrediction at GitHub.

[AAp2] Anton Antonov, "Classification workflows conversational agent", (2017), ConversationalAgents at GitHub.

DSL::English::ClassificationWorkflows v0.1.0

Authors

  • Anton Antonov

License

GPL-3.0-or-later

Dependencies

  • DSL::Shared:ver<0.1.2+>
  • DSL::Entity::MachineLearning:ver<0.1.1+>

Provides

  • DSL::English::ClassificationWorkflows
  • DSL::English::ClassificationWorkflows::Actions::Bulgarian::Standard
  • DSL::English::ClassificationWorkflows::Actions::English::Standard
  • DSL::English::ClassificationWorkflows::Actions::Russian::Standard
  • DSL::English::ClassificationWorkflows::Actions::WL::ClCon
  • DSL::English::ClassificationWorkflows::Actions::WL::System
  • DSL::English::ClassificationWorkflows::Grammar
  • DSL::English::ClassificationWorkflows::Grammar::ClassificationPhrases
  • DSL::English::ClassificationWorkflows::Grammarish
