README-work
Lingua::Stem::Portuguese Raku package
Introduction
This Raku package is for stemming Portuguese words. It implements the Snowball algorithm presented in [SNa1].
Usage examples
The PortugueseStem function is used to find stems:
use Lingua::Stem::Portuguese;
say PortugueseStem('brotação')
PortugueseStem also works with lists of words:
say PortugueseStem('Os brotos são aguardados com paciência, bebida e bacon.'.words)
The function portuguese-word-stem can be used as a synonym of PortugueseStem.
Command Line Interface (CLI)
The package provides the CLI function PortugueseStem. Here is its usage message:
PortugueseStem --help
Here are example shell commands using the CLI function PortugueseStem:
PortugueseStem Boataria
PortugueseStem --format=raku "Módulo Raku que fornece um procedimento para a língua portuguesa."
PortugueseStem Verificar a exatidão da seleção usando dicionários e regras
Here is a pipeline example using the CLI function get-tokens of the package "Grammar::TokenProcessing", [AAp1]:
get-tokens ./DataQueryPhrases-template | PortugueseStem --format=raku
Remark: These kinds of token (literal) transformations are used in the packages "DSL::Bulgarian", [AAp2], "DSL::Portuguese", [AAp3], and "DSL::Russian", [AAp4].
Implementation notes
Reprogrammed to Raku from the Perl module Lingua::PT::Stemmer: https://github.com/neilb/Lingua-PT-Stemmer/blob/master/lib/Lingua/PT/Stemmer.pm
TODO
TODO Respect the word case in the returned result.
PortugueseStem('TABLADO') should return 'TABL'. (Not 'tabl' as it currently does.)
DONE CLI that can be inserted in UNIX pipelines.
TODO Galician stemmer.
TODO Performance statistics.
TODO More detailed documentation.
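Until the first TODO item (respecting word case) is addressed in the package, one possible workaround is a thin wrapper that re-applies the case pattern of the input to the lowercase stem. The following is only a sketch: restore-case and stem-keeping-case are hypothetical helpers, not part of Lingua::Stem::Portuguese, and the stand-in stemmer exists only so the example runs without the package installed. It assumes the stemmer returns a lowercase string for a single word, as described for 'TABLADO' above.

```raku
# Hypothetical helper (not part of Lingua::Stem::Portuguese):
# re-applies the letter case of the original word to a lowercase stem.
sub restore-case(Str:D $word, Str:D $stem --> Str:D) {
    # All-uppercase input, e.g. 'TABLADO': uppercase the whole stem.
    return $stem.uc if $word eq $word.uc && $word ne $word.lc;
    # Capitalized input, e.g. 'Tablado': capitalize the stem.
    return $stem.tc if $word eq $word.tclc;
    # Lowercase or mixed-case input: leave the stem unchanged.
    $stem
}

# Hypothetical wrapper: stems a single word with the given stemmer,
# then restores the case pattern of the input.
sub stem-keeping-case(&stemmer, Str:D $word --> Str:D) {
    restore-case($word, stemmer($word).Str)
}

# Stand-in for PortugueseStem (an assumption for illustration only);
# it just lowercases the word and strips a trailing 'ado'.
my &stand-in-stemmer = -> Str $w { $w.lc.subst(/ 'ado' $/, '') };

say stem-keeping-case(&stand-in-stemmer, 'TABLADO');   # TABL
say stem-keeping-case(&stand-in-stemmer, 'Tablado');   # Tabl
say stem-keeping-case(&stand-in-stemmer, 'tablado');   # tabl
```

With the package installed, &PortugueseStem could presumably be passed in place of the stand-in. Mixed-case inputs such as 'tabLADO' are deliberately left as the stemmer returns them, since there is no obvious way to map their case pattern onto a shorter stem.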
References
Articles
[SNa1] Snowball Team, Portuguese stemming algorithm, (2002), snowball.tartarus.org.
Packages
[AAp1] Anton Antonov, Grammar::TokenProcessing Raku package, (2022), GitHub/antononcube.
[AAp2] Anton Antonov, DSL::Bulgarian Raku package, (2022), GitHub/antononcube.
[AAp3] Anton Antonov, DSL::Portuguese Raku package, (2023), GitHub/antononcube.
[AAp4] Anton Antonov, DSL::Russian Raku package, (2022), GitHub/antononcube.