rak
NAME
rak - look for clues in stuff
SYNOPSIS
use rak;
# look for "foo" in all .txt files from current directory
for rak / foo /, :file(/ \.txt $/) -> (:key($path), :value(@found)) {
if @found {
say "$path:";
say .key ~ ':' ~ .value for @found;
}
}
DESCRIPTION
The rak
subroutine provides a mostly abstract core search (plumbing)
functionality to be used by modules such as (porcelain) App::Rak
.
THEORY OF OPERATION
The rak
subroutine basically goes through 6 steps to produce a
result.
1. Acquire sources
The first step is determining the objects that should be searched
for the specified pattern. If an object is a Str
, it will be
assume that it is a path specification of a file to be searched in
some form and an IO::Path
object will be created for it.
Related named arguments are (in alphabetical order):
:dir - filter for directory basename check to include
:file - filter for file basename check to include
:files-from - file containing filenames as source
:paths - paths to recurse into if directory
:paths-from - file containing paths to recurse into
:sources - list of objects to be considered as source
The result of this step, is a (potentially lazy and hyperable) sequence of objects.
2. Filter applicable objects
Filter down the list of sources from step 1 on any additional filesystem related properties. This assumes that the list of objects created are strings of absolute paths to be checked.
:accessed - when was path last accessed
:blocks- number of filesystem blocks
:created - when was path created
:device-number - device number on which path is located
:empty - is path empty (filesize == 0)
:executable - is path executable
:filesize - size of the path in bytes
:gid - numeric gid of the path
:group-executable - is path executable by group
:group-readable - is path readable by group
:group-writable - is path writable
:hard-links - number of hard-links to path on filesystem
:inode - inode of path on filesystem
:meta-modified - when meta information of path was modified
:mode - the mode of the path
:modified - when path was last modified
:owned-by-group - is path owned by group of current user
:owned-by-user - is path owned by current user
:readable - is path readable by current user
:uid - numeric uid of path
:symbolic-link - is path a symbolic link
:world-executable - is path executable by any user
:world-readable - is path readable by any user
:world-writable - is path writable by any user
:writable - is path writable by current user
The result of this step, is a (potentially lazy and hyperable) sequence of objects.
3. Produce items to search in
The second step is to create the logic for creating items to
search in from the objects in step 2. If search is to be done
per object, then .slurp
is called on the object. Otherwise
.lines
is called on the object. Unless one provides their
own logic for producing items to search in.
Related named arguments are (in alphabetical order):
:encoding - encoding to be used when creating items
:find - map sequence of step 1 to item producer
:per-file - logic to create one item per object
:per-line - logic to create one item per line in the object
The result of this step, is a (potentially lazy and hyperable) sequence of objects.
4. Create logic for matching
Take the logic of the pattern Callable
, and create a Callable
to
do the actual matching with the items produced in step 3.
Related named arguments are (in alphabetical order):
:invert-match - invert the logic of matching
:quietly - absorb any warnings produced by the matcher
:silently - absorb any output done by the matcher
5. Create logic for running
Take the matcher logic of the Callable
of step 4 and create a runner
Callable
that will produce the items found and their possible context
(such as extra lines before or after). Assuming no context, the runner
changes a return value of False
from the matcher into Empty
, a
return value of True
in the original line, and passes through any other
value.
Related named arguments are (in alphabetical order):
:after-context - number of lines to show after a match
:before-context - number of lines to show before a match
:context - number of lines to show around a match
:paragraph-context - lines around match until empty line
:passthru-context - pass on *all* lines
Matching lines are represented by PairMatched
objects, and lines that
have been added because of the above context arguments, are represented by
PairContext
objects.
6. Run the sequence(s)
The final step is to take the Callable
of step 5 and run that
repeatedly on the sequence of step 2, and for each item of that
sequence, run the sequence of step 5 on that. Make sure any
phasers (FIRST
, NEXT
and LAST
) are called at the appropriate
time in a thread-safe manner.
Either produces a sequence in which the key is the source, and the
value is a Slip
of Pair
s where the key is the line-number and
the value is line with the match, or whatever the pattern matcher
returned.
Or, produces sequence of whatever a specified mapper returned.
Related named arguments are (in alphabetical order):
:mapper - code to map results of a single source
:map-all - also call mapper if a source has no matches
EXPORTED SUBROUTINES
rak
The rak
subroutine takes a Callable
(or Regex
) pattern as the only
positional argument and quite a number of named arguments. Or it takes a
Callable
(or Regex
) as the first positional argument for the pattern,
and a hash with named arguments as the second positional argument. In the
latter case, the hash will have the arguments removed that the rak
subroutine needed for its configuration and execution.
It returns either a Pair
(with an Exception
as key, and the
exception message as the value), or an Iterable
of Pair
s which
contain the source object as key (by default a IO::Path
object
of the file in which the pattern was found), and a Slip
of
key / value pairs, in which the key is the line-number where the
pattern was found, and the value is the product of the search
(which, by default, is the line in which the pattern was found).
The following named arguments can be specified (in alphabetical order):
:accessed(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the access time of the path. The Callable
is
passed a Num
value of the access time (number of seconds since epoch)
and is expected to return a trueish value to have the path be considered
for further selection.
:after-context(N)
Indicate the number of lines that should also be returned after a line with a pattern match. Defaults to 0.
:batch(N)
When hypering over multiple cores, indicate how many items should be processed per thread at a time. Defaults to whatever the system thinks is best (which may be sub-optimal).
:before-context(N)
Indicate the number of lines that should also be returned before a line with a pattern match. Defaults to 0.
:blocks(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the number of blocks used by the path on the filesystem
on which the path is located. The Callable
is passed the number of blocks
of a path and is expected to return a trueish value to have the path be
considered for further selection.
:context(N)
Indicate the number of lines that should also be returned around a line with a pattern match. Defaults to 0.
:created(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the creation time of the path. The Callable
is
passed a Num
value of the creation time (number of seconds since epoch)
and is expected to return a trueish value to have the path be considered
for further selection.
:degree(N)
When hypering over multiple cores, indicate the maximum number of threads that should be used. Defaults to whatever the system thinks is best (which may be sub-optimal).
:device-number(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the device number of the path. The Callable
is
passed the device number of the device on which the path is located and is
expected to return a trueish value to have the path be considered for further
selection.
:dir(&dir-matcher)
If specified, indicates the matcher that should be used to select
acceptable directories with the paths
utility. Defaults to True
indicating all directories should be recursed into. Applicable
for any situation where paths
is used to create the list of files
to check.
:encoding("utf8-c8")
When specified with a string, indicates the name of the encoding
to be used to produce items to check (typically by calling
lines
or slurp
). Defaults to utf8-c8
, the UTF-8
encoding that is permissive of encoding issues.
:empty
Flag. If specified, indicates paths, that are empty (aka: have a filesize of 0 bytes), are (not) acceptable for further selection.
:executable
Flag. If specified, indicates paths, that are executable by the current user, are (not) acceptable for further selection.
:file(&file-matcher)
If specified, indicates the matcher that should be used to select
acceptable files with the paths
utility. Defaults to True
indicating all files should be checked. Applicable for any
situation where paths
is used to create the list of files to
check.
:filesize(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the number of bytes of the path. The Callable
is
passed the number of bytes of a path and is expected to return a trueish
value to have the path be considered for further selection.
:files-from($filename)
If specified, indicates the name of the file from which a list of files to be used as sources will be read.
:find
Flag. If specified, maps the sources of items into items to search.
:gid(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the gid of the path. The Callable
is passed the
numeric gid of a path and is expected to return a trueish value to have the
path be considered for further selection. See also owner
and group
filters.
:group-executable
Flag. If specified, indicates paths, that are executable by the current group, are (not) acceptable for further selection.
:group-readable
Flag. If specified, indicates paths, that are readable by the current group, are (not) acceptable for further selection.
:group-writable
Flag. If specified, indicates paths, that are writable by the current group, are (not) acceptable for further selection.
:hard-links(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the number of hard-links of the path. The Callable
is passed the number of hard-links of a path and is expected to return a
trueish value to have the path be considered for further selection.
:inode(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the inode of the path. The Callable
is passed the
inode of a path and is expected to return a trueish value to have the path be
considered for further selection.
:invert-match
Flag. If specified with a trueish value, will negate the return
value of the pattern if a Bool
was returned. Defaults to
False
.
:mapper(&mapper)
If specified, indicates the Callable
that will be called (in a
thread-safe manner) for each source, with the matches of that source.
The Callable
is passed the source object, and a list of matches,
if there were any matches. If you want the Callable
to be called
for every source, then you must also specify :map-all
.
Whatever the mapper Callable
returns, will become the result of the
call to the rak
subroutine. If you don't want any result to be
returned, you can return Empty
from the mapper Callable
.
:map-all
Flag. If specified with a trueish value, will call the mapper logic, as
specified with :mapper
, even if a source has no matches. Defaults to
False
:
:meta-modified(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the modification time of the path. The Callable
is
passed a Num
value of the modification time (number of seconds since epoch)
and is expected to return a trueish value to have the path be considered
for further selection.
:mode(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the mode of the path. The Callable
is passed the
mode of a path and is expected to return a trueish value to have the path be
considered for further selection. This is really for advanced types of tests:
it's probably easier to use any of the readable
, writeable
and
executable
filters.
:modified(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the modification time of the path. The Callable
is
passed a Num
value of the modification time (number of seconds since epoch)
and is expected to return a trueish value to have the path be considered
for further selection.
:paragraph-context
Flag. If specified with a trueish value, produce lines around the line with a pattern match until an empty line is encountered.
:passthru-context
Flag. If specified with a trueish value, produces all lines.
:paths-from($filename)
If specified, indicates the name of the file from which a list
of paths to be used as the base of the production of filename
with a paths
search.
:paths(@paths)
If specified, indicates a list of paths that should be used
as the base of the production of filename with a paths
search. If there is no other sources specification (from
either the :files-from
, paths-from
or sources
) then
the current directory (aka ".") will be assumed.
If a single hyphen is specified as the path, then STDIN will be assumed as the source.
:per-file(&producer)
If specified, indicates that searches should be done on a per-file basis. Defaults to doing searches on a per-line basis.
If specified with a True
value, indicates that the slurp
method will be called on each source before being checked with
pattern. If the source is a Str
, then it will be assumed to
be a path name to read from.
If specified with a Callable
, it indicates the code to be
executed from a given source to produce the single item to be
checked for the pattern.
:per-line(&producer)
If specified, indicates that searches should be done on a per-line basis.
If specified with a True
value (which is also the default),
indicates that the lines
method will be called on each source
before being checked with pattern. If the source is a Str
,
then it will be assumed to be a path name to read lines from.
If specified with a Callable
, it indicates the code to be
executed from a given source to produce the itemi to be checked
for the pattern.
:quietly
Flag. If specified with a trueish value, will absorb any warnings that may occur when looking for the pattern.
:readable
Flag. If specified, indicates paths, that are readable by the current user, are (not) acceptable for further selection.
:owned-by-group
Flag. If specified, indicates only paths that are owned by the group of the current user, are (not) acceptable for further selection.
:owned-by-user
Flag. If specified, indicates only paths that are owned by the current user, are (not) acceptable for further selection.
:silently("out,err")
When specified with True
, will absorb any output on STDOUT
and STDERR. Optionally can only absorb STDOUT ("out"), STDERR
("err") and both STDOUT and STDERR ("out,err").
:sources(@objects)
If specified, indicates a list of objects that should be used as a source for the production of lines.
:stats
Flag. If specified with a trueish value, will keep stats on number
of files and number of lines seen. And instead of just returning
the results sequence, will then return a List
of the result
sequence as the first argument, and a Map
with statistics as the
second argument.
:symbolic-link
Flag. If specified, indicates only paths that are symbolic links, are (not) acceptable for further selection.
:uid(&filter)
If specified, indicates the Callable
filter that should be used to select
acceptable paths by the uid of the path. The Callable
is passed the
numeric uid of a path and is expected to return a trueish value to have the
path be considered for further selection. See also owner
and group
filters.
:world-executable
Flag. If specified, indicates paths, that are executable by any user or group, are (not) acceptable for further selection.
:world-readable
Flag. If specified, indicates paths, that are readable by any user or group, are (not) acceptable for further selection.
:world-writeable
Flag. If specified, indicates paths, that are writable by any user or group, are (not) acceptable for further selection.
:writable
Flag. If specified, indicates paths, that are writable by the current user, are (not) acceptable for further selection.
PATTERN RETURN VALUES
The return value of the pattern Callable
is interpreted in the
following ways:
True
If the Bool
ean True value is returned, assume the pattern is found.
Produce the line unless :invert-match
was specified.
False
If the Bool
ean False value is returned, assume the pattern is not
found. Do not produce the line unless :invert-match
was specified.
Empty
Always produce the line. Even if :invert-match
was specified.
any other value
Produce that value.
PHASERS
Any FIRST
, NEXT
and LAST
phaser that are specified in the
pattern Callable
, will be executed at the correct time.
MATCHING LINES vs CONTEXT LINES
The Pair
s that contain the search result within an object, have
an additional method mixed in: matched
. This returns True
for
lines that matched, and False
for lines that have been added because
of a context specification (:context
, :before-context
, :after-context
or paragraph-context
).
These Pair
s can also be recognized by their class: PairMatched
versus
PairContext
, which are also exported.
AUTHOR
Elizabeth Mattijsen <[email protected]>
Source can be located at: https://github.com/lizmat/rak . Comments and Pull Requests are welcome.
If you like this module, or what Iām doing more generally, committing to a small sponsorship would mean a great deal to me!
COPYRIGHT AND LICENSE
Copyright 2022 Elizabeth Mattijsen
This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.
# vim: expandtab shiftwidth=4