rak

look for clues in stuff

NAME

rak - look for clues in stuff

SYNOPSIS

use rak;

# look for "foo" in all .txt files from current directory
for rak / foo /, :file(/ \.txt $/) -> (:key($path), :value(@found)) {
    if @found {
        say "$path:";
        say .key ~ ':' ~ .value for @found;
    }
}

DESCRIPTION

The rak subroutine provides a mostly abstract core search (plumbing) functionality to be used by modules such as (porcelain) App::Rak.

THEORY OF OPERATION

The rak subroutine basically goes through 6 steps to produce a result.

1. Acquire sources

The first step is determining the objects that should be searched for the specified pattern. If an object is a Str, it will be assumed to be a path specification of a file to be searched in some form, and an IO::Path object will be created for it.

Related named arguments are (in alphabetical order):

  • :dir - filter for directory basename check to include

  • :file - filter for file basename check to include

  • :files-from - file containing filenames as source

  • :paths - paths to recurse into if directory

  • :paths-from - file containing paths to recurse into

  • :sources - list of objects to be considered as source

The result of this step is a (potentially lazy and hyperable) sequence of objects.
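As a minimal sketch of step 1, assuming the interface described in this document: the example below builds a throw-away directory (the directory and file names are made up for illustration) and combines :paths, :dir and :file to decide which files become sources.

```raku
use rak;

# Throw-away directory tree, so the example does not depend on
# whatever happens to be in the current directory.
my $dir = $*TMPDIR.add("rak-step1-$*PID");
$dir.add("lib").mkdir;
$dir.add(".cache").mkdir;
$dir.add("lib/Foo.rakumod").spurt("# TODO: document\nunit module Foo;\n");
$dir.add("lib/notes.txt").spurt("TODO: not a source file\n");
$dir.add(".cache/Old.rakumod").spurt("# TODO: stale copy\n");

# :paths - where to start recursing
# :dir   - only recurse into directories not starting with a period
# :file  - only consider files with a .rakumod extension
my @seen;
for rak / TODO /,
  :paths[$dir.absolute],
  :dir(/ ^ <-[.]> /),
  :file(/ \.rakumod $/)
-> (:key($path), :value(@found)) {
    if @found {
        # .IO is a no-op on an IO::Path and converts a Str path
        @seen.push: $path.IO.basename;
        say "$path: " ~ @found.elems ~ " matching line(s)";
    }
}
```

Only lib/Foo.rakumod should be reported here: notes.txt is rejected by :file, and the .cache directory is never entered because of :dir.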

2. Filter applicable objects

Filter down the list of sources from step 1 on any additional filesystem-related properties. This assumes that the objects produced are strings of absolute paths to be checked. Related named arguments are (in alphabetical order):

  • :accessed - when was path last accessed

  • :blocks - number of filesystem blocks

  • :created - when was path created

  • :device-number - device number on which path is located

  • :empty - is path empty (filesize == 0)

  • :executable - is path executable

  • :filesize - size of the path in bytes

  • :gid - numeric gid of the path

  • :group-executable - is path executable by group

  • :group-readable - is path readable by group

  • :group-writable - is path writable by group

  • :hard-links - number of hard-links to path on filesystem

  • :inode - inode of path on filesystem

  • :meta-modified - when meta information of path was modified

  • :mode - the mode of the path

  • :modified - when path was last modified

  • :owned-by-group - is path owned by group of current user

  • :owned-by-user - is path owned by current user

  • :readable - is path readable by current user

  • :symbolic-link - is path a symbolic link

  • :uid - numeric uid of path

  • :world-executable - is path executable by any user

  • :world-readable - is path readable by any user

  • :world-writable - is path writable by any user

  • :writable - is path writable by current user

The result of this step is a (potentially lazy and hyperable) sequence of objects.
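The filters of step 2 can be sketched as follows, assuming they behave as described under the individual named arguments: each Callable receives the raw property value and returns something trueish to keep the path. The directory and file names are hypothetical.

```raku
use rak;

my $dir = $*TMPDIR.add("rak-step2-$*PID");
$dir.mkdir;
$dir.add("small.txt").spurt("foo\n");            # 4 bytes
$dir.add("large.txt").spurt("foo\n" x 1000);     # 4000 bytes

# Only look at files smaller than 1000 bytes that were modified
# during the last hour. :filesize receives a number of bytes,
# :modified receives seconds since epoch.
my @selected;
for rak *.contains("foo"),
  :paths[$dir.absolute],
  :filesize(* < 1000),
  :modified(* > time - 3600)
-> (:key($path), :value(@found)) {
    @selected.push: $path.IO.basename if @found;
}
say @selected;
```

Filtering on filesystem properties before any file is opened keeps the more expensive reading and matching work limited to the paths that can actually qualify.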

3. Produce items to search in

The third step is to create the logic for creating items to search in from the objects in step 2. If the search is to be done per object, then .slurp is called on the object. Otherwise .lines is called on the object. Either default can be overridden by providing your own logic for producing items to search in.

Related named arguments are (in alphabetical order):

  • :encoding - encoding to be used when creating items

  • :find - map sequence of step 1 to item producer

  • :per-file - logic to create one item per object

  • :per-line - logic to create one item per line in the object

The result of this step is a (potentially lazy and hyperable) sequence of objects.
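The difference between per-line and per-file searching can be illustrated with a small sketch, assuming :per-file hands the slurped content of each file to the pattern as described above. The file names and the &both pattern are made up for the example.

```raku
use rak;

my $dir = $*TMPDIR.add("rak-step3-$*PID");
$dir.mkdir;
$dir.add("both.txt").spurt("foo\nsomething else\nbar\n");
$dir.add("one.txt").spurt("foo\nfoo again\n");

# With :per-file the pattern sees the whole content of a file at once,
# so it can test for things that are spread over several lines.
my &both = -> $content {
    $content.contains("foo") && $content.contains("bar")
}

my @files;
for rak(&both, :paths[$dir.absolute], :per-file)
-> (:key($path), :value(@found)) {
    @files.push: $path.IO.basename if @found;
}
say @files;
```

The same pattern could never match in the default per-line mode for these files, because no single line contains both "foo" and "bar".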

4. Create logic for matching

Take the logic of the pattern Callable, and create a Callable to do the actual matching with the items produced in step 3.

Related named arguments are (in alphabetical order):

  • :invert-match - invert the logic of matching

  • :quietly - absorb any warnings produced by the matcher

  • :silently - absorb any output done by the matcher

5. Create logic for running

Take the matcher logic of the Callable of step 4 and create a runner Callable that will produce the items found and their possible context (such as extra lines before or after). Assuming no context, the runner changes a return value of False from the matcher into Empty, a return value of True into the original line, and passes through any other value.

Related named arguments are (in alphabetical order):

  • :after-context - number of lines to show after a match

  • :before-context - number of lines to show before a match

  • :context - number of lines to show around a match

  • :paragraph-context - lines around match until empty line

  • :passthru-context - pass on all lines

Matching lines are represented by PairMatched objects, and lines that have been added because of the above context arguments are represented by PairContext objects.

6. Run the sequence(s)

The final step is to take the Callable of step 5 and run that repeatedly on the sequence of step 2, and for each item of that sequence, run the sequence of step 5 on that. Make sure any phasers (FIRST, NEXT and LAST) are called at the appropriate time in a thread-safe manner.

Either produces a sequence in which the key is the source, and the value is a Slip of Pairs where the key is the line-number and the value is the line with the match, or whatever the pattern matcher returned.

Or produces a sequence of whatever a specified mapper returned.

Related named arguments are (in alphabetical order):

  • :mapper - code to map results of a single source

  • :map-all - also call mapper if a source has no matches
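A mapper can be sketched like this, assuming it is called with the source and a list of matches as described under :mapper. The &counter name and the files are hypothetical; the example counts matches per file instead of returning the matches themselves.

```raku
use rak;

my $dir = $*TMPDIR.add("rak-step6-$*PID");
$dir.mkdir;
$dir.add("a.txt").spurt("foo\nbar\nfoo\n");
$dir.add("b.txt").spurt("bar\n");

# The mapper receives the source and the matches of that source, and
# whatever it returns becomes part of the result of rak.
my &counter = -> $source, @matches {
    $source.IO.basename => @matches.elems
}

# :map-all makes sure the mapper is also called for b.txt, which has
# no matching lines.
my %count = rak *.contains("foo"),
  :paths[$dir.absolute],
  :mapper(&counter),
  :map-all;
say %count;
```

Without :map-all, only a.txt would be expected to show up in the result, since the mapper is then only called for sources that have matches.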

EXPORTED SUBROUTINES

rak

The rak subroutine takes a Callable (or Regex) pattern as the only positional argument and quite a number of named arguments. Or it takes a Callable (or Regex) as the first positional argument for the pattern, and a hash with named arguments as the second positional argument. In the latter case, the hash will have the arguments removed that the rak subroutine needed for its configuration and execution.

It returns either a Pair (with an Exception as key, and the exception message as the value), or an Iterable of Pairs which contain the source object as key (by default an IO::Path object of the file in which the pattern was found), and a Slip of key / value pairs, in which the key is the line-number where the pattern was found, and the value is the product of the search (which, by default, is the line in which the pattern was found).

The following named arguments can be specified (in alphabetical order):

:accessed(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the access time of the path. The Callable is passed a Num value of the access time (number of seconds since epoch) and is expected to return a trueish value to have the path be considered for further selection.

:after-context(N)

Indicate the number of lines that should also be returned after a line with a pattern match. Defaults to 0.

:batch(N)

When hypering over multiple cores, indicate how many items should be processed per thread at a time. Defaults to whatever the system thinks is best (which may be sub-optimal).

:before-context(N)

Indicate the number of lines that should also be returned before a line with a pattern match. Defaults to 0.

:blocks(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the number of blocks used by the path on the filesystem on which the path is located. The Callable is passed the number of blocks of a path and is expected to return a trueish value to have the path be considered for further selection.

:context(N)

Indicate the number of lines that should also be returned around a line with a pattern match. Defaults to 0.

:created(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the creation time of the path. The Callable is passed a Num value of the creation time (number of seconds since epoch) and is expected to return a trueish value to have the path be considered for further selection.

:degree(N)

When hypering over multiple cores, indicate the maximum number of threads that should be used. Defaults to whatever the system thinks is best (which may be sub-optimal).

:device-number(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the device number of the path. The Callable is passed the device number of the device on which the path is located and is expected to return a trueish value to have the path be considered for further selection.

:dir(&dir-matcher)

If specified, indicates the matcher that should be used to select acceptable directories with the paths utility. Defaults to True indicating all directories should be recursed into. Applicable for any situation where paths is used to create the list of files to check.

:encoding("utf8-c8")

When specified with a string, indicates the name of the encoding to be used to produce items to check (typically by calling lines or slurp). Defaults to utf8-c8, the UTF-8 encoding that is permissive of encoding issues.

:empty

Flag. If specified, indicates paths, that are empty (aka: have a filesize of 0 bytes), are (not) acceptable for further selection.

:executable

Flag. If specified, indicates paths, that are executable by the current user, are (not) acceptable for further selection.

:file(&file-matcher)

If specified, indicates the matcher that should be used to select acceptable files with the paths utility. Defaults to True indicating all files should be checked. Applicable for any situation where paths is used to create the list of files to check.

:filesize(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the number of bytes of the path. The Callable is passed the number of bytes of a path and is expected to return a trueish value to have the path be considered for further selection.

:files-from($filename)

If specified, indicates the name of the file from which a list of files to be used as sources will be read.

:find

Flag. If specified, maps the sources of items into items to search.

:gid(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the gid of the path. The Callable is passed the numeric gid of a path and is expected to return a trueish value to have the path be considered for further selection. See also owner and group filters.

:group-executable

Flag. If specified, indicates paths, that are executable by the current group, are (not) acceptable for further selection.

:group-readable

Flag. If specified, indicates paths, that are readable by the current group, are (not) acceptable for further selection.

:group-writable

Flag. If specified, indicates paths, that are writable by the current group, are (not) acceptable for further selection.

:hard-links(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the number of hard-links of the path. The Callable is passed the number of hard-links of a path and is expected to return a trueish value to have the path be considered for further selection.

:inode(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the inode of the path. The Callable is passed the inode of a path and is expected to return a trueish value to have the path be considered for further selection.

:invert-match

Flag. If specified with a trueish value, will negate the return value of the pattern if a Bool was returned. Defaults to False.

:mapper(&mapper)

If specified, indicates the Callable that will be called (in a thread-safe manner) for each source, with the matches of that source. The Callable is passed the source object, and a list of matches, if there were any matches. If you want the Callable to be called for every source, then you must also specify :map-all.

Whatever the mapper Callable returns, will become the result of the call to the rak subroutine. If you don't want any result to be returned, you can return Empty from the mapper Callable.

:map-all

Flag. If specified with a trueish value, will call the mapper logic, as specified with :mapper, even if a source has no matches. Defaults to False.

:meta-modified(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the modification time of the meta information of the path. The Callable is passed a Num value of the time at which the meta information was modified (number of seconds since epoch) and is expected to return a trueish value to have the path be considered for further selection.

:mode(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the mode of the path. The Callable is passed the mode of a path and is expected to return a trueish value to have the path be considered for further selection. This is really for advanced types of tests: it's probably easier to use any of the readable, writable and executable filters.

:modified(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the modification time of the path. The Callable is passed a Num value of the modification time (number of seconds since epoch) and is expected to return a trueish value to have the path be considered for further selection.

:paragraph-context

Flag. If specified with a trueish value, produce lines around the line with a pattern match until an empty line is encountered.

:passthru-context

Flag. If specified with a trueish value, produces all lines.

:paths-from($filename)

If specified, indicates the name of the file from which a list of paths will be read, to be used as the base of the production of filenames with a paths search.

:paths(@paths)

If specified, indicates a list of paths that should be used as the base of the production of filenames with a paths search. If there is no other sources specification (from either :files-from, :paths-from or :sources) then the current directory (aka ".") will be assumed.

If a single hyphen is specified as the path, then STDIN will be assumed as the source.

:per-file(&producer)

If specified, indicates that searches should be done on a per-file basis. Defaults to doing searches on a per-line basis.

If specified with a True value, indicates that the slurp method will be called on each source before being checked with pattern. If the source is a Str, then it will be assumed to be a path name to read from.

If specified with a Callable, it indicates the code to be executed from a given source to produce the single item to be checked for the pattern.

:per-line(&producer)

If specified, indicates that searches should be done on a per-line basis.

If specified with a True value (which is also the default), indicates that the lines method will be called on each source before being checked with pattern. If the source is a Str, then it will be assumed to be a path name to read lines from.

If specified with a Callable, it indicates the code to be executed from a given source to produce the items to be checked for the pattern.

:quietly

Flag. If specified with a trueish value, will absorb any warnings that may occur when looking for the pattern.

:readable

Flag. If specified, indicates paths, that are readable by the current user, are (not) acceptable for further selection.

:owned-by-group

Flag. If specified, indicates only paths that are owned by the group of the current user, are (not) acceptable for further selection.

:owned-by-user

Flag. If specified, indicates only paths that are owned by the current user, are (not) acceptable for further selection.

:silently("out,err")

When specified with True, will absorb any output on STDOUT and STDERR. Can optionally be specified with "out" to absorb only STDOUT, with "err" to absorb only STDERR, or with "out,err" to absorb both STDOUT and STDERR.

:sources(@objects)

If specified, indicates a list of objects that should be used as a source for the production of lines.

:stats

Flag. If specified with a trueish value, will keep stats on the number of files and the number of lines seen. Instead of just returning the results sequence, it will then return a List with the result sequence as the first element, and a Map with statistics as the second element.

:symbolic-link

Flag. If specified, indicates only paths that are symbolic links, are (not) acceptable for further selection.

:uid(&filter)

If specified, indicates the Callable filter that should be used to select acceptable paths by the uid of the path. The Callable is passed the numeric uid of a path and is expected to return a trueish value to have the path be considered for further selection. See also owner and group filters.

:world-executable

Flag. If specified, indicates paths, that are executable by any user or group, are (not) acceptable for further selection.

:world-readable

Flag. If specified, indicates paths, that are readable by any user or group, are (not) acceptable for further selection.

:world-writable

Flag. If specified, indicates paths, that are writable by any user or group, are (not) acceptable for further selection.

:writable

Flag. If specified, indicates paths, that are writable by the current user, are (not) acceptable for further selection.

PATTERN RETURN VALUES

The return value of the pattern Callable is interpreted in the following ways:

True

If the Boolean True value is returned, assume the pattern is found. Produce the line unless :invert-match was specified.

False

If the Boolean False value is returned, assume the pattern is not found. Do not produce the line unless :invert-match was specified.

Empty

Always produce the line, even if :invert-match was specified.

any other value

Produce that value.
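The rules above can be sketched in plain Raku. This is an illustration only and not rak's actual implementation: apply-pattern is a hypothetical helper that applies a pattern to a single line and returns what would be produced for it.

```raku
# Hypothetical helper, not part of rak: apply a pattern to a single
# line following the rules listed above.
sub apply-pattern(&pattern, $line, :$invert-match) {
    my \result = pattern($line);

    if result =:= Empty {
        $line                    # Empty: always produce the line
    }
    elsif result ~~ Bool {
        # True produces the line, False produces nothing,
        # and :invert-match swaps those two outcomes
        (result ?^ $invert-match) ?? $line !! Empty
    }
    else {
        result                   # any other value: produce that value
    }
}

my @lines = "foo", "bar";

my @plain    = @lines.map: { apply-pattern(*.contains("foo"), $_) };
my @inverted = @lines.map: { apply-pattern(*.contains("foo"), $_, :invert-match) };
my @changed  = @lines.map: { apply-pattern(*.uc, $_) };
my @always   = @lines.map: { apply-pattern({ Empty }, $_, :invert-match) };

say @plain;      # [foo]
say @inverted;   # [bar]
say @changed;    # [FOO BAR]
say @always;     # [foo bar]
```

Note that a pattern block ending in a statement-modifier such as "$_.uc if .contains('foo')" returns Empty when the condition is false, which according to the rules above means the line is produced anyway. Returning False explicitly avoids that.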

PHASERS

Any FIRST, NEXT and LAST phasers that are specified in the pattern Callable will be executed at the correct time.

MATCHING LINES vs CONTEXT LINES

The Pairs that contain the search result within an object have an additional method mixed in: matched. This returns True for lines that matched, and False for lines that have been added because of a context specification (:context, :before-context, :after-context or :paragraph-context).

These Pairs can also be recognized by their class: PairMatched versus PairContext, which are also exported.
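A short sketch of telling matching lines from context lines, assuming the matched method behaves as described above. The file name and content are made up for the example.

```raku
use rak;

my $dir = $*TMPDIR.add("rak-context-$*PID");
$dir.mkdir;
$dir.add("a.txt").spurt("one\ntwo foo\nthree\nfour\n");

# Ask for one line of context around each match, and mark the lines
# that actually matched with a "*" in the output.
my @marked;
for rak *.contains("foo"), :paths[$dir.absolute], :context(1)
-> (:key($path), :value(@lines)) {
    for @lines {
        @marked.push: .value if .matched;
        say (.matched ?? "*" !! " ") ~ .key ~ ":" ~ .value;
    }
}
```

Smartmatching against the exported classes (for instance "$_ ~~ PairContext") should give the same distinction without calling the method.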

AUTHOR

Elizabeth Mattijsen

Source can be located at: https://github.com/lizmat/rak . Comments and Pull Requests are welcome.

If you like this module, or what I'm doing more generally, committing to a small sponsorship would mean a great deal to me!

COPYRIGHT AND LICENSE

Copyright 2022 Elizabeth Mattijsen

This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.

rak v0.0.1

look for clues in stuff

Authors

  • Elizabeth Mattijsen

License

Artistic-2.0

Dependencies

  • has-word:ver<0.0.3>:auth<zef:lizmat>

  • hyperize:ver<0.0.2>:auth<zef:lizmat>

  • paths:ver<10.0.7>:auth<zef:lizmat>

  • path-utils:ver<0.0.5>:auth<zef:lizmat>

  • Trap:ver<0.0.1>:auth<zef:lizmat>

Test Dependencies

Provides

  • rak
