README
NAME
File::Metadata::Libextractor - Use libextractor to read file metadata
SYNOPSIS
use File::Metadata::Libextractor;

#| This program extracts all the information about a file
sub MAIN($file! where { .IO.f // die "file '$file' not found" })
{
    my File::Metadata::Libextractor $e .= new;
    my @info = $e.extract($file);
    for @info -> %record {
        for %record.kv -> $k, $v {
            say "$k: $v"
        }
        say '-' x 50;
    }
}
DESCRIPTION
File::Metadata::Libextractor provides an OO interface to libextractor in order to query files' metadata.
According to the libextractor site (https://www.gnu.org/software/libextractor), the library can read metadata from the following file types:
HTML
MAN
PS
DVI
OLE2 (DOC, XLS, PPT)
OpenOffice (sxw)
StarOffice (sdw)
FLAC
MP3 (ID3v1 and ID3v2)
OGG
WAV
S3M (Scream Tracker 3)
XM (eXtended Module)
IT (Impulse Tracker)
NSF(E) (NES music)
SID (C64 music)
EXIV2
JPEG
GIF
PNG
TIFF
DEB
RPM
TAR(.GZ)
LZH
LHA
RAR
ZIP
CAB
7-ZIP
AR
MTREE
PAX
CPIO
ISO9660
SHAR
RAW
XAR
FLV
REAL
RIFF (AVI)
MPEG
QT
ASF
Also, various additional MIME types are detected.
new(Bool :$in-process?)
Creates a File::Metadata::Libextractor object.
libextractor interfaces to several libraries in order to extract the metadata. To work safely it starts sub-processes to perform the actual extraction work.
This might cause problems in a concurrent environment that uses locks. A possible solution is to run the extraction inside the program's own process. This is less secure, but it may avoid locking problems.
The optional argument $in-process allows the execution of the extraction job in the parent's process.
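As a minimal sketch of the two modes, based only on the signature shown above (the variable names are illustrative, and the module and libextractor are assumed to be installed):

```raku
use File::Metadata::Libextractor;

# Default: the extraction work is delegated to sub-processes.
my $isolated = File::Metadata::Libextractor.new;

# Opt in to in-process extraction: less secure, but it may avoid
# locking problems in concurrent programs.
my $in-process = File::Metadata::Libextractor.new(:in-process);
```

Both objects are then used the same way; only where the extraction job runs should differ.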
extract($file where .IO.f // fail "file '$file' not found" --> List)
Reads all the possible information from an existing file, or fails if the file doesn't exist. The output List is actually a List of Hashes. Each hash has the following keys:
mime-type: the file's MIME type
plugin-name: the name of the plugin the library used to find out the information
plugin-type: the plugin subtype used for the operation
plugin-format: the format of the plugin's output
data-type: the value returned by the plugin subtype
The possible values for plugin-format are:
EXTRACTOR_METAFORMAT_UNKNOWN
EXTRACTOR_METAFORMAT_UTF8
EXTRACTOR_METAFORMAT_BINARY
EXTRACTOR_METAFORMAT_C_STRING
The possible values for the plugin-type field are listed in File::Metadata::Libextractor::Constants, in the EXTRACTOR_MetaType enum (231 values as of v3.1.6).
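Because each record is a plain Hash, the usual List methods can be applied to the result. The sketch below uses invented records with the keys listed above instead of calling extract, so it runs without the library. It assumes that the plugin-format values stringify to the constant names shown; this has not been checked against the module and should be verified before relying on it.

```raku
# Hypothetical records shaped like the output of extract().
# Every value here is invented for illustration.
my @info =
    %( mime-type     => 'image/jpeg',
       plugin-name   => 'mime',
       plugin-type   => 'EXTRACTOR_METATYPE_MIMETYPE',
       plugin-format => 'EXTRACTOR_METAFORMAT_UTF8',
       data-type     => 'image/jpeg' ),
    %( mime-type     => 'image/jpeg',
       plugin-name   => 'exiv2',
       plugin-type   => 'EXTRACTOR_METATYPE_IMAGE_DIMENSIONS',
       plugin-format => 'EXTRACTOR_METAFORMAT_UTF8',
       data-type     => '640x480' ),
    %( mime-type     => 'image/jpeg',
       plugin-name   => 'thumbnail',
       plugin-type   => 'EXTRACTOR_METATYPE_THUMBNAIL',
       plugin-format => 'EXTRACTOR_METAFORMAT_BINARY',
       data-type     => Buf.new(0xFF, 0xD8) );

# Formats whose payload can be printed as text.
my $text-formats = set <EXTRACTOR_METAFORMAT_UTF8 EXTRACTOR_METAFORMAT_C_STRING>;

# Keep only the printable records.
my @text = @info.grep({ .<plugin-format>.Str (elem) $text-formats });

# Group the printable records by the plugin that produced them.
my %by-plugin = @text.classify(*<plugin-name>);

for %by-plugin.sort(*.key) -> $group {
    say "{$group.key}: {$group.value.map(*<data-type>).join(', ')}";
}
```

Filtering on plugin-format first keeps binary payloads, such as thumbnails, out of the printed output.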
Prerequisites
This module requires the libextractor library to be installed. It has been successfully tested on the following Linux distributions:
Debian 9
Debian sid
Ubuntu 16.04
Ubuntu 18.04
It doesn't work with the version of the library that ships with Ubuntu 14.04.
On Debian and Ubuntu, the library can be installed with:
$ sudo apt-get install libextractor3
This module looks for a shared library called libextractor.so.3.
Installation
To install it using zef (a module management tool):
$ zef install File::Metadata::Libextractor
Testing
To run the tests:
$ prove -e "raku -Ilib"
AUTHOR
Fernando Santagata
COPYRIGHT AND LICENSE
Copyright 2018 Fernando Santagata
This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.