Structured

NAME

Binary::Structured - read and write binary formats defined by classes

SYNOPSIS


use Binary::Structured;

# Binary format definition
class PascalString is Binary::Structured {
	has uint8 $!length is written(method {$!string.bytes});
	has Buf $.string is read(method {self.pull($!length)}) is rw;
}

# Reading
my $parser = PascalString.new;
$parser.parse(Buf.new("\x05hello world".ords));
say $parser.string.decode("ascii"); # "hello"

# Writing
$parser.string = Buf.new("some new data".ords);
say $parser.build; # Buf:0x<0d 73 6f 6d 65 20 6e 65 77 20 64 61 74 61>

DESCRIPTION

Binary::Structured provides a way to define classes which know how to parse and emit binary data based on the class attributes. The goal of this module is to provide building blocks to describe an entire file (or well-defined section of a file), which can easily be parsed, edited, and rebuilt.

This module was inspired by the Python library construct, with the class-based representation inspired by Perl 6's NativeCall.

Types of the attributes are used whenever possible to drive behavior, with custom traits provided to add more smarts when needed to parse more formats.

These attributes are parsed in order of declaration, regardless of if they are public or private, but only attributes declared in that class directly. The readonly or rw traits are ignored for attributes. Methods are also ignored.

WARNING: As this is a pre-1.0 module, the API is subject to change between versions without deprecation.

TYPES

Perl 6 provides a wealth of native sized types. The following native types may be used on attributes for parsing and building without the help of any traits:

  • int8

  • int16

  • int32

  • uint8

  • uint16

  • uint32

These types consume 1, 2, or 4 bytes as appropriate for the type. These values are interpreted as little endian by default. Big endian representations may be indicated by using the is big-endian trait, see the traits section below.

  • Buf

Buf is another type that lends itself to representing this data. It has no obvious length and requires the read trait to consume it (see the traits section below).

Note that you can provide both is read and is written to compute the value when parsing and building, allowing you to put in arbitrary bytes at this position. See StreamPosition below if you just want to keep track of the current position.

  • StaticData

A variant of Buf, StaticData, is provided to represent bytes that are known in advance. It requires a default value of a Buf, which is used to determine the number of bytes to consume, and these bytes are checked with the default value. An exception is raised if these bytes do not match. An appropriate use of this would be the magic bytes at the beginning of many file formats, or the null terminator at the end of a CString, for example:


# Magic for PNG files
class PNGFile is Binary::Structured {
	has StaticData $.magic = Buf.new(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a);
}

  • StreamPosition

This exported class consumes no bytes, and writes no bytes. It just records the current stream position into this attribute when reading or writing so other variables can reference it later. Reader and writer traits are ignored on this attribute.

  • StreamEvent

This exported class consumes no bytes, and writes no bytes. It executes the `is read` and `is written` attributes, allowing you to put arbitrary code in the parse or build process at this point. This is a good place to put a call to `rewrite-attribute`, allowing you to update a previous value once you know what it should be.

  • Binary::Structured subclass

These structures may be nested. Provide an attribute that subclasses Binary::Structured to include another structure at this position. This inner structure takes over control until it is done parsing or building, and then the outer structure resumes parsing or building.


class Inner is Binary::Structured {
	has int8 $.value;
}
class Outer is Binary::Structured {
	has int8 $.before;
	has Inner $.inner;
	has int8 $.after;
}
# This would be able to parse Buf.new(1, 2, 3)
# $outer.before would be 1, $outer.inner.value would be 2,
# and $outer.after would be 3.

  • Array[Binary::Structured]

Multiple structures can be handled by using an Array of subclasses. Use the read trait to control when it stops trying to adding values into the array. See the traits section below for examples on controlling iteration.

METHODS

TRAITS

Traits are provided to add additional parsing control. Most of them take methods as arguments, which operate in the context of the parsed (or partially parsed) object, so you can refer to previous attributes.

is read

The is read trait controls reading of Bufs and Arrays. For Buf, return a Buf built using self.pull($count) (to ensure the position is advanced properly). $count here could be a reference to a previously parsed value, could be a constant value, or you can use a loop along with peek-one/peek to concatenate to a Buf.

For Array, return a count of bytes as an Int, or return a number of elements to read using self.pull-elements($count). Note that pull-elements does not advance the position immediately so peek is less useful here.

is written

The is written trait controls how a given attribute is constructed when build is called. It provides a way to update values based on other attributes. It's best used on things that would be private attributes, like lengths and some checksums. Since build is only called when all attributes are filled, you can refer to attributes that have not been written (unlike is read).

is big-endian

Applies to native integers (int16, int32, uint16, uint32), and indicates that this value should be read and written as a big endian value (with the most significant byte first) rather than the default of little endian.

is little-endian

Little endian is the default for numeric values, but the trait is provided for completeness.

REQUIREMENTS

  • Rakudo Perl v6.c or above (tested on 2016.08.1)

TODO

See TODO.

The Camelia image is copyright 2009 by Larry Wall. "Raku" is trademark of the Yet Another Society. All rights reserved.