pdf-tag-dump
SYNOPSIS
pdf-tag-dump.raku [options] file.pdf
Options:
   --password           password for an encrypted PDF
   --max-depth=n        maximum tag-depth to descend
   --select=XPath       nodes to be included
   --omit=tag-name      nodes to be excluded
   --root-tag=tag-name  define outer root tag
   --marks              descend into marked content
   --debug              add debugging to output
   --/atts              omit attributes in tags
   --/strict            suppress warnings
   --/style             omit root stylesheet link
DESCRIPTION
Dumps structure elements from a tagged PDF.
Produces tagged output in an XML format.
Only some PDF files are tagged. pdf-info.raku can be used to check this:
    % pdf-info.raku my-doc.pdf | grep Tagged:
    Tagged: yes
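The check and the dump can be combined in a small shell wrapper. The following is a minimal sketch, not part of the distribution: the function name dump_if_tagged is hypothetical, the script name pdf-tag-dump.raku is assumed from the title of this page, and the only output format relied on is the "Tagged: yes" line shown above.

```shell
# Hypothetical helper: dump tags only when pdf-info.raku reports a tagged PDF.
# Usage: dump_if_tagged file.pdf [pdf-tag-dump options...]
dump_if_tagged() {
    pdf="$1"
    shift
    # pdf-info.raku prints a "Tagged: yes" line for tagged PDFs
    if pdf-info.raku "$pdf" | grep 'Tagged: *yes' >/dev/null; then
        # Remaining arguments are passed through as options, before the file name
        pdf-tag-dump.raku "$@" "$pdf"
    else
        echo "$pdf: not a tagged PDF, nothing to dump" >&2
        return 1
    fi
}
```

For example, `dump_if_tagged my-doc.pdf --max-depth=3 --/atts` would run the dump with those two options only if the file is reported as tagged.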
DEPENDENCIES
This script requires the freetype6 native library and the PDF::Font::Loader Raku module to be installed on your system.
BUGS AND LIMITATIONS
Processing of links and fields is not yet implemented (TODO).