Usage of Prest

Author: Mark Nodine
Contact: mnodine@alum.mit.edu
Revision: 762
Date: 2006-01-27 11:47:47 -0600 (Fri, 27 Jan 2006)
Copyright: This document has been placed in the public domain.

This document gives the description and usage for the prest parser. It is compiled by running prest -h using a system:: directive.


Description of prest

This program converts the DocUtils reStructuredText or Document Object Model (DOM) (aka pseudo-XML) formats into an output format. The default output format is HTML, but different formats can be specified by using different writer schemas.

Usage: prest [options] file(s)

Options:
-d Print debugging info on STDERR. May be used multiple times to get more information.
-e <encoding>
 Specifies an encoding to use for output (default 'utf8')
-h Print full usage help
-w <writer>
 Process the writer schema from <writer>.wrt (default 'html')
-D var[=val] Define a variable that affects parsing (may be multiple)
-W var[=val] Define a variable that affects a writer (may be multiple)
-V Print version info

Available writers: dom, html, index, latex, toc, xml, xref. Available encodings: ascii, ascii-ctrl, iso-8859-1, null, utf-8-strict, utf8.
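
As an illustration of the usage line above, the following is a minimal sketch of a typical invocation. The file names are hypothetical, and the sketch assumes that prest is on the PATH and writes the converted document to standard output.

```shell
# Hypothetical input file; any reStructuredText source would do.
cat > sample.rst <<'EOF'
Sample Document
===============

A paragraph with *emphasis* and ``literal`` text.
EOF

# The guard lets the snippet run harmlessly where prest is not installed.
if command -v prest >/dev/null 2>&1; then
    # HTML is the default writer; -w and -e are spelled out for clarity.
    # Assumption: the result is written to standard output.
    prest -w html -e utf8 sample.rst > sample.html
    # Dump the parsed document as pseudo-XML for inspection.
    prest -w dom sample.rst > sample.dom
fi
```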

Defines for reStructuredText parser

-D align=<0|1>
 Allow inferring right/center alignment in single line simple table cells (default is 1).
-D entry-attr=<text>
 Specifies attributes to be passed to table entry (default is ''). Note: this option can be changed on the fly within a table by using a perl directive to set $main::opt_D{entry_attr}.
-D file-insertion-enabled[=<0|1>]
 Allows include directives to include files, which may be a potential security hole. Default is 1.
-D ignore-include-errs[=<0|1>]
 Specifies that no error should be generated for missing include files. Default is 0.
-D image-exts=<ext-list>
 A comma-separated list of "ext1=ext2" pairs where any URI with extension ext1 has it mapped to ext2. This option allows using a single document source with multiple writers by using whatever figure extension is appropriate for a given writer.
-D include-ext=<text>
 A colon-separated list of extensions to check for included files. Default is ":.rst:.txt".
-D include-path=<text>
 A colon-separated list of directories for the include directive to search. The special token "<.>" represents the directory of the file containing the directive (which may not be the same as the directory in which prest is invoked, "."). Default is "<.>".
-D mstyle=<comma-separated-attr-val-list>
 A comma-separated set of attribute=value pairs to be used for the <mstyle> elements of MathML markup. Default is displaystyle=true for ascii-mathml directive.
-D nest-inline[=<0|1>]
 Specify whether to allow nesting of inline markup. There are some limitations, like strong cannot be nested within emphasis. Default is 1 (1 if specified with no value).
-D perl-path=<text>
 A colon-separated list of directories to search for Perl modules. The special token "<INC>" represents the default Perl include path. Default is "<INC>".
-D raw-enabled[=<0|1>]
 Allows raw directives to be processed, which may be a potential security hole. Default is 1.
-D report=<level>
 Set verbosity threshold; report system messages at or higher than <level> (by name or number: "info" or "1", warning/2, error/3, severe/4; also, "none" or 5). Default is 2 (warning).
-D row-attr=<text>
 Specifies attributes to be passed to table rows (default is ''). Note: this option can be changed on the fly within a table by using a perl directive to set $main::opt_D{row_attr}.
-D source=<text>
 Overrides the file name as the source.
-D table-attr=<text>
 Specifies attributes to be passed to tables (default is 'border="1" class="docutils"'). Note: this option can be changed on the fly to have tables with different characteristics by using a perl directive to set $main::opt_D{table_attr}.
-D tabstops=<num>
 Specifies that tab characters are assumed to tab out to every <num> characters (default is 8).
-D xformoff=<regexp>
 Turns off default transforms matching regexp. (Used for internal testing.)
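
Several parser defines can be combined on one command line. The sketch below is only an example under stated assumptions: the directory name common and the file manual.rst are hypothetical, and prest is assumed to write to standard output.

```shell
# Search the including file's own directory first, then ./common (hypothetical).
INCLUDE_PATH='<.>:common'

cat > manual.rst <<'EOF'
Manual
======

A short body paragraph.
EOF

if command -v prest >/dev/null 2>&1; then
    # report=3 shows only messages at level "error" or higher;
    # tabstops=4 treats tabs as advancing to every fourth column.
    prest -D include-path="$INCLUDE_PATH" \
          -D report=3 \
          -D tabstops=4 \
          -D table-attr='border="0" class="docutils"' \
          manual.rst > manual.html
fi
```

The single quotes keep the shell from interpreting the angle brackets in "<.>" and the double quotes inside the table attributes.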

Defines for reStructuredText transforms

-D generator=<0|1>
 Include a "Generated by" credit at the end of the document (default is 1).
-D date=<0|1>
 Include the date at the end of the document (default is 0).
-D docinfo-levels=<number>
 Indicates how many section levels to go down to process docinfo field lists (default is 0). (Values greater than 0 technically violate the DTD).
-D time=<0|1>
 Include the date and time at the end of the document (default is 1, overrides date if 1).
-D source-link=<0|1>
 Include a "View document source" link (default is 1).
-D source-url=<URL>
 Use the supplied <URL> verbatim for a "View document source" link; implies -D source-link=1.
-D keep-title-section
 Keeps the section intact from which the document title is taken.
-D section-subtitles
 Promote lone subsection titles to section subtitles.

Descriptions of Plug-in Directives

Documentation for plug-in directive 'code_block'

Formats a block of text as a code block. This directive depends upon the availability of the "states" program, part of the Unix "enscript" suite, to mark up the code; otherwise the code block will be returned as a simple literal block. The argument is optional and specifies the source language of the code block. If the code block is read from a file, the language will usually default correctly. The following language specifications are recognized:

ada asm awk c changelog cpp elisp fortran haskell html idl java javascript mail makefile nroff objc pascal perl postscript python scheme sh states synopsys tcl vba verilog vhdl

The directive has the following options:

:file: <filename>
Reads the code sample from a file rather than using the content block.
:color:

Specifies that "color" markup should be done. What this actually means is that the following interpreted-text roles are used for parts of the code markup:

comment A comment in the language
function-name A function name
variable-name A variable name
keyword A reserved keyword
reference-name A reference name
string A quoted string
builtin Variable names built into language
type-name Names associated with the language's type system

If any of these roles is undefined before processing the macro, a null definition is entered for them.

:level: <level>
The level of markup. <level> can be one of none, light, or heavy (default heavy). Ignored if :color: is specified.
:numbered:
Number the lines of the code block.

This directive also uses the following command-line definition:

-D code-block-states-file=<file>
 The file to be passed to the states program to specify how to do the formatting. Searches the perl include path to find the file. Default is "Text/Restructured/Directive/rst.st".
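
A minimal sketch of the directive in use follows. The file name listing.rst and the Perl fragment are hypothetical; if the "states" program is not available, the block is expected to come out as a plain literal block, as described above.

```shell
# The quoted heredoc delimiter keeps the shell from expanding $line and @words.
cat > listing.rst <<'EOF'
.. code_block:: perl
   :numbered:
   :level: light

   my @words = split ' ', $line;
   print scalar(@words), "\n";
EOF

if command -v prest >/dev/null 2>&1; then
    prest -w html listing.rst > listing.html
fi
```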

Documentation for plug-in directive 'if'

Executes its argument as a perl expression and returns its content if the perl expression is true. The content is interpreted as reStructuredText. It has no options. It processes the following defines:

-D perl='perl-code'
 Specifies some perl code that is executed prior to evaluating the first perl directive. This option can be used to specify variables on the command line; for example:

-D perl='$a=1; $b=2'

defines constants $a and $b that can be used in the perl expression.

-D trusted Must be specified for if directives to use any operators normally masked out in a Safe environment. This requirement is to prevent an if directive in a file written elsewhere from doing destructive things on your computer.
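
The following sketch shows an if directive driven by a variable set on the command line. The file name conditional.rst is hypothetical; a simple comparison like this one is assumed not to need -D trusted.

```shell
# Quoted heredoc: $a must reach prest untouched by the shell.
cat > conditional.rst <<'EOF'
.. if:: $a == 1

   This paragraph appears only when ``$a`` equals 1.
EOF

if command -v prest >/dev/null 2>&1; then
    # Single quotes protect the Perl variables from shell expansion.
    prest -D perl='$a=1; $b=2' conditional.rst > conditional.html
fi
```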

Documentation for plug-in directive 'perl'

Executes perl code and interpolates the results. The code can be contained either in the arguments or the contents section (or both). It has the following options:

:lenient:
Causes the exit code for the subprocess to be ignored.
:file: <filename>
Takes the perl code from file <filename>.
:literal:
Interpret the returned value as a literal block.

If this option is not present, the return value is interpreted based on its type. If you return a text string, the text is interpreted as reStructuredText and is parsed again. If you return an internal DOM object (or list of them), the object is included directly into the parsed DOM structure. (This latter option requires knowledge of prest internals, but is the only way to create a pending DOM object for execution at transformation time rather than parse time.)

The perl directive makes the following global variables available for use within the perl code:

$SOURCE
The name of the source file containing the perl directive.
$LINENO
The line number of the perl directive within $SOURCE.
$DIRECTIVE
The literal text of the perl directive.
@INCLUDES
Array of [filename, linenumber] pairs of files which have included this one.
$opt_<x>
The <x> option from the command line. Changing one of these variables has no effect upon the parser. If you need to change one of the -D options to affect subsequent parsing, use $PARSER->{opt}{D}{option}.
$PARSER
The Text::Restructured parser object to allow text parsing within a perl directive.
$TOP_FILE
The name of the top-level file.
$VERSION
The version of prest (0.3.36).

The following defines are processed by the perl directive:

-D perl='perl-code'
 Specifies some perl code that is executed prior to evaluating the first perl directive. This option can be used to specify variables on the command line; for example:

-D perl='$a=1; $b=2'

defines constants $a and $b that can be used in a perl block.

-D trusted Must be specified for perl directives to use any operators normally masked out in a Safe environment. This requirement is to prevent a perl directive in a file written elsewhere from doing destructive things on your computer.
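
The two ways of supplying code to the perl directive can be sketched as follows. The file name generated.rst is hypothetical. The first directive carries its code in the argument and returns a string that is parsed as reStructuredText; the second carries its code in the content block and uses :literal: so the result is kept verbatim.

```shell
# Quoted heredoc: the Perl variables and backquotes are written out unchanged.
cat > generated.rst <<'EOF'
.. perl:: "This text was produced at line $LINENO of ``$SOURCE``."

.. perl::
   :literal:

   join "\n", map { "$_ squared is " . $_ * $_ } 1 .. 3;
EOF

if command -v prest >/dev/null 2>&1; then
    prest generated.rst > generated.html
fi
```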

Documentation for plug-in directive 'system'

Executes a system (shell) command and interpolates the results. It has the following options:

:lenient:
Causes the exit code for the subprocess to be ignored.
:literal:
Interpret the returned value as a literal block.

If this option is not present, the return value is interpreted as reStructuredText and is parsed again.

Any error returned by the shell generates a level 3 error message. To see the output of a command that is expected to generate an error, do:

.. system:: <your command> 2>&1 | cat

The following defines are processed by the system directive:

-D trusted Must be specified for system directives to be processed. This requirement is to prevent a system directive in a file written elsewhere from doing destructive things on your computer.
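
As a sketch, the help text that this document is built from could be captured like this. The file name usage.rst is hypothetical; note that -D trusted is required for the directive to be processed at all.

```shell
cat > usage.rst <<'EOF'
.. system:: prest -h
   :literal:
EOF

if command -v prest >/dev/null 2>&1; then
    # Without -D trusted the system directive is not processed.
    prest -D trusted usage.rst > usage.html
fi
```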

Descriptions of Writers

Documentation for writer 'dom'

This writer dumps out the internal Document Object Model (DOM, also known as a doctree) in an indented format known as pseudo-XML. It is useful for checking the results of the parser or transformations. It recognizes the following defines:

-W nobackn Disables placing "\n\" at ends of lines that would otherwise end in whitespace.

Documentation for writer 'html'

This writer creates HTML output. It uses the following output defines:

-W attribution=<dash|parentheses|parens|none>
 Specifies how the attribution of a block quote is to be formatted (default is 'dash').
-W body-attr=<text>
 Specifies attributes to be passed to the <body> tag (default is '').
-W body-only[=<0|1>]
 Only the contents of the HTML body tag are output. Default is 0 unless specified with no value.
-W cloak-email-addresses[=<0|1>]
 Enables cloaking of email addresses to keep spambots from harvesting email addresses. Default is 0.
-W colspecs[=<0|1>]
 Output colgroup width sections in tables based upon the relative widths of the table columns in the source. Default is 1.
-W embed-stylesheet[=<0|1>]
 Embed the primary stylesheet verbatim in the HTML output if possible. Stylesheets with http: URLs are not embeddable. If prest is installed with no default URL specified, the default stylesheet is always embedded. Default is 0.
-W enum-list-prefixes[=<0|1>]
 Specify whether to keep information on prefixes and suffixes of enumerated lists in the output; can be used to specify styles based upon the prefix and suffix attributes.
-W field-limit=<num>
 Specify the maximum width (in characters) for field names in field lists. Longer fields will span an entire row of the table used to render the field list. Default is 14 characters.
-W footnote-backlinks=<0|1>
 Enable backlinks from footnotes and citations to their references if 1 (default is 1).
-W footnote-references=<superscript|brackets>
 Format for footnote references. Default is "superscript".
-W html-prolog=<0|1>
 Generate file prolog for XHTML if 0 or HTML if 1 (default is 0).
-W image-exts=<ext-list>
 A comma-separated list of "ext1=ext2" pairs where any URI with extension ext1 has it mapped to ext2. This option allows using a single document source with multiple writers by using whatever figure extension is appropriate for a given writer. (Deprecated: use "-D image-exts=" instead.)
-W link-target=<expr>
 An expression that determines what the target frame will be in link references. The link URL is available in $_ so that the target frame can depend upon the URL (default is "").
-W option-limit=<num>
 Specify the maximum width (in characters) for options in option lists. Longer options will span an entire row of the table used to render the option list. Default is 14 characters.
-W stylesheet[=<0|URL|file>]
 Specify a URL or file for the primary stylesheet in the HTML header, or 0 or 'none' to omit the primary stylesheet. A file or "file:" URL should be either a full path or a path relative to where the HTML file will be served. The stylesheet will be a link unless -W embed-stylesheet is specified and the stylesheet is embeddable. Defaults to "None".
-W stylesheet2=file
 Specify a file to be embedded in the HTML header as a secondary stylesheet.
-W target-tag=<a|span>
 The HTML tag to use for target definitions (default is "a").
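
A minimal sketch combining a few of these writer defines follows. The file names page.rst and style.css and the email address are hypothetical.

```shell
cat > page.rst <<'EOF'
Release Notes
=============

Contact maintainer@example.org with questions [1]_.

.. [1] Replies may take a few days.
EOF

if command -v prest >/dev/null 2>&1; then
    # Link (rather than embed) a stylesheet, render footnote references
    # in brackets, and cloak the email address in the output.
    prest -w html \
          -W stylesheet=style.css \
          -W footnote-references=brackets \
          -W cloak-email-addresses=1 \
          -W body-attr='class="manual"' \
          page.rst > page.html
fi
```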

Documentation for writer 'index'

This writer dumps index entries from one or more input files out in reST format. The following items are indexed:

  1. Inline targets (if -W index-inline-targets)
  2. Targets created using a target role (either the target role or some other role based upon target) specified with -W index-role-target. The index entry is either the visible text, or if there is none, the target name.
  3. Indirect targets pointing to indexed targets (if -W index-indirect-targets)

The index writer sorts indices from all input files and puts them into a table. Each row of the table contains an index entry and the location of the entry in the html version of the source file. An entry is also a reference to the definition in the corresponding html file.

This writer uses the following output defines:

-W doc-titles=<0|1>
 Put the document title in the index (default is 1).
-W file-suffix=<suffix>
 Specify a file suffix to be used for the html version of the source files (default is "html").
-W filename-ext=<ext>
 Specify an extension to the filename (e.g. "_main") so that the location of targets becomes <file><ext>.<suffix> (default is "").
-W index-inline-targets=<0|1>
 Specifies whether inline targets should be indexed (default is 1).
-W index-indirect-targets=<0|1>
 Specifies whether indirect targets pointing to indexed targets should be indexed (default is 0).
-W index-role-target[=<0|name>]
 Specifies whether targets originating from an interpreted text role should be indexed, and if so, what role name should be used (default is 0, or if specified with no name, 'target').
-W output-header=<0|1>
 Output a title and contents header (default is 1).
-W short-titles=<0|1>
 Use short titles (no section titles) in the index (default is 1).
-W title-underline=<char>
 Specify the underline character to use for the title in the header (default is '*').
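
A sketch of building one index from two source files follows. The chapter files and their inline targets are hypothetical, and the output is assumed to arrive on standard output as reST.

```shell
cat > chapter1.rst <<'EOF'
Chapter One
===========

The _`parser` reads the source text.
EOF

cat > chapter2.rst <<'EOF'
Chapter Two
===========

A _`writer` produces the output.
EOF

if command -v prest >/dev/null 2>&1; then
    # Inline targets are indexed by default; the entries point at
    # chapter1.html and chapter2.html because of file-suffix=html.
    prest -w index -W file-suffix=html -W title-underline='=' \
          chapter1.rst chapter2.rst > index.rst
fi
```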

Documentation for writer 'latex'

This writer creates an output in LaTeX format. It is definitely in alpha test state and will be reworked when the final docutils LaTeX writer becomes available.

It uses the following output defines:

-W caption=<after|before>
 Specify that a figure caption should appear after or before the figure (default is "after").
-W documentclass=<text>
 Specify documentclass for the output (default is "article").
-W documentclass-opts=<text>
 Specify the options for the documentclass command (default is "").
-W footer[=1] If 1, specify that the footer decoration generated by the RST transform should be included in the document (default is 0).
-W footnote-links[=1]
 If 1, specify that link URIs should be placed into footnotes (default is 0).
-W image-ext=<text>
 The file type to which to coerce figures (default is "eps").
-W index[=1] If 1, specify that an index should be created from inline targets and indirect references to them (default is 0).
-W inputs=<list>
 Specify a comma-separated list of files to \input.
-W max-unwrapped-colsize
 The maximum length of a string in a column of a table without forcing the width of the column and turning on wrapping for the entry. Setting it to 0 ensures that all tables will be exactly the width of the text, even if the table's natural width may be smaller. Setting it too large may result in tables that overflow the column boundary (default is 8).
-W sidebar=<margin|float>
 Whether a sidebar should be processed as a paragraph in the margin or as a floating box within the text. Processing as a margin paragraph requires that \marginparwidth have a reasonable value and may require a raw directive with a \vspace -<dist> at the top to keep it from running off the page (default is "float").

There are a number of commands defined which specify default styles for rendering various items. These default styles can be overridden by putting \renewcommand definitions for them into some file.tex and then invoking with -W inputs=file. These commands are

\styleadmonitiontitle
Argument: the title (type) of the admonition. Default is centered boldface followed by a colon.
\styleauthor
Argument: an author. Default is emphasized.
\stylefieldname
Argument: the name of a field. Default is boldface followed by a colon.
\stylelegendtitle
Argument: the word "Legend". Default is boldface followed by a colon.
\styleoption
Arguments: the option string, the option argument(s) (may be null). Default is teletype option string followed by non-breaking space and italic option argument(s).
\stylesidebartitles
Arguments: the title for a sidebar, the subtitle for a sidebar (may be null). Default is centered boldface title followed by centered italic subtitle. It is not invoked if the sidebar's title is null.
\styleterm
Arguments: description term, description classifier (may be null). Default is italic term followed by italic ": classifier" if the classifier is not null.
\styletitle
Arguments: the document's title, the document's subtitle (may be null). Default is the title followed by ":", a newline and the subtitle in a smaller font, if the subtitle is not null.
\styletopictitle
Argument: the title of a topic. Default is boldface.

It is probably easiest to define these by first creating a .tex file with the writer and then copying the definition you want to modify from the top of the generated file into your style file, changing the "\newcommand" to "\renewcommand" and supplanting the definition.
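
The override procedure described above can be sketched as follows. The file names mystyle.tex and paper.rst are hypothetical, and the small-caps definition is only an example of a replacement style, not the writer's default.

```shell
# Style file holding the overrides; the quoted heredoc keeps the backslashes intact.
cat > mystyle.tex <<'EOF'
% Render authors in small caps instead of the default emphasis.
\renewcommand{\styleauthor}[1]{\textsc{#1}}
EOF

cat > paper.rst <<'EOF'
Paper Title
===========

:Author: A. N. Author

Body text.
EOF

if command -v prest >/dev/null 2>&1; then
    # inputs=mystyle refers to mystyle.tex, following the
    # "file.tex ... -W inputs=file" convention described above.
    prest -w latex -W inputs=mystyle -W documentclass=report \
          paper.rst > paper.tex
fi
```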

Documentation for writer 'toc'

This writer creates a table of contents (TOC) for one or more reStructuredText documents. The output includes section headers from the source files organized into lists, one per top level section. The TOC entries are references to the corresponding sections in the source documents.

This writer uses the following output defines:

-W depth=<positive_integer>
 Specify the depth of TOC, with 1 corresponding to the top level sections (default is 99).
-W exclude-top=<0|1>
 Specify whether the top level section headers should be excluded from the TOC (default is 0, do not exclude). This has no effect if -W symbol='<char>' is present.
-W file-suffix=<suffix>
 Specify a file suffix to be used as the targets of TOC entries. It is either a string or a comma-separated list of "ext1=ext2" pairs where any file with extension ext1 has it mapped to ext2. This option allows generating a table of contents for multiple documents where more than one extension is needed (e.g., .html and .xhtml). Default is "html".
-W filename-ext=<ext>
 Specify an extension to the filename (e.g. "_main") so that the file location of targets becomes <file><ext>.<suffix> (default is "").
-W include-noheader=<0|1>
 If set to 1, the writer includes in the TOC the document source bare name for any document that does not have a header (default is 0, do not include).
-W parent-role=<role_name>
 If specified, items which have sublists are created as interpreted text with the given role. The role must be defined elsewhere.
-W symbol=<header_marking_character>
 If specified, the writer uses the symbol to create additional headers for the document titles (default is "", no headers created). This output define overrides -W exclude-top=1. If a document does not have a title, the document source bare name is used instead.
-W top-in-list=<0|1>
 Specify whether the top level section headers should be part of the list with other section headers (default is 0, top level header being outside of the list). This has no effect if the top level headers are excluded.
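
A sketch of generating a two-level table of contents for several documents follows. The documents intro.rst and guide.rst are hypothetical and are created here only so that the example is self-contained.

```shell
# Create two small documents, each with a title and one section.
for name in intro guide; do
    printf '%s\n=====\n\nFirst Section\n-------------\n\nSome text.\n' "$name" > "$name.rst"
done

if command -v prest >/dev/null 2>&1; then
    # depth=2 keeps the top level sections and one level below them;
    # the entries refer to intro.html and guide.html.
    prest -w toc -W depth=2 -W file-suffix=html \
          intro.rst guide.rst > toc.rst
fi
```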

Documentation for writer 'xml'

This writer dumps out the internal Document Object Model (DOM, also known as a doctree) as an XML file. It is useful for checking the results of the parser or transformations. It recognizes no defines.

Documentation for writer 'xref'

This writer exports cross reference targets defined in the source reStructuredText (reST) file. The output is in reST format and includes:

  1. All non-anonymous internal targets, exported as

    .. _<targetName>: <base-file><ext>.<suffix>#<targetID>
    

    Here <base-file> is the base file name (file name without extension).

  2. All citations, exported as:

    .. _<citationName>: <base-file><ext>.<suffix>#<citationID>
    
  3. Substitution definitions that have the same names as internal targets but do not, directly or indirectly through substitution references, define internal targets or refer to external references or footnotes.

  4. Definition of substitutions referred to by the substitutions exported according to 3. Substitution definitions are exported as defined in the source file.

  5. If -W xref-sections is specified,

    1. Targets for all section titles, exported as

      .. _<base-file>.<base-title>: <base-file><ext>.<suffix>#<titleID>
      

      Here <base-title> is the title name prior to any autonumbering.

    2. Substitution definitions for all section titles, exported as

      .. |<base-file>.<base-title>| replace:: <section-title>
      
    3. Target for the file itself, exported as

      .. _<base-file>.: <sourceFileName><ext>.<suffix>
      
    4. Substitution definition for the file itself, exported as

      .. |<base-file>.| replace:: <document-title>
      
  6. If -W xref-role-target is specified, then inline targets that are defined through an interpreted text target role are included in the output.

This writer uses the following output definitions:

-W appendix=<string>
 String to use for an appendix reference when xref-sections=1 (default "Appendix"). An appendix is a document with a "number-prefixed" title that is not a number.
-W chapter=<string>
 String to use for a chapter reference when xref-sections=1 (default "Chapter"). A chapter is a document with a "number-prefixed" title that is a number.
-W file-suffix=<suffix>
 Specify a file suffix to be used for exporting the cross reference targets (default is html).
-W filename-ext=<ext>
 Specify an extension to the filename (e.g. "_main") so that the file location of targets becomes <file><ext>.<suffix> (default is "").
-W file-path=<dir>
 Specify additional path information for the target file (default is "").
-W section=<string>
 String to use for a section reference when xref-sections=1 (default "Section").
-W sprintf=<string>
 Specify an sprintf string for formatting the output definitions (default is "%s").
-W xref-role-target[=<name>]
 Output cross-references for non-empty internal targets created from interpreted text roles (default is 0, or if specified with no name, 'target').
-W xref-sections=<0|1>
 Output cross-reference for section titles (default is 0).
-W xref-targets=<0|1>
 Output cross-reference for internal targets (default is 1).
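
A sketch of exporting cross-reference targets from a single document follows. The file guide.rst and its target name install are hypothetical. Going by the export forms listed above, the target would be written as a line of the shape ".. _install: guide.html#<targetID>"; the exact ID is chosen by prest and is not assumed here.

```shell
cat > guide.rst <<'EOF'
User Guide
==========

.. _install:

Installation
------------

Installation notes go here.
EOF

if command -v prest >/dev/null 2>&1; then
    # xref-sections=1 additionally exports targets and substitution
    # definitions for the section titles and for the file itself.
    prest -w xref -W xref-sections=1 -W file-suffix=html \
          guide.rst > guide.xref
fi
```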