reStructuredText Markup Specification

Translated from http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html

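reStructuredText is plaintext that uses simple and intuitive constructs to indicate the structure of a document. These constructs are equally easy to read in raw and processed forms. This document is itself an example of reStructuredText. The reStructuredText parser is a component of Docutils.

Simple, implicit markup is used to indicate special constructs, such as section headings, bullet lists, and emphasis. The markup used is as minimal and unobtrusive as possible. Less frequently used constructs may have more elaborate or explicit markup.

reStructuredText is applicable to documents of any length, from the very small (such as inline program documentation fragments, e.g. Python docstrings) to the quite large (such as this document).

The first section gives a quick overview of the reStructuredText syntax by example. A complete specification of the syntax follows.

Literal blocks are used for the examples throughout this document.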

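Quick Syntax Overview

A reStructuredText document is made up of body or block-level elements, and may be structured into sections. Sections are indicated through title style (underlines and optional overlines). Sections contain body elements and/or subsections. Some body elements contain further elements, such as lists containing list items, which in turn may contain paragraphs and other body elements. Others, such as paragraphs, contain text and inline markup elements.

Here are examples of body elements: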

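  • Paragraphs (and inline markup):

    Paragraphs contain text and may contain inline markup:
    *emphasis*, **strong emphasis**, `interpreted text`, ``inline
    literals``, standalone hyperlinks (http://www.python.org),
    external hyperlinks (Python_), internal cross-references
    (example_), footnote references ([1]_), citation references
    ([CIT2002]_), substitution references (|example|), and _`inline
    internal targets`.

    Paragraphs are separated by blank lines and are left-aligned.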
    
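  • Five types of lists:

    1. Bullet lists:

      - This is a bullet list.
      - Bullets can be "*", "+", or "-".

    2. Enumerated lists:

      1. This is an enumerated list.
      2. Enumerators may be arabic numbers, letters, or roman numerals.

    3. Definition lists:

      what
          Definition lists associate a term with a definition.

      how
          The term is a one-line phrase, and the definition is one
          or more paragraphs or body elements, indented relative to
          the term.

    4. Field lists:

      :what: Field lists map field names to field bodies, like
             database records.  They are often part of an extension
             syntax.

      :how: The field marker is a colon, the field name, and a
            colon.

            The field body may contain one or more body elements,
            indented relative to the field marker.

    5. Option lists, for listing command-line options:

      -a            command-line option "a"
      -b file       options can have arguments
                    and long descriptions
      --long        options can be long also
      --input=file  long options can also have
                    arguments
      /V            DOS/VMS-style options too

      There must be at least two spaces between the option and the
      description.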

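  • Literal blocks:

    Literal blocks are either indented or line-prefix-quoted blocks,
    and indicated with a double-colon ("::") at the end of the
    preceding paragraph (right here -->)::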
    
        if literal_block:
            text = 'is left as-is'
            spaces_and_linebreaks = 'are preserved'
            markup_processing = None
    
  • Block quotes:

    Block quotes consist of indented body elements:
    
        This theory, that is mine, is mine.
    
        -- Anne Elk (Miss)
    
  • Doctest blocks:

    >>> print('Python-specific usage examples; begun with ">>>"')
    Python-specific usage examples; begun with ">>>"
    >>> print('(cut and pasted from interactive Python sessions)')
    (cut and pasted from interactive Python sessions)
    
  • Two syntaxes for tables:

    1. Grid tables; complete, but complex and verbose:

      +------------------------+------------+----------+
      | Header row, column 1   | Header 2   | Header 3 |
      +========================+============+==========+
      | body row 1, column 1   | column 2   | column 3 |
      +------------------------+------------+----------+
      | body row 2             | Cells may span        |
      +------------------------+-----------------------+
      
    2. Simple tables; easy and compact, but limited:

      ====================  ==========  ==========
      Header row, column 1  Header 2    Header 3
      ====================  ==========  ==========
      body row 1, column 1  column 2    column 3
      body row 2            Cells may span columns
      ====================  ======================
      
  • Explicit markup blocks all begin with an explicit block marker, two periods and a space:

    • Footnotes:

      .. [1] A footnote contains body elements, consistently
         indented by at least 3 spaces.
      
    • Citations:

      .. [CIT2002] Just like a footnote, except the label is
         textual.
      
    • Hyperlink targets:

      .. _Python: http://www.python.org
      
      .. _example:
      
      The "_example" target above points to this paragraph.
      
    • Directives:

      .. image:: mylogo.png
      
    • Substitution definitions:

      .. |symbol here| image:: symbol.png
      
    • Comments:

      .. Comments begin with two dots and a space.  Anything may
         follow, except for the syntax of footnotes/citations,
         hyperlink targets, directives, or substitution definitions.
      

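Syntax Details

Descriptions below list "doctree elements" (document tree element names; XML DTD generic identifiers) corresponding to syntax constructs. For details on the hierarchy of elements, please see The Docutils Document Tree and the Docutils Generic DTD XML document type definition.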

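Whitespace

Spaces are recommended for indentation, but tabs may also be used. Tabs are converted to spaces, with tab stops at every 8th column.
Other whitespace characters (form feeds [chr(12)] and vertical tabs [chr(11)]) are converted to single spaces before processing.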
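
The whitespace rules above can be sketched with the Python standard library alone. This is a minimal illustration, not the Docutils implementation; the function name `normalize_whitespace` and its `tab_width` parameter are assumptions made for the example.

```python
import re


def normalize_whitespace(line: str, tab_width: int = 8) -> str:
    """Hypothetical helper mirroring the whitespace rules described above."""
    # Form feeds (chr(12)) and vertical tabs (chr(11)) each become one space.
    line = re.sub(r"[\x0b\x0c]", " ", line)
    # Tabs advance to the next tab stop (every 8th column by default),
    # so a tab is not always worth exactly eight spaces.
    return line.expandtabs(tab_width)


print(repr(normalize_whitespace("ab\tcd")))  # 'ab      cd' (six spaces)
```

Note that a tab after two characters expands to six spaces, not eight, because the tab stop is a column position.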

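Blank Lines

Blank lines are used to separate paragraphs and other elements. Multiple successive blank lines are equivalent to a single blank line, except within literal blocks (where all whitespace is preserved). Blank lines may be omitted when the markup, in conjunction with indentation, makes element separation unambiguous. The first line of a document is treated as if it is preceded by a blank line, and the last line of a document is treated as if it is followed by a blank line.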

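Indentation

Indentation is used to indicate, and is only significant in indicating, block quotes, definitions (in definition list items), and local nested content:

  • list item content (multi-line contents of list items, and multiple body elements within a list item,
    including nested lists),
  • literal block content, and
  • explicit markup block content.

Any text whose indentation is less than that of the current level (i.e., unindented text or "dedents") ends the current level of indentation.

Since all indentation is significant, the level of indentation must be consistent. For example, indentation is the sole markup indicator for block quotes: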

This is a top-level paragraph.

    This paragraph belongs to a first-level block quote.

    Paragraph 2 of the first-level block quote.

Multiple levels of indentation within a block quote will result in more complex structures:

This is a top-level paragraph.

    This paragraph belongs to a first-level block quote.

        This paragraph belongs to a second-level block quote.

Another top-level paragraph.

        This paragraph belongs to a second-level block quote.

    This paragraph belongs to a first-level block quote.  The
    second-level block quote above is inside this first-level
    block quote.

When a paragraph or other construct consists of more than one line of text, the lines must be left-aligned:

This is a paragraph.  The lines of
this paragraph are aligned at the left.

    This paragraph has problems.  The
lines are not left-aligned.  In addition
  to potential misinterpretation, warning
    and/or error messages will be generated
  by the parser.

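Several constructs begin with a marker, and the body of the construct must be indented relative to the marker. For constructs using simple markers (bullet lists, enumerated lists, footnotes, citations, hyperlink targets, directives, and comments), the level of indentation of the body is determined by the position of the first line of text, which begins on the same line as the marker. For example, bullet list bodies must be indented by at least two columns relative to the left edge of the bullet: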

- This is the first line of a bullet list
  item's paragraph.  All lines must align
  relative to the first line.  [1]_

      This indented paragraph is interpreted
      as a block quote.

Because it is not sufficiently indented,
this paragraph does not belong to the list
item.

.. [1] Here's a footnote.  The second line is aligned
   with the beginning of the footnote label.  The ".."
   marker is what determines the indentation.

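For constructs using complex markers (field lists and option lists), where the marker may contain arbitrary text, the indentation of the first line after the marker determines the left edge of the body. For example, field lists may have very long markers (containing the field names):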

:Hello: This field has a short field name, so aligning the field
        body with the first line is feasible.

:Number-of-African-swallows-required-to-carry-a-coconut: It would
    be very difficult to align the field body with the left edge
    of the first line.  It may even be preferable not to begin the
    body on the same line as the marker.

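Escaping Mechanism

The character set universally available to plaintext documents, 7-bit ASCII, is limited. No matter what characters are used for markup, they will already have multiple meanings in written text. Therefore markup characters will sometimes appear in text without being intended as markup. Any serious markup system requires an escaping mechanism to override the default meaning of the characters used for the markup. In reStructuredText we use the backslash, commonly used as an escaping character in other domains.

A backslash followed by any character (except whitespace characters in non-URI contexts) escapes that character. The escaped character represents the character itself, and is prevented from playing a role in any markup interpretation. The backslash is removed from the output. A literal backslash is represented by two backslashes in a row (the first backslash escapes the second).

In non-URI contexts, backslash-escaped whitespace characters are removed from the document. This allows for character-level inline markup.

In URIs, backslash-escaped whitespace represents a single space.

There are two contexts in which backslashes have no special meaning: literal blocks and inline literals. In these contexts, a single backslash represents a literal backslash, without having to double up.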
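
As a rough sketch of how these backslash rules affect the output text, the following standard-library Python function removes escapes from a string. It is not the Docutils implementation: the name `unescape` and the `in_uri` flag are assumptions for illustration, and the sketch covers only the output text, not the suppression of markup recognition.

```python
def unescape(text: str, in_uri: bool = False) -> str:
    """Hypothetical helper: resolve backslash escapes outside literal contexts."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt.isspace():
                # Escaped whitespace: removed in ordinary text,
                # kept as a single space inside a URI.
                if in_uri:
                    out.append(" ")
            else:
                # Any other escaped character stands for itself
                # (so two backslashes yield one literal backslash).
                out.append(nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(unescape(r"\*not emphasis\*"))  # *not emphasis*
```

Escaped whitespace is what makes character-level markup possible: in `H\ :sub:`2`\ O` the escaped spaces separate the markup from the surrounding letters in the source but vanish from the output.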

Please note that the reStructuredText specification and parser do not address the issue of the representation or
extraction of text input (how and in what form the text actually reaches the parser). Backslashes and
other characters may serve a character-escaping purpose in certain contexts and must be dealt with
appropriately. For example, Python uses backslashes in strings to escape certain characters, but not others. The
simplest solution when backslashes appear in Python docstrings is to use raw docstrings:

r"""This is a raw docstring.  Backslashes (\) are not touched."""

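Reference Names

Simple reference names are single words consisting of alphanumerics plus isolated (no two adjacent) internal hyphens, underscores, periods, colons and plus signs; no whitespace or other characters are allowed. Footnote labels (Footnotes & Footnote References), citation labels (Citations & Citation References), interpreted text roles, and some hyperlink references use the simple reference name syntax.

Reference names using punctuation or whose names are phrases (two or more space-separated words) are called "phrase-references". Phrase-references are expressed by enclosing the phrase in backquotes ("`") and treating the backquoted text as a reference name: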

Want to learn about `my favorite programming language`_?

.. _my favorite programming language: http://www.python.org

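Simple reference names may also optionally use backquotes.

Reference names are whitespace-neutral and case-insensitive. When resolving reference names internally:

  • whitespace is normalized (one or more spaces, horizontal or vertical tabs, newlines, carriage returns, or form feeds are interpreted as a single space), and
  • case is normalized (all alphabetic characters are converted to lowercase).

For example, the following hyperlink references are equivalent: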

- `A HYPERLINK`_
- `a    hyperlink`_
- `A
  Hyperlink`_
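
The two normalization steps can be sketched in a few lines of Python. The helper name `normalize_name` is an assumption made for this example and is not taken from the specification.

```python
def normalize_name(name: str) -> str:
    """Hypothetical helper: case- and whitespace-normalize a reference name."""
    # str.split() with no argument splits on any run of whitespace
    # (spaces, tabs, newlines, carriage returns, form feeds, vertical tabs),
    # so joining with one space collapses each run to a single space.
    return " ".join(name.lower().split())


names = ["A HYPERLINK", "a    hyperlink", "A\n  Hyperlink"]
print({normalize_name(n) for n in names})  # {'a hyperlink'}
```

All three spellings collapse to the same key, which is why they resolve to the same target.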

Hyperlinks, footnotes, and citations all share the same namespace for reference names. The labels of citations (simple reference names) and manually-numbered footnotes (numbers) are entered into the same database as other hyperlink names. This means that a footnote (defined as “.. [1]”) which can be referred to by a footnote reference ([1]_), can also be referred to by a plain hyperlink reference (1_). Of course, each type of reference (hyperlink, footnote, citation) may be processed and rendered differently. Some care should be taken to avoid reference name conflicts.

Document Structure

Document

Doctree element: document.

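The top-level element of a parsed reStructuredText document is the “document” element. After initial
parsing, the document element is a simple container for a document fragment, consisting of body elements,
transitions, and sections, but lacking a document title or other bibliographic elements. The code that
calls the parser may choose to run one or more optional post-parse transforms, rearranging the document
fragment into a complete document with a title and possibly other metadata elements (author, date, etc.;
see Bibliographic Fields).

Specifically, there is no way to indicate a document title and subtitle explicitly in reStructuredText
[1]. Instead, a lone top-level section title (see Sections below) can be treated as the document title.
Similarly, a lone second-level section title immediately after the “document title” can become the
document subtitle. The rest of the sections are then lifted up a level or two. See the DocTitle transform
for details.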

[1] The title configuration
setting can set a document title that does not become part of the document body.

Sections

Doctree elements: section, title.

Sections are identified through their titles, which are marked up with adornment: “underlines” below the
title text, or underlines and matching “overlines” above the title. An underline/overline is a single
repeated punctuation character that begins in column 1 and forms a line extending at least as far as the
right edge of the title text. Specifically, an underline/overline character may be any non-alphanumeric
printable 7-bit ASCII character [2]. When an overline is used, the length and character
used must match the underline. Underline-only adornment styles are distinct from overline-and-underline
styles that use the same character. There may be any number of levels of section titles, although some
output formats may have limits (HTML has 6 levels).

[2] The following are all valid section title adornment characters:

! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~

Some characters are more suitable than others. The following are recommended:

= - ` : . ' " ~ ^ _ * + #
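
The adornment rule (one repeated non-alphanumeric printable 7-bit ASCII character, at least as long as the title text) can be checked with a short Python sketch. The helper `is_valid_underline` is hypothetical and not part of Docutils; it also assumes that `len()` is an adequate width measure, which holds only for narrow characters.

```python
import string

# string.punctuation is exactly the set of non-alphanumeric printable
# 7-bit ASCII characters listed in the footnote above (32 characters).
ADORNMENT_CHARS = set(string.punctuation)


def is_valid_underline(title: str, underline: str) -> bool:
    """Hypothetical check of an underline against its title text."""
    underline = underline.rstrip()
    if not underline or underline[0] not in ADORNMENT_CHARS:
        return False
    if len(set(underline)) != 1:
        # The line must repeat a single character.
        return False
    # The underline must reach at least the right edge of the title.
    return len(underline) >= len(title.rstrip())


print(is_valid_underline("Section Title", "=" * 13))  # True
print(is_valid_underline("Section Title", "===="))    # False
```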

Rather than imposing a fixed number and order of section title adornment styles, the order enforced will be
the order as encountered. The first style encountered will be an outermost title (like HTML H1), the second
style will be a subtitle, the third will be a subsubtitle, and so on.

Below are examples of section title styles:

===============
 Section Title
===============

---------------
 Section Title
---------------

Section Title
=============

Section Title
-------------

Section Title
`````````````

Section Title
'''''''''''''

Section Title
.............

Section Title
~~~~~~~~~~~~~

Section Title
*************

Section Title
+++++++++++++

Section Title
^^^^^^^^^^^^^

When a title has both an underline and an overline, the title text may be inset, as in the first two
examples above. This is merely aesthetic and not significant. Underline-only title text may not be
inset.

A blank line after a title is optional. All text blocks up to the next title of the same or higher level are
included in a section (or subsection, etc.).

All section title styles need not be used, nor need any specific section title style be used. However, a
document must be consistent in its use of section titles: once a hierarchy of title styles is established,
sections must use that hierarchy.

Each section title automatically generates a hyperlink target pointing to the section. The text of the
hyperlink target (the “reference name”) is the same as that of the section title. See Implicit Hyperlink Targets for a
complete description.

Sections may contain body elements, transitions, and nested sections.

Transitions

Doctree element: transition.

Instead of subheads, extra space or a type ornament between paragraphs may be used to mark text
divisions or to signal changes in subject or emphasis.

(The Chicago Manual of Style, 14th edition, section 1.80)

Transitions are commonly seen in novels and short fiction, as a gap spanning one or more lines, with or
without a type ornament such as a row of asterisks. Transitions separate other body elements. A transition
should not begin or end a section or document, nor should two transitions be immediately adjacent.

The syntax for a transition marker is a horizontal line of 4 or more repeated punctuation characters. The
syntax is the same as section title underlines without title text. Transition markers require blank lines
before and after:

Para.

----------

Para.

Unlike section title underlines, no hierarchy of transition markers is enforced, nor do differences in
transition markers accomplish anything. It is recommended that a single consistent style be used.
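
A minimal Python sketch of the transition rule follows: a line of four or more identical punctuation characters with a blank line on each side. The function `find_transitions` is a hypothetical helper, and the sketch deliberately ignores the interaction with section titles that a real parser has to resolve.

```python
import string


def find_transitions(lines):
    """Return the indexes of lines that look like transition markers."""
    found = []
    for i, line in enumerate(lines):
        text = line.rstrip()
        is_marker = (
            len(text) >= 4
            and text[0] in string.punctuation
            and len(set(text)) == 1
        )
        # Transition markers require blank lines before and after.
        blank_before = i > 0 and not lines[i - 1].strip()
        blank_after = i + 1 < len(lines) and not lines[i + 1].strip()
        if is_marker and blank_before and blank_after:
            found.append(i)
    return found


print(find_transitions(["Para.", "", "----------", "", "Para."]))  # [2]
```

A line of dashes directly under text is not reported, since without the blank line above it reads as a section title underline instead.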

The processing system is free to render transitions in output in any way it likes. For example, horizontal
rules (<hr>) in HTML output would be an obvious choice.

Body Elements

Paragraphs

Doctree element: paragraph.

Paragraphs consist of blocks of left-aligned text with no markup indicating any other body element. Blank
lines separate paragraphs from each other and from other body elements. Paragraphs may contain inline markup.

Syntax diagram:

+------------------------------+
| paragraph                    |
|                              |
+------------------------------+

+------------------------------+
| paragraph                    |
|                              |
+------------------------------+

Bullet Lists

Doctree elements: bullet_list, list_item.

A text block which begins with a “*”, “+”, “-”, “•”, “‣”, or “⁃”, followed by whitespace, is a bullet list
item (a.k.a. “unordered” list item). List item bodies must be left-aligned and indented relative to the
bullet; the text immediately after the bullet determines the indentation. For example:

- This is the first bullet list item.  The blank line above the
  first list item is required; blank lines between list items
  (such as below this paragraph) are optional.

- This is the first paragraph in the second item in the list.

  This is the second paragraph in the second item in the list.
  The blank line above this paragraph is required.  The left edge
  of this paragraph lines up with the paragraph above, both
  indented relative to the bullet.

  - This is a sublist.  The bullet lines up with the left edge of
    the text blocks above.  A sublist is a new list so requires a
    blank line above and below.

- This is the third item of the main list.

This paragraph is not part of the list.

Here are examples of incorrectly formatted bullet lists:

- This first line is fine.
A blank line is required between list items and paragraphs.
(Warning)

- The following line appears to be a new sublist, but it is not:
  - This is a paragraph continuation, not a sublist (since there's
    no blank line).  This line is also incorrectly indented.
  - Warnings may be issued by the implementation.

Syntax diagram:

+------+-----------------------+
| "- " | list item             |
+------| (body elements)+      |
       +-----------------------+

Enumerated Lists

Doctree elements: enumerated_list, list_item.

Enumerated lists (a.k.a. “ordered” lists) are similar to bullet lists, but use enumerators instead of
bullets. An enumerator consists of an enumeration sequence member and formatting, followed by whitespace.
The following enumeration sequences are recognized:

  • arabic numerals: 1, 2, 3, … (no upper limit).
  • uppercase alphabet characters: A, B, C, …, Z.
  • lower-case alphabet characters: a, b, c, …, z.
  • uppercase Roman numerals: I, II, III, IV, …, MMMMCMXCIX (4999).
  • lowercase Roman numerals: i, ii, iii, iv, …, mmmmcmxcix (4999).

In addition, the auto-enumerator, “#”, may be used to automatically enumerate a list. Auto-enumerated lists
may begin with explicit enumeration, which sets the sequence. Fully auto-enumerated lists use arabic
numerals and begin with 1. (Auto-enumerated lists are new in Docutils 0.3.8.)

The following formatting types are recognized:

  • suffixed with a period: “1.”, “A.”, “a.”, “I.”, “i.”.
  • surrounded by parentheses: “(1)”, “(A)”, “(a)”, “(I)”, “(i)”.
  • suffixed with a right-parenthesis: “1)”, “A)”, “a)”, “I)”, “i)”.

While parsing an enumerated list, a new list will be started whenever:

  • An enumerator is encountered which does not have the same format and sequence type as the current
    list (e.g. “1.”, “(a)” produces two separate lists).
  • The enumerators are not in sequence (e.g., “1.”, “3.” produces two separate lists).

It is recommended that the enumerator of the first list item be ordinal-1 (“1”, “A”, “a”, “I”, or “i”).
Although other start-values will be recognized, they may not be supported by the output format. A level-1
[info] system message will be generated for any list beginning with a non-ordinal-1 enumerator.

Lists using Roman numerals must begin with “I”/“i” or a multi-character value, such as “II” or “XV”. Any
other single-character Roman numeral (“V”, “X”, “L”, “C”, “D”, “M”) will be interpreted as a letter of the
alphabet, not as a Roman numeral. Likewise, lists using letters of the alphabet may not begin with “I”/“i”,
since these are recognized as Roman numeral 1.
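
The ambiguity rule for the first enumerator of a list can be sketched in Python. This is an illustration under stated assumptions, not Docutils code: `to_roman`, `first_enumerator_type`, and the returned type names are all chosen for this example.

```python
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert 1..4999 to an uppercase Roman numeral."""
    if not 1 <= number <= 4999:
        raise ValueError("Roman enumerators cover 1 to 4999")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


# Every well-formed Roman numeral in the supported range.
VALID_ROMAN = {to_roman(n) for n in range(1, 5000)}


def first_enumerator_type(text: str):
    """Classify the sequence member of a list's FIRST enumerator."""
    if text == "#":
        return "auto"
    if text.isascii() and text.isdigit():
        return "arabic"
    if not (text.isascii() and text.isalpha()):
        return None
    if not (text.isupper() or text.islower()):
        return None  # mixed case is not a recognized sequence
    case = "upper" if text.isupper() else "lower"
    if len(text) == 1:
        # Only "I"/"i" counts as Roman; other single letters are alphabetic.
        return case + ("roman" if text in "Ii" else "alpha")
    return case + "roman" if text.upper() in VALID_ROMAN else None


print(to_roman(4999))               # MMMMCMXCIX
print(first_enumerator_type("V"))   # upperalpha
print(first_enumerator_type("XV"))  # upperroman
```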

The second line of each enumerated list item is checked for validity. This is to prevent ordinary paragraphs
from being mistakenly interpreted as list items, when they happen to begin with text identical to
enumerators. For example, this text is parsed as an ordinary paragraph:

A. Einstein was a really
smart dude.

However, ambiguity cannot be avoided if the paragraph consists of only one line. This text is parsed as an
enumerated list item:

A. Einstein was a really smart dude.

If a single-line paragraph begins with text identical to an enumerator (“A.”, “1.”, “(b)”, “I)”, etc.), the
first character will have to be escaped in order to have the line parsed as an ordinary paragraph:

\A. Einstein was a really smart dude.

Examples of nested enumerated lists:

1. Item 1 initial text.

   a) Item 1a.
   b) Item 1b.

2. a) Item 2a.
   b) Item 2b.

Example syntax diagram:

+-------+----------------------+
| "1. " | list item            |
+-------| (body elements)+     |
        +----------------------+

Definition Lists

Doctree elements: definition_list, definition_list_item, term, classifier, definition.

Each definition list item contains a term, optional classifiers, and a definition. A term is a simple
one-line word or phrase. Optional classifiers may follow the term on the same line, each after an inline
“ : ” (space, colon, space). A definition is a block indented relative to the term, and may contain multiple
paragraphs and other body elements. There may be no blank line between a term line and a definition block
(this distinguishes definition lists from block quotes). Blank lines are required before the first and
after the last definition list item, but are optional in-between. For example:

term 1
    Definition 1.

term 2
    Definition 2, paragraph 1.

    Definition 2, paragraph 2.

term 3 : classifier
    Definition 3.

term 4 : classifier one : classifier two
    Definition 4.

Inline markup is parsed in the term line before the classifier delimiter (“ : ”) is recognized. The
delimiter will only be recognized if it appears outside of any inline markup.
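
For term lines that contain no inline markup, the classifier syntax reduces to a split on the space-colon-space delimiter. The Python sketch below shows only that simplified case; `split_term_line` is a hypothetical helper, and it does not implement the rule that delimiters inside inline markup are ignored.

```python
def split_term_line(line: str):
    """Split a plain term line into (term, [classifiers])."""
    # A colon without surrounding spaces is part of the text, not a delimiter.
    parts = [part.strip() for part in line.split(" : ")]
    return parts[0], parts[1:]


print(split_term_line("term 4 : classifier one : classifier two"))
# ('term 4', ['classifier one', 'classifier two'])
```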

A definition list may be used in various ways, including:

  • As a dictionary or glossary. The term is the word itself, a classifier may be used to indicate the
    usage of the term (noun, verb, etc.), and the definition follows.
  • To describe program variables. The term is the variable name, a classifier may be used to indicate
    the type of the variable (string, integer, etc.), and the definition describes the variable’s use in
    the program. This usage of definition lists supports the classifier syntax of Grouch,
    a system for describing and enforcing a Python object schema.

Syntax diagram:

+----------------------------+
| term [ " : " classifier ]* |
+--+-------------------------+--+
   | definition                 |
   | (body elements)+           |
   +----------------------------+

Field Lists

Doctree elements: field_list, field, field_name, field_body.

Field lists are used as part of an extension syntax, such as options for directives,
or database-like records meant for further processing. They may also be used for two-column table-like
structures resembling database records (label & data pairs). Applications of reStructuredText may
recognize field names and transform fields or field bodies in certain contexts. For examples, see
Bibliographic Fields below, or the “image” and “meta” directives in reStructuredText Directives.

Inline markup is parsed in field names, but care must be taken when using
interpreted text with explicit roles in field names: the role must be a
suffix to the interpreted text. Field names are case-insensitive when further processed or transformed.
The field name, along with a single colon prefix and suffix, together form the field marker. The field
marker is followed by whitespace and the field body. The field body may contain multiple body elements,
indented relative to the field marker. The first line after the field name marker determines the
indentation of the field body. For example:

:Date: 2001-08-16
:Version: 1
:Authors: - Me
          - Myself
          - I
:Indentation: Since the field marker may be quite long, the second
   and subsequent lines of the field body do not have to line up
   with the first line, but they must be indented relative to the
   field name marker, and they must line up with each other.
:Parameter i: integer

The interpretation of individual words in a multi-word field name is up to the application. The application
may specify a syntax for the field name. For example, second and subsequent words may be treated as
“arguments”, quoted phrases may be treated as a single argument, and direct support for the “name=value”
syntax may be added.

Standard RFC822 headers
cannot be used for this construct because they are ambiguous. A word followed by a colon at the beginning of
a line is common in written text. However, in well-defined contexts such as when a field list invariably
occurs at the beginning of a document (PEPs and email messages), standard RFC822 headers could be used.

Syntax diagram (simplified):

+--------------------+----------------------+
| ":" field name ":" | field body           |
+-------+------------+                      |
        | (body elements)+                  |
        +-----------------------------------+

[3] Up to Docutils 0.14, field markers were not recognized when containing a colon.

Bibliographic Fields

Doctree elements: docinfo, author, authors, organization, contact, version, status, date, copyright,
field, topic.

When a field list is the first non-comment element in a document (after the document title, if there is
one), it may have its fields transformed to document bibliographic data. This bibliographic data
corresponds to the front matter of a book, such as the title page and copyright page.

Certain registered field names (listed below) are recognized and transformed to the corresponding
doctree elements, most becoming child elements of the “docinfo” element. No ordering is required of
these fields, although they may be rearranged to fit the document structure, as noted. Unless otherwise
indicated below, each of the bibliographic elements’ field bodies may contain a single paragraph only.
Field bodies may be checked for RCS keywords and
cleaned up. Any unrecognized fields will remain as generic fields in the docinfo element.

The registered bibliographic field names and their corresponding doctree elements are as follows:

  • Field name “Author”: author element.
  • “Authors”: authors.
  • “Organization”: organization.
  • “Contact”: contact.
  • “Address”: address.
  • “Version”: version.
  • “Status”: status.
  • “Date”: date.
  • “Copyright”: copyright.
  • “Dedication”: topic.
  • “Abstract”: topic.

The “Authors” field may contain either: a single paragraph consisting of a list of authors, separated by
“;” or “,”; or a bullet list whose elements each contain a single paragraph per author. “;” is checked
first, so “Doe, Jane; Doe, John” will work. In some languages (e.g. Swedish), there is no
singular/plural distinction between “Author” and “Authors”, so only an “Authors” field is provided, and
a single name is interpreted as an “Author”. If a single name contains a comma, end it with a semicolon
to disambiguate: “:Authors: Doe, Jane;”.
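
The separator precedence for a single-paragraph “Authors” field (semicolon first, then comma) can be sketched as follows. `split_authors` is a hypothetical helper written for this example; it does not cover the bullet-list form or language-specific separators.

```python
def split_authors(body: str):
    """Split the text of a one-paragraph "Authors" field into names."""
    # ";" is checked before "," so names such as "Doe, Jane" survive intact.
    for separator in (";", ","):
        if separator in body:
            names = [name.strip() for name in body.split(separator)]
            # A trailing separator leaves an empty piece; drop it.
            return [name for name in names if name]
    return [body.strip()]


print(split_authors("Doe, Jane; Doe, John"))  # ['Doe, Jane', 'Doe, John']
print(split_authors("Doe, Jane;"))            # ['Doe, Jane']
```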

The “Address” field is for a multi-line surface mailing address. Newlines and whitespace will be
preserved.

The “Dedication” and “Abstract” fields may contain arbitrary body elements. Only one of each is allowed.
They become topic elements with “Dedication” or “Abstract” titles (or language equivalents) immediately
following the docinfo element.

This field-name-to-element mapping can be replaced for other languages. See the DocInfo transform implementation
documentation for details.

Unregistered/generic fields may contain one or more paragraphs or arbitrary body elements. The field
name is also used as a “classes” attribute value after being converted into a valid identifier form.

RCS Keywords

Bibliographic fields recognized by the parser are normally checked for RCS [4] keywords and cleaned up
[5]. RCS keywords may be entered into source files as “$keyword$”, and once stored under RCS or
CVS [6], they are expanded to “$keyword: expansion text $”. For example, a “Status” field will be
transformed to a “status” element:

:Status: $keyword: expansion text $

[4] Revision Control System.
[5] RCS keyword processing can be turned off (unimplemented).
[6] Concurrent Versions System. CVS uses the same keywords as RCS.

Processed, the “status” element’s text will become simply “expansion text”. The dollar sign delimiters
and leading RCS keyword name are removed.
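
The cleanup described above amounts to stripping the dollar-sign delimiters and the keyword name. A minimal Python sketch, assuming a single generic pattern (the real transform may treat some keywords specially), could look like this; `clean_rcs_keyword` is a hypothetical name.

```python
import re

# "$keyword: expansion text $" -> capture only the expansion text.
RCS_KEYWORD = re.compile(r"^\$[A-Za-z]+: (.+) \$$")


def clean_rcs_keyword(text: str) -> str:
    """Return the expansion text of an expanded RCS keyword, else the input."""
    match = RCS_KEYWORD.match(text.strip())
    return match.group(1) if match else text


print(clean_rcs_keyword("$keyword: expansion text $"))  # expansion text
```

Field bodies that are not expanded keywords pass through unchanged.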

The RCS keyword processing only kicks in when the field list is in bibliographic context (first
non-comment construct in the document, after a document title if there is one).

Option Lists

Doctree elements: option_list, option_list_item, option_group, option, option_string, option_argument,
description.

Option lists are two-column lists of command-line options and descriptions, documenting a program’s options.
For example:

-a         Output all.
-b         Output both (this description is
           quite long).
-c arg     Output just arg.
--long     Output all day long.

-p         This option has two paragraphs in the description.
           This is the first.

           This is the second.  Blank lines may be omitted between
           options (as above) or left in (as here and below).

--very-long-option  A VMS-style option.  Note the adjustment for
                    the required two spaces.

--an-even-longer-option
           The description can also start on the next line.

-2, --two  This option has two variants.

-f FILE, --file=FILE  These two options are synonyms; both have
                      arguments.

/V         A VMS/DOS-style option.

There are several types of options recognized by reStructuredText:

  • Short POSIX options consist of one dash and an option letter.
  • Long POSIX options consist of two dashes and an option word; some systems use a single dash.
  • Old GNU-style “plus” options consist of one plus and an option letter (“plus” options are deprecated
    now, their use discouraged).
  • DOS/VMS options consist of a slash and an option letter or word.

Please note that both POSIX-style and DOS/VMS-style options may be used by DOS or Windows software. These
and other variations are sometimes used mixed together. The names above have been chosen for convenience
only.

The syntax for short and long POSIX options is based on the syntax supported by Python’s getopt.py module,
which implements an option parser similar to the GNU libc getopt_long() function but with some
restrictions. There are many variant option systems, and reStructuredText option lists do not support
all of them.

Although long POSIX and DOS/VMS option words may be allowed to be truncated by the operating system or the
application when used on the command line, reStructuredText option lists do not show or support this with
any special syntax. The complete option word should be given, supported by notes about truncation if and
when applicable.

Options may be followed by an argument placeholder, whose role and syntax should be explained in the
description text. Either a space or an equals sign may be used as a delimiter between options and option
argument placeholders; short options (“-” or “+” prefix only) may omit the delimiter. Option arguments may
take one of two forms:

  • Begins with a letter ([a-zA-Z]) and
    subsequently consists of letters, numbers, underscores and hyphens ([a-zA-Z0-9_-]).
  • Begins with an open-angle-bracket (<) and ends with a
    close-angle-bracket (>); any characters except angle brackets
    are allowed internally.
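
The two argument forms translate directly into a regular expression. The sketch below is an illustration only; the names `OPTION_ARGUMENT` and `is_option_argument` are assumptions and are not taken from Docutils.

```python
import re

OPTION_ARGUMENT = re.compile(
    r"[a-zA-Z][a-zA-Z0-9_-]*"  # form 1: a letter, then letters/digits/_/-
    r"|<[^<>]+>"               # form 2: anything but angle brackets inside <...>
)


def is_option_argument(text: str) -> bool:
    """Check whether text is a valid option argument placeholder."""
    return OPTION_ARGUMENT.fullmatch(text) is not None


print(is_option_argument("FILE"))       # True
print(is_option_argument("<in file>"))  # True
print(is_option_argument("2files"))     # False
```

The angle-bracket form is the one to use when a placeholder needs spaces or other punctuation.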

Multiple option “synonyms” may be listed, sharing a single description. They must be separated by
comma-space.

There must be at least two spaces between the option(s) and the description. The description may contain
multiple body elements. The first line after the option marker determines the indentation of the
description. As with other types of lists, blank lines are required before the first option list item and
after the last, but are optional between option entries.

Syntax diagram (simplified):

+----------------------------+-------------+
| option [" " argument] "  " | description |
+-------+--------------------+             |
        | (body elements)+                 |
        +----------------------------------+

Literal Blocks

Doctree element: literal_block.

A paragraph consisting of two colons (“::”) signifies that the following text block(s) comprise a literal
block. The literal block must either be indented or quoted (see below). No markup processing is done within
a literal block. It is left as-is, and is typically rendered in a monospaced typeface:

This is a typical paragraph.  An indented literal block follows.

::

    for a in [5,4,3,2,1]:   # this is program code, shown as-is
        print(a)
    print("it's...")
    # a literal block continues until the indentation ends

This text has returned to the indentation of the first paragraph,
is outside of the literal block, and is therefore treated as an
ordinary paragraph.

The paragraph containing only “::” will be completely removed from the output; no empty paragraph will
remain.

As a convenience, the “::” is recognized at the end of any paragraph. If immediately preceded by whitespace,
both colons will be removed from the output (this is the “partially minimized” form). When text immediately
precedes the “::”, one colon will be removed from the output, leaving only one colon visible (i.e.,
“::” will be replaced by “:”; this is the “fully minimized” form).

In other words, these are all equivalent (please pay attention to the colons after “Paragraph”):

  1. Expanded form:

    Paragraph:
    
    ::
    
        Literal block
    
  2. Partially minimized form:

    Paragraph: ::
    
        Literal block
    
  3. Fully minimized form:

    Paragraph::
    
        Literal block
    

All whitespace (including line breaks, but excluding minimum indentation for indented literal blocks) is
preserved. Blank lines are required before and after a literal block, but these blank lines are not included
as part of the literal block.
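
The three forms of the “::” marker differ only in what remains visible of the introducing paragraph. The following Python sketch shows that mapping; `render_paragraph_text` is a hypothetical helper, and escaped colons are not handled.

```python
def render_paragraph_text(text: str):
    """Return (visible_text, starts_literal_block) for a paragraph's text.

    visible_text is None when the paragraph is removed from the output.
    """
    stripped = text.rstrip()
    if not stripped.endswith("::"):
        return text, False
    if stripped == "::":
        # Expanded form: the paragraph holding only "::" disappears.
        return None, True
    before = stripped[:-2]
    if before[-1].isspace():
        # Partially minimized form: both colons are removed.
        return before.rstrip(), True
    # Fully minimized form: "::" is replaced by a single ":".
    return before + ":", True


for source in ("Paragraph: ::", "Paragraph::"):
    print(render_paragraph_text(source))  # ('Paragraph:', True) both times
```

Both minimized forms leave the reader with “Paragraph:”, which is exactly what the expanded form shows in its own separate paragraph.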

Indented Literal Blocks

Indented literal blocks are indicated by indentation relative to the surrounding text (leading
whitespace on each line). The minimum indentation will be removed from each line of an indented literal
block. The literal block need not be contiguous; blank lines are allowed between sections of indented
text. The literal block ends with the end of the indentation.
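
Removing the minimum indentation while keeping relative indentation is what the standard library's `textwrap.dedent` does, so it serves as a convenient approximation of this rule. The wrapper name `strip_minimum_indent` is an assumption for the example, and this is not how Docutils itself is implemented.

```python
import textwrap


def strip_minimum_indent(block: str) -> str:
    """Remove the indentation shared by all non-blank lines of a block."""
    # Blank lines do not count toward the common indentation,
    # so a literal block may contain them freely.
    return textwrap.dedent(block)


source = "    for a in [5, 4, 3, 2, 1]:\n        print(a)\n\n    print(\"it's...\")\n"
print(strip_minimum_indent(source))
```

The nested `print(a)` line keeps its extra four spaces; only the indentation common to the whole block is removed.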

Syntax diagram:

+------------------------------+
| paragraph                    |
| (ends with "::")             |
+------------------------------+
   +---------------------------+
   | indented literal block    |
   +---------------------------+

Quoted Literal Blocks

Quoted literal blocks are unindented contiguous blocks of text where each line begins with the same
non-alphanumeric printable 7-bit ASCII character [7]. A blank line ends a quoted
literal block. The quoting characters are preserved in the processed document.

[7] The following are all valid quoting characters:

! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~

Note that these are the same characters as are valid for title adornment of sections.

Possible uses include literate programming in Haskell and email quoting:

John Doe wrote::

>> Great idea!
>
> Why didn't I think of that?

You just did!  ;-)
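
The quoting rule can be checked mechanically. The sketch below assumes that Python's `string.punctuation` is the set of valid quoting characters (it contains the same 32 ASCII characters as the list in [7]); `quoting_character` is a hypothetical helper, not a Docutils function.

```python
import string

# The 32 non-alphanumeric printable 7-bit ASCII characters.
QUOTE_CHARS = set(string.punctuation)


def quoting_character(lines):
    """Return the shared quoting character, or None if this is not a quoted block."""
    if not lines or not lines[0]:
        return None
    quote = lines[0][0]
    if quote not in QUOTE_CHARS:
        return None
    # Every line must start with the same character; a blank line would end the block.
    if all(line.startswith(quote) for line in lines):
        return quote
    return None


email = [">> Great idea!", ">", "> Why didn't I think of that?"]
print(quoting_character(email))
print(quoting_character(["You just did!  ;-)"]))
```

The first call reports ">" for the email quote; the second returns None because the line starts with a letter.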

Syntax diagram:

+------------------------------+
| paragraph                    |
| (ends with "::")             |
+------------------------------+
+------------------------------+
| ">" per-line-quoted          |
| ">" contiguous literal block |
+------------------------------+

Line Blocks

Doctree elements: line_block, line. (New in Docutils 0.3.5.)

Line blocks are useful for address blocks, verse (poetry, song lyrics), and unadorned lists, where the
structure of lines is significant. Line blocks are groups of lines beginning with vertical bar (“|”)
prefixes. Each vertical bar prefix indicates a new line, so line breaks are preserved. Initial indents are
also significant, resulting in a nested structure. Inline markup is supported. Continuation lines are
wrapped portions of long lines; they begin with a space in place of the vertical bar. The left edge of a
continuation line must be indented, but need not be aligned with the left edge of the text above it. A line
block ends with a blank line.

This example illustrates continuation lines:

| Lend us a couple of bob till Thursday.
| I'm absolutely skint.
| But I'm expecting a postal order and I can pay you back
  as soon as it comes.
| Love, Ewan.
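
The way continuation lines are folded into the preceding line can be sketched as below. This is an illustration of the rules in this section only; `parse_line_block` is a hypothetical name, and the nesting level is reported simply as the number of leading spaces after the bar.

```python
def parse_line_block(source):
    """Return a list of (indent, text) pairs, one per logical line."""
    lines = []
    for raw in source.splitlines():
        if raw.startswith("|"):
            content = raw[2:]                      # skip the bar and the space after it
            text = content.lstrip(" ")
            lines.append((len(content) - len(text), text.rstrip()))
        elif raw.startswith(" ") and lines:
            indent, text = lines[-1]               # continuation of the previous line
            lines[-1] = (indent, text + " " + raw.strip())
        else:
            raise ValueError(f"not part of a line block: {raw!r}")
    return lines


letter = (
    "| Lend us a couple of bob till Thursday.\n"
    "| I'm absolutely skint.\n"
    "| But I'm expecting a postal order and I can pay you back\n"
    "  as soon as it comes.\n"
    "| Love, Ewan."
)
for indent, text in parse_line_block(letter):
    print(indent, text)
```

The five source lines produce four logical lines, because the fourth source line is a continuation of the third.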

This example illustrates the nesting of line blocks, indicated by the initial indentation of new lines:

Take it away, Eric the Orchestra Leader!

    | A one, two, a one two three four
    |
    | Half a bee, philosophically,
    |     must, *ipso facto*, half not be.
    | But half the bee has got to be,
    |     *vis a vis* its entity.  D'you see?
    |
    | But can a bee be said to be
    |     or not to be an entire bee,
    |         when half the bee is not a bee,
    |             due to some ancient injury?
    |
    | Singing...

Syntax diagram:

+------+-----------------------+
| "| " | line                  |
+------| continuation line     |
       +-----------------------+

Block Quotes

Doctree element: block_quote, attribution.

A text block that is indented relative to the preceding text, without preceding markup indicating it to be a
literal block or other content, is a block quote. All markup processing (for body elements and inline
markup) continues within the block quote:

This is an ordinary paragraph, introducing a block quote.

    "It is my business to know things.  That is my trade."

    -- Sherlock Holmes

A block quote may end with an attribution: a text block beginning with “--”, “---”, or a true em-dash, flush
left within the block quote. If the attribution consists of multiple lines, the left edges of the second and
subsequent lines must align.

Multiple block quotes may occur consecutively if terminated with attributions:

Unindented paragraph.

    Block quote 1.

    -- Attribution 1

    Block quote 2.

Empty comments may be used to explicitly terminate
preceding constructs that would otherwise consume a block quote:

* List item.

..

    Block quote 3.

Empty comments may also be used to separate block quotes:

    Block quote 4.

..

    Block quote 5.

Blank lines are required before and after a block quote, but these blank lines are not included as part of
the block quote.

Syntax diagram:

+------------------------------+
| (current level of            |
| indentation)                 |
+------------------------------+
   +---------------------------+
   | block quote               |
   | (body elements)+          |
   |                           |
   | -- attribution text       |
   |    (optional)             |
   +---------------------------+

Doctest Blocks

Doctree element: doctest_block.

Doctest blocks are interactive Python sessions cut-and-pasted into docstrings. They are meant to illustrate
usage by example, and provide an elegant and powerful testing environment via the doctest module in the
Python standard library.

Doctest blocks are text blocks which begin with ">>> ", the Python interactive interpreter main prompt, and end with a
blank line. Doctest blocks are treated as a special case of literal blocks, without requiring the literal
block syntax. If both are present, the literal block syntax takes priority over Doctest block syntax:

This is an ordinary paragraph.

>>> print('this is a Doctest block')
this is a Doctest block

The following is a literal block::

    >>> This is not recognized as a doctest block by
    reStructuredText.  It *will* be recognized by the doctest
    module, though!

Indentation is not required for doctest blocks.
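
To show how the standard library's doctest module sees such a block, the following sketch parses and runs a one-example session. It uses only `doctest.DocTestParser` and `doctest.DocTestRunner`; the variable names and the test name "doctest_block" are arbitrary choices for illustration.

```python
import doctest

SOURCE = """\
>>> print('this is a Doctest block')
this is a Doctest block
"""

parser = doctest.DocTestParser()

# Split the text into examples: the code after ">>> " and the expected output.
examples = parser.get_examples(SOURCE)
print(examples[0].source, end="")
print(examples[0].want, end="")

# Run the same text as a test and collect the (failed, attempted) counts.
test = parser.get_doctest(SOURCE, {}, "doctest_block", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

The runner compares the actual output of each example against the text that follows it, which is what makes pasted interactive sessions usable as tests.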

Tables

Doctree elements: table, tgroup, colspec, thead, tbody, row, entry.

ReStructuredText provides two syntaxes for delineating table cells: Grid Tables and Simple Tables.

As with other body elements, blank lines are required before and after tables. Tables’ left edges should
align with the left edge of preceding text blocks; if indented, the table is considered to be part of a
block quote.

Once isolated, each table cell is treated as a miniature document; the top and bottom cell boundaries act as
delimiting blank lines. Each cell contains zero or more body elements. Cell contents may include left and/or
right margins, which are removed before processing.

Grid Tables

Grid tables provide a complete table representation via grid-like “ASCII art”. Grid tables allow
arbitrary cell contents (body elements), and both row and column spans. However, grid tables can be
cumbersome to produce, especially for simple data sets. The Emacs table mode is a tool that allows easy
editing of grid tables, in Emacs. See Simple Tables for a simpler (but limited) representation.

Grid tables are described with a visual grid made up of the characters “-”, “=”, “|”, and “+”. The
hyphen (“-”) is used for horizontal lines (row separators). The equals sign (“=”) may be used to
separate optional header rows from the table body (not supported by the Emacs table mode). The vertical
bar (“|”) is used for vertical lines (column separators). The plus sign (“+”) is used for intersections
of horizontal and vertical lines. Example:

+------------------------+------------+----------+----------+
| Header row, column 1   | Header 2   | Header 3 | Header 4 |
| (header rows optional) |            |          |          |
+========================+============+==========+==========+
| body row 1, column 1   | column 2   | column 3 | column 4 |
+------------------------+------------+----------+----------+
| body row 2             | Cells may span columns.          |
+------------------------+------------+---------------------+
| body row 3             | Cells may  | - Table cells       |
+------------------------+ span rows. | - contain           |
| body row 4             |            | - body elements.    |
+------------------------+------------+---------------------+

Some care must be taken with grid tables to avoid undesired interactions with cell text in rare cases.
For example, the following table contains a cell in row 2 spanning from column 2 to column 4:

+--------------+----------+-----------+-----------+
| row 1, col 1 | column 2 | column 3  | column 4  |
+--------------+----------+-----------+-----------+
| row 2        |                                  |
+--------------+----------+-----------+-----------+
| row 3        |          |           |           |
+--------------+----------+-----------+-----------+

If a vertical bar is used in the text of that cell, it could have unintended effects if accidentally
aligned with column boundaries:

+--------------+----------+-----------+-----------+
| row 1, col 1 | column 2 | column 3  | column 4  |
+--------------+----------+-----------+-----------+
| row 2        | Use the command ``ls | more``.   |
+--------------+----------+-----------+-----------+
| row 3        |          |           |           |
+--------------+----------+-----------+-----------+

Several solutions are possible. All that is needed is to break the continuity of the cell outline
rectangle. One possibility is to shift the text by adding an extra space before:

+--------------+----------+-----------+-----------+
| row 1, col 1 | column 2 | column 3  | column 4  |
+--------------+----------+-----------+-----------+
| row 2        |  Use the command ``ls | more``.  |
+--------------+----------+-----------+-----------+
| row 3        |          |           |           |
+--------------+----------+-----------+-----------+

Another possibility is to add an extra line to row 2:

+--------------+----------+-----------+-----------+
| row 1, col 1 | column 2 | column 3  | column 4  |
+--------------+----------+-----------+-----------+
| row 2        | Use the command ``ls | more``.   |
|              |                                  |
+--------------+----------+-----------+-----------+
| row 3        |          |           |           |
+--------------+----------+-----------+-----------+

Simple Tables

Simple tables provide a compact and easy to type but limited row-oriented table representation for
simple data sets. Cell contents are typically single paragraphs, although arbitrary body elements may be
represented in most cells. Simple tables allow multi-line rows (in all but the first column) and column
spans, but not row spans. See Grid Tables above for a complete table representation.

Simple tables are described with horizontal borders made up of “=” and “-” characters. The equals sign
(“=”) is used for top and bottom table borders, and to separate optional header rows from the table
body. The hyphen (“-”) is used to indicate column spans in a single row by underlining the joined
columns, and may optionally be used to explicitly and/or visually separate rows.

A simple table begins with a top border of equals signs with one or more spaces at each column boundary
(two or more spaces recommended). Regardless of spans, the top border must fully describe all
table columns. There must be at least two columns in the table (to differentiate it from section
headers). The top border may be followed by header rows, and the last of the optional header rows is
underlined with ‘=’, again with spaces at column boundaries. There may not be a blank line below the
header row separator; it would be interpreted as the bottom border of the table. The bottom boundary of
the table consists of ‘=’ underlines, also with spaces at column boundaries. For example, here is a
truth table, a three-column table with one header row and four body rows:

=====  =====  =======
  A      B    A and B
=====  =====  =======
False  False  False
True   False  False
False  True   False
True   True   True
=====  =====  =======
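
The role of the top border in fixing column boundaries can be sketched as follows. This is a simplified illustration that handles single-line rows without column spans; `column_spans` and `split_row` are hypothetical helper names, not the Docutils table parser.

```python
import re


def column_spans(border):
    """Return the (start, end) offsets of each run of "=" in a border line."""
    return [(m.start(), m.end()) for m in re.finditer(r"=+", border)]


def split_row(line, spans):
    """Cut one text line into cells at the established column boundaries."""
    cells = []
    last = len(spans) - 1
    for index, (start, end) in enumerate(spans):
        stop = None if index == last else end   # the rightmost column is unbounded
        cells.append(line[start:stop].strip())
    return cells


TABLE = """\
=====  =====  =======
  A      B    A and B
=====  =====  =======
False  False  False
True   False  False
False  True   False
True   True   True
=====  =====  ======="""

rows = TABLE.splitlines()
spans = column_spans(rows[0])
header = split_row(rows[1], spans)
body = [split_row(row, spans) for row in rows[3:-1]]
print(spans)
print(header)
print(body)
```

Because every row is cut at the offsets taken from the top border, that border has to describe all columns even when some rows use spans.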

Underlines of ‘-’ may be used to indicate column spans by “filling in” column margins to join adjacent
columns. Column span underlines must be complete (they must cover all columns) and align with
established column boundaries. Text lines containing column span underlines may not contain any other
text. A column span underline applies only to one row immediately above it. For example, here is a table
with a column span in the header:

=====  =====  ======
   Inputs     Output
------------  ------
  A      B    A or B
=====  =====  ======
False  False  False
True   False  True
False  True   True
True   True   True
=====  =====  ======

Each line of text must contain spaces at column boundaries, except where cells have been joined by
column spans. Each line of text starts a new row, except when there is a blank cell in the first column.
In that case, that line of text is parsed as a continuation line. For this reason, cells in the first
column of new rows (not continuation lines) must contain some text; blank cells would
lead to a misinterpretation (but see the tip below). Also, this mechanism limits cells in the first
column to only one line of text. Use grid tables if
this limitation is unacceptable.

Tip
To start a new row in a simple table without text in the first column in the processed output, use
one of these:

  • an empty comment (“..”), which may be omitted from the processed output (see Comments below)
  • a backslash escape (“\”) followed by a space (see Escaping Mechanism above)

Underlines of ‘-’ may also be used to visually separate rows, even if there are no column spans. This is
especially useful in long tables, where rows are many lines long.

Blank lines are permitted within simple tables. Their interpretation depends on the context. Blank
lines between rows are ignored. Blank lines within multi-line rows may separate
paragraphs or other body elements within cells.

The rightmost column is unbounded; text may continue past the edge of the table (as indicated by the
table borders). However, it is recommended that borders be made long enough to contain the entire text.

The following example illustrates continuation lines (row 2 consists of two lines of text, and four
lines for row 3), a blank line separating paragraphs (row 3, column 2), text extending past the right
edge of the table, and a new row which will have no text in the first column in the processed output
(row 4):

=====  =====
col 1  col 2
=====  =====
1      Second column of row 1.
2      Second column of row 2.
       Second line of paragraph.
3      - Second column of row 3.

       - Second item in bullet
         list (row 3, column 2).
\      Row 4; column 1 will be empty.
=====  =====

Explicit Markup Blocks

An explicit markup block is a text block:

  • whose first line begins with “..” followed by whitespace (the “explicit markup start”),
  • whose second and subsequent lines (if any) are indented relative to the first, and
  • which ends before an unindented line.

Explicit markup blocks are analogous to bullet list items, with “..” as the bullet. The text on the lines
immediately after the explicit markup start determines the indentation of the block body. The maximum common
indentation is always removed from the second and subsequent lines of the block body. Therefore if the first
construct fits in one line, and the indentation of the first and second constructs should differ, the first
construct should not begin on the same line as the explicit markup start.

Blank lines are required between explicit markup blocks and other elements, but are optional between
explicit markup blocks where unambiguous.

The explicit markup syntax is used for footnotes, citations, hyperlink targets, directives, substitution
definitions, and comments.
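
How the first line of an explicit markup block selects one of these constructs can be sketched with a few regular expressions. The patterns below are simplified assumptions for illustration (ASCII-only names, no anonymous targets) and are not the expressions used by Docutils; `classify` is a hypothetical function.

```python
import re

# Simplified "simple reference name": alphanumerics with isolated internal "-", "_", ".".
NAME = r"[A-Za-z0-9]+(?:[-_.][A-Za-z0-9]+)*"

PATTERNS = [
    ("footnote", re.compile(rf"^\.\. \[(?:\d+|#(?:{NAME})?|\*)\](?:\s|$)")),
    ("citation", re.compile(rf"^\.\. \[{NAME}\](?:\s|$)")),
    ("hyperlink_target", re.compile(r"^\.\. _[^:]+:(?:\s|$)")),
    ("substitution_definition", re.compile(r"^\.\. \|\S(?:.*\S)?\| +\S+::(?:\s|$)")),
    ("directive", re.compile(r"^\.\. [A-Za-z0-9][\w.+-]*::(?:\s|$)")),
]


def classify(first_line):
    """Name the explicit markup construct introduced by first_line, if any."""
    for kind, pattern in PATTERNS:
        if pattern.match(first_line):
            return kind
    # Anything else after an explicit markup start is a comment.
    if first_line == ".." or first_line.startswith(".. "):
        return "comment"
    return None


for line in (
    ".. [1] Body elements go here.",
    ".. [CIT2002] This is the citation.",
    ".. _POEE: http://www.poee.org/",
    ".. |biohazard| image:: biohazard.png",
    ".. image:: mylogo.jpeg",
    ".. Danger: modify at your own risk!",
):
    print(classify(line), "<-", line)
```

Comments are the fallback case: a line is a comment only when none of the other constructs is recognized.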

Footnotes

See also: Footnote References.

Doctree elements: footnote, label.

Configuration settings: footnote_references.

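Each footnote consists of an explicit markup start (“.. ”), a left square bracket, the footnote label, a
right square bracket, and whitespace, followed by indented body elements. A footnote label can be:

  • a whole decimal number consisting of one or more digits,
  • a single “#” (denoting auto-numbered footnotes),
  • a “#” followed by a simple reference name (an autonumber label), or
  • a single “*” (denoting auto-symbol footnotes).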

The footnote content (body elements) must be consistently indented (by at least 3 spaces) and
left-aligned. The first body element within a footnote may often begin on the same line as the footnote
label. However, if the first element fits on one line and the indentation of the remaining elements
differ, the first element must begin on the line after the footnote label. Otherwise, the difference in
indentation will not be detected.

Footnotes may occur anywhere in the document, not only at the end. Where and how they appear in the
processed output depends on the processing system.

Here is a manually numbered footnote:

.. [1] Body elements go here.

Each footnote automatically generates a hyperlink target pointing to itself. The text of the hyperlink
target name is the same as that of the footnote label. Auto-numbered footnotes generate a number as their
footnote label and reference name. See Implicit Hyperlink Targets for a complete description of the
mechanism.

Syntax diagram:

+-------+-------------------------+
| ".. " | "[" label "]" footnote  |
+-------+                         |
        | (body elements)+        |
        +-------------------------+

Auto-Numbered Footnotes

A number sign (“#”) may be used as the first character of a footnote label to request automatic
numbering of the footnote or footnote reference.

The first footnote to request automatic numbering is assigned the label “1”, the second is assigned
the label “2”, and so on (assuming there are no manually numbered footnotes present; see Mixed Manual and
Auto-Numbered Footnotes below). A footnote which has automatically received a label “1” generates an
implicit hyperlink target with name “1”, just as if the label was explicitly specified.

A footnote may specify a label explicitly while at the same time requesting automatic numbering: [#label].
These labels are called autonumber labels. Autonumber labels do two things:

  • On the footnote itself, they generate a hyperlink target whose name is the
    autonumber label (doesn’t include the “#”).
  • They allow an automatically numbered footnote to be referred to more than
    once, as a footnote reference or hyperlink reference. For example:

    If [#note]_ is the first footnote reference, it will show up as
    "[1]".  We can refer to it again as [#note]_ and again see
    "[1]".  We can also refer to it as note_ (an ordinary internal
    hyperlink reference).
    
    .. [#note] This is the footnote labeled "note".
    

The numbering is determined by the order of the footnotes, not by the order of the references. For
footnote references without autonumber labels ([#]_), the footnotes and footnote references must be in the same
relative order but need not alternate in lock-step. For example:

[#]_ is a reference to footnote 1, and [#]_ is a reference to
footnote 2.

.. [#] This is footnote 1.
.. [#] This is footnote 2.
.. [#] This is footnote 3.

[#]_ is a reference to footnote 3.

Special care must be taken if footnotes themselves contain auto-numbered footnote references, or if
multiple references are made in close proximity. Footnotes and references are noted in the order
they are encountered in the document, which is not necessarily the same as the order in which a
person would read them.

Auto-Symbol Footnotes

An asterisk (“*”) may be used for footnote labels to request automatic symbol generation for
footnotes and footnote references. The asterisk may be the only character in the label. For example:

Here is a symbolic footnote reference: [*]_.

.. [*] This is the footnote.

A transform will insert symbols as labels into corresponding footnotes and footnote references. The
number of references must be equal to the number of footnotes. One symbol footnote cannot have
multiple references.

The standard Docutils system uses the following symbols for footnote marks [8]:

  • asterisk/star (“*”)
  • dagger (HTML character entity “&dagger;”, Unicode U+02020)
  • double dagger (“&Dagger;”/U+02021)
  • section mark (“&sect;”/U+000A7)
  • pilcrow or paragraph mark (“&para;”/U+000B6)
  • number sign (“#”)
  • spade suit (“&spades;”/U+02660)
  • heart suit (“&hearts;”/U+02665)
  • diamond suit (“&diams;”/U+02666)
  • club suit (“&clubs;”/U+02663)

[8] This list was inspired by the list of symbols for “Note Reference Marks” in The Chicago
Manual of Style, 14th edition, section 12.51. “Parallels” (“||”) were given in CMoS
instead of the pilcrow. The last four symbols (the card suits) were added arbitrarily.

If more than ten symbols are required, the same sequence will be reused, doubled and then tripled,
and so on (“**” etc.).
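
The reuse of the symbol sequence can be written down directly. This is a sketch of the rule stated above, with the ten symbols taken from the list in this section; `footnote_symbol` is a hypothetical helper and uses a 0-based index.

```python
# The ten footnote symbols, in the order listed above.
SYMBOLS = [
    "*",        # asterisk/star
    "\u2020",   # dagger
    "\u2021",   # double dagger
    "\u00a7",   # section mark
    "\u00b6",   # pilcrow or paragraph mark
    "#",        # number sign
    "\u2660",   # spade suit
    "\u2665",   # heart suit
    "\u2666",   # diamond suit
    "\u2663",   # club suit
]


def footnote_symbol(index):
    """Return the label for the auto-symbol footnote at 0-based position index."""
    repeats, position = divmod(index, len(SYMBOLS))
    # After ten footnotes the sequence restarts with each symbol doubled, then tripled.
    return SYMBOLS[position] * (repeats + 1)


print([footnote_symbol(i) for i in (0, 1, 9, 10, 11, 20)])
```

The eleventh footnote is labeled “**”, the twelfth gets a doubled dagger, and the twenty-first gets “***”.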

Note
When using auto-symbol footnotes, the choice of output encoding is important.
Many of the symbols used are not encodable in certain common text encodings such as Latin-1
(ISO 8859-1). The use of UTF-8 for the output encoding is recommended. An alternative for
HTML and XML output is to use the “xmlcharrefreplace” output encoding error handler.

Mixed Manual and Auto-Numbered Footnotes

Manual and automatic footnote numbering may both be used within a single document, although the
results may not be expected. Manual numbering takes priority. Only unused footnote numbers are
assigned to auto-numbered footnotes. The following example should be illustrative:

[2]_ will be "2" (manually numbered),
[#]_ will be "3" (anonymous auto-numbered), and
[#label]_ will be "1" (labeled auto-numbered).

.. [2] This footnote is labeled manually, so its number is fixed.

.. [#label] This autonumber-labeled footnote will be labeled "1".
   It is the first auto-numbered footnote and no other footnote
   with label "1" exists.  The order of the footnotes is used to
   determine numbering, not the order of the footnote references.

.. [#] This footnote will be labeled "3".  It is the second
   auto-numbered footnote, but footnote label "2" is already used.

Citations

See also: Citation References.

Doctree element: citation

Citations are identical to footnotes except that they use only non-numeric labels such as [note] or [GVR2001]. Citation labels
are simple reference names (case-insensitive
single words consisting of alphanumerics plus internal hyphens, underscores, and periods; no
whitespace). Citations may be rendered separately and differently from footnotes. For example:

Here is a citation reference: [CIT2002]_.

.. [CIT2002] This is the citation.  It's just like a footnote,
   except the label is textual.

Directives

Doctree elements: depend on the directive.

Directives are an extension mechanism for reStructuredText, a way of adding support for new constructs
without adding new primary syntax (directives may support additional syntax locally). All standard
directives (those implemented and registered in the reference reStructuredText parser) are described in
the reStructuredText Directives document, and are always available. Any other directives are
domain-specific, and may require special action to make them available when processing the document.

For example, here’s how an image may be placed:

.. image:: mylogo.jpeg

A figure (a graphic with a caption) may be placed like this:

.. figure:: larch.png

   The larch.

An admonition (note, caution, etc.) contains other body elements:

.. note:: This is a paragraph

   - Here is a bullet list.

Directives are indicated by an explicit markup start (“.. ”) followed by the directive type, two colons,
and whitespace (together called the “directive marker”). Directive types are case-insensitive single
words (alphanumerics plus isolated internal hyphens, underscores, plus signs, colons, and periods; no
whitespace). Two colons are used after the directive type for these reasons:

  • Two colons are distinctive, and unlikely to be used in common text.
  • Two colons avoid clashes with common comment text like:

    .. Danger: modify at your own risk!
    
  • If an implementation of reStructuredText does not recognize a directive (i.e.,
    the directive-handler is not installed), a level-3 (error) system message is generated, and
    the entire directive block (including the directive itself) will be included as a literal
    block. Thus “::” is a natural choice.

The directive block consists of any text on the first line of the directive after the directive
marker, and any subsequent indented text. The interpretation of the directive block is up to the
directive code. There are three logical parts to the directive block:

  1. Directive arguments.
  2. Directive options.
  3. Directive content.

Individual directives can employ any combination of these parts. Directive arguments can be filesystem
paths, URLs, title text, etc. Directive options are indicated using field lists; the field names and
contents are directive-specific. Arguments and options must form a contiguous block beginning on the
first or second line of the directive; a blank line indicates the beginning of the directive content
block. If arguments and/or options are employed by the directive, a blank line must separate them from
the directive content. The “figure” directive employs all three parts:

.. figure:: larch.png
   :scale: 50

   The larch.
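
The split into arguments, options, and content can be sketched as below. This is a generic illustration under simplifying assumptions (one-line options, a simplified directive-type pattern); in Docutils the interpretation of the block is left to each directive, so `parse_directive` is hypothetical and not how any particular directive behaves.

```python
import re
import textwrap

# Simplified directive marker: ".. ", a type, "::", then optional text on the same line.
MARKER = re.compile(r"^\.\. +([A-Za-z0-9]+(?:[-_+.][A-Za-z0-9]+)*)::(?:\s+(.*))?$")
# Simplified one-line field: ":name: value".
OPTION = re.compile(r"^:([^:\s][^:]*):(?:\s+(.*))?$")


def parse_directive(block):
    """Return (type, argument, options, content) for one directive block."""
    lines = block.splitlines()
    match = MARKER.match(lines[0])
    if match is None:
        raise ValueError("not a directive")
    name = match.group(1).lower()               # directive types are case-insensitive
    argument = (match.group(2) or "").strip()
    body = textwrap.dedent("\n".join(lines[1:])).splitlines()
    options = {}
    index = 0
    # Arguments and options form one contiguous block that ends at a blank line.
    while index < len(body) and body[index].strip():
        field = OPTION.match(body[index])
        if field:
            options[field.group(1)] = (field.group(2) or "").strip()
        else:
            argument = f"{argument} {body[index].strip()}".strip()
        index += 1
    content = "\n".join(body[index + 1:])
    return name, argument, options, content


figure = ".. figure:: larch.png\n   :scale: 50\n\n   The larch."
print(parse_directive(figure))
```

For the “figure” example this yields the type "figure", the argument "larch.png", the option scale set to "50", and the content "The larch.".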

Simple directives may not require any content. If a directive that does not employ a content block is
followed by indented text anyway, it is an error. If a block quote should immediately follow a
directive, use an empty comment in-between (see Comments below).

Actions taken in response to directives and the interpretation of text in the directive content block or
subsequent text block(s) are directive-dependent. See reStructuredText Directives for details.

Directives are meant for the arbitrary processing of their contents, which can be transformed into
something possibly unrelated to the original text. It may also be possible for directives to be used as
pragmas, to modify the behavior of the parser, such as to experiment with alternate syntax. There is no
parser support for this functionality at present; if a reasonable need for pragma directives is found,
they may be supported.

Directives do not generate “directive” elements; they are a parser construct only, and have no
intrinsic meaning outside of reStructuredText. Instead, the parser will transform recognized directives
into (possibly specialized) document elements. Unknown directives will trigger level-3 (error) system
messages.

Syntax diagram:

+-------+-------------------------------+
| ".. " | directive type "::" directive |
+-------+ block                         |
        |                               |
        +-------------------------------+

Substitution Definitions

Doctree element: substitution_definition.

Substitution definitions are indicated by an explicit markup start (“.. ”) followed by a vertical bar,
the substitution text, another vertical bar, whitespace, and the definition block. Substitution text may
not begin or end with whitespace. A substitution definition block contains an embedded inline-compatible
directive (without the leading “.. ”), such as “image” or “replace”. For example:

The |biohazard| symbol must be used on containers used to
dispose of medical waste.

.. |biohazard| image:: biohazard.png

It is an error for a substitution definition block to directly or indirectly contain a circular
substitution reference.

Substitution references are replaced
in-line by the processed contents of the corresponding definition (linked by matching substitution
text). Matches are case-sensitive but forgiving; if no exact match is found, a case-insensitive
comparison is attempted.
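
The “case-sensitive but forgiving” lookup can be sketched for plain text replacement. The helpers `lookup` and `expand` and the reference pattern are assumptions made for illustration; they cover only text substitutions of the “replace” kind and ignore whitespace normalization.

```python
import re

# A reference is "|text|", where text does not begin or end with whitespace.
REFERENCE = re.compile(r"\|([^|\s](?:[^|]*[^|\s])?)\|")


def lookup(name, definitions):
    """Find a definition: exact match first, then a case-insensitive match."""
    if name in definitions:
        return definitions[name]
    for key, value in definitions.items():
        if key.lower() == name.lower():
            return value
    raise KeyError(name)


def expand(text, definitions):
    """Replace every substitution reference in text with its definition."""
    return REFERENCE.sub(lambda match: lookup(match.group(1), definitions), text)


definitions = {"RST": "reStructuredText"}
print(expand("Writing about |RST| itself.", definitions))
print(expand("|rst| is matched case-insensitively.", definitions))
```

The exact match is tried first, so two definitions that differ only in case can coexist and each reference resolves to the one spelled the same way.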

Substitution definitions allow the power and flexibility of block-level directives to
be shared by inline text. They are a way to include arbitrarily complex inline structures within text,
while keeping the details out of the flow of text. They are the equivalent of SGML/XML’s named entities
or programming language macros.

Without the substitution mechanism, every time someone wants an application-specific new inline
structure, they would have to petition for a syntax change. In combination with existing directive
syntax, any inline structure can be coded without new syntax (except possibly a new directive).

Syntax diagram:

+-------+-----------------------------------------------------+
| ".. " | "|" substitution text "| " directive type "::" data |
+-------+ directive block                                     |
        |                                                     |
        +-----------------------------------------------------+

Following are some use cases for the substitution mechanism. Please note that most of the embedded
directives shown are examples only and have not been implemented.

Objects
Substitution references may be used to associate ambiguous text with a unique
object identifier.
For example, many sites may wish to implement an inline “user” directive:

|Michael| and |Jon| are our widget-wranglers.

.. |Michael| user:: mjones
.. |Jon|     user:: jhl

Depending on the needs of the site, this may be used to index the document for later searching,
to hyperlink the inline text in various ways (mailto, homepage, mouseover Javascript with
profile and contact information, etc.), or to customize presentation of the text (include
username in the inline text, include an icon image with a link next to the text, make the text
bold or a different color, etc.).

The same approach can be used in documents which frequently refer to a particular type of
objects with unique identifiers but ambiguous common names. Movies, albums, books, photos, court
cases, and laws are possible. For example:

|The Transparent Society| offers a fascinating alternate view
on privacy issues.

.. |The Transparent Society| book:: isbn=0738201448

Classes or functions, in contexts where the module or class names are unclear and/or interpreted
text cannot be used, are another possibility:

4XSLT has the convenience method |runString|, so you don't
have to mess with DOM objects if all you want is the
transformed output.

.. |runString| function:: module=xml.xslt class=Processor

Images
Images are a common use for substitution references:

West led the |H| 3, covered by dummy's |H| Q, East's |H| K,
and trumped in hand with the |S| 2.

.. |H| image:: /images/heart.png
   :height: 11
   :width: 11
.. |S| image:: /images/spade.png
   :height: 11
   :width: 11

* |Red light| means stop.
* |Green light| means go.
* |Yellow light| means go really fast.

.. |Red light|    image:: red_light.png
.. |Green light|  image:: green_light.png
.. |Yellow light| image:: yellow_light.png

|-><-| is the official symbol of POEE_.

.. |-><-| image:: discord.png
.. _POEE: http://www.poee.org/

The “image” directive has been implemented.

Styles [10]
Substitution references may be used to associate inline text with an externally
defined presentation style:

Even |the text in Texas| is big.

.. |the text in Texas| style:: big

The style name may be meaningful in the context of some particular output format (CSS class name
for HTML output, LaTeX style name for LaTeX, etc), or may be ignored for other output formats
(such as plaintext).

[10] There may be sufficient need for a “style” mechanism to warrant simpler syntax such
as an extension to the interpreted text role syntax. The substitution mechanism is
cumbersome for simple text styling.

Templates
Inline markup may be used for later processing by a template engine. For example, a Zope author might
write:

Welcome back, |name|!

.. |name| tal:: replace user/getUserName

After processing, this ZPT output would result:

Welcome back,
<span tal:replace="user/getUserName">name</span>!

Zope would then transform this to something like “Welcome back, David!” during a
session with an actual user.

Replacement text
The substitution mechanism may be used for simple macro substitution. This may
be appropriate when the replacement text is repeated many times throughout one or more
documents, especially if it may need to change later. A short example is unavoidably
contrived:

|RST|_ is a little annoying to type over and over, especially
when writing about |RST| itself, and spelling out the
bicapitalized word |RST| every time isn't really necessary for
|RST| source readability.

.. |RST| replace:: reStructuredText
.. _RST: http://docutils.sourceforge.net/rst.html

Note the trailing underscore in the first use of a substitution reference. This indicates a
reference to the corresponding hyperlink target.

Substitution is also appropriate when the replacement text cannot be represented using other
inline constructs, or is obtrusively long:

But still, that's nothing compared to a name like
|j2ee-cas|__.

.. |j2ee-cas| replace::
   the Java `TM`:super: 2 Platform, Enterprise Edition Client
   Access Services
__ http://developer.java.sun.com/developer/earlyAccess/
   j2eecas/

The “replace” directive has been implemented.

Comments

Doctree element: comment.

Arbitrary indented text may follow the explicit markup start and will be processed as a comment element.
No further processing is done on the comment block text; a comment contains a single “text blob”.
Depending on the output formatter, comments may be removed from the processed output. The only
restriction on comments is that they not use the same syntax as any of the other explicit markup
constructs: substitution definitions, directives, footnotes, citations, or hyperlink targets. To ensure
that none of the other explicit markup constructs is recognized, leave the “..” on a line by itself:

.. This is a comment
..
   _so: is this!
..
   [and] this!
..
   this:: too!
..
   |even| this:: !

An explicit markup start followed by a blank line and nothing else (apart from whitespace) is an “empty
comment”. It serves to terminate a preceding construct, and does not consume any indented text following.
To have a block quote follow a list or any indented construct, insert an unindented empty comment
in-between.

Syntax diagram:

+-------+----------------------+
| ".. " | comment              |
+-------+ block                |
        |                      |
        +----------------------+
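
The restriction on comments (they must not look like any other explicit markup construct) can be made concrete with a rough Python sketch. The patterns below are simplified assumptions written for this example, not the parser's real expressions, and the function ``classify_explicit_markup`` is hypothetical; it only inspects the first line of an explicit markup block.

```python
import re

# Simplified, assumed patterns; checked in this order against the text
# that follows the ".. " explicit markup start.
CONSTRUCTS = [
    ("footnote", re.compile(r"\[(?:[0-9]+|#[\w.-]*|\*)\](?:\s|$)")),
    ("citation", re.compile(r"\[[A-Za-z0-9][\w.-]*\](?:\s|$)")),
    ("hyperlink target", re.compile(r"_\S.*?:(?:\s|$)")),
    ("substitution definition", re.compile(r"\|[^|\s][^|]*\|\s+\S+::(?:\s|$)")),
    ("directive", re.compile(r"[A-Za-z0-9][\w.+:-]*?::(?:\s|$)")),
]


def classify_explicit_markup(first_line):
    """Guess which construct an explicit markup block starts."""
    if not first_line.startswith(".."):
        raise ValueError("not an explicit markup start")
    rest = first_line[2:].strip()
    if not rest:
        # ".." on a line by itself: always a comment (possibly empty).
        return "comment"
    for name, pattern in CONSTRUCTS:
        if pattern.match(rest):
            return name
    return "comment"


if __name__ == "__main__":
    for line in ["..", ".. This is a comment", ".. _so: is this!",
                 ".. [and] this!", ".. this:: too!", ".. |even| this:: !"]:
        print(repr(line), "->", classify_explicit_markup(line))
```

Applied to the first lines of the examples above, only “.. This is a comment” and a bare “..” come out as comments, which is why the other examples move their text onto an indented line below a lone “..”.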

Inline Markup

In reStructuredText, inline markup applies to words or phrases within a text block. The same whitespace and
punctuation that serves to delimit words in written text is used to delimit the inline markup syntax constructs
(see the inline markup recognition rules for details). The text within inline markup may not begin or end with
whitespace. Arbitrary character-level inline markup is supported although not encouraged. Inline markup cannot
be nested.

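There are nine inline markup constructs. Five of the constructs use identical start-strings and end-strings to
indicate the markup:

  • emphasis: “*”
  • strong emphasis: “**”
  • interpreted text: “`”
  • inline literals: “``”
  • substitution references: “|”

Three constructs use different start-strings and end-strings:

  • inline internal targets: “_`” and “`”
  • footnote references: “[” and “]_”
  • hyperlink references: “`” and “`_” (phrases), or just a trailing “_” (single words)

Standalone hyperlinks are recognized implicitly, and use no extra markup.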

Inline markup recognition rules

Inline markup start-strings and end-strings are only recognized if the following conditions are met:

  1. Inline markup start-strings must be immediately followed by non-whitespace.
  2. Inline markup end-strings must be immediately preceded by non-whitespace.
  3. The inline markup end-string must be separated by at least one character from the start-string.
  4. Both inline markup start-string and end-string must not be preceded by an unescaped backslash
    (except for the end-string of inline literals). See Escaping Mechanism above for details.
  5. If an inline markup start-string is immediately preceded by one of the ASCII characters ' " < ( [ { or a
    similar non-ASCII character [11], it must not be followed by the corresponding closing character from
    ' " > ) ] } or a similar non-ASCII character [12]. (For quotes, matching characters can be any of the
    quotation marks in international usage.)

If the configuration setting simple-inline-markup is
False (default), additional conditions apply to the characters “around” the inline markup:

  6. Inline markup start-strings must start a text block or be immediately preceded by
    • whitespace,
    • one of the ASCII characters - : / ' " < ( [ {
    • or a similar non-ASCII punctuation character. [13]
  7. Inline markup end-strings must end a text block or be immediately followed by
    • whitespace,
    • one of the ASCII characters - . , : ; ! ? \ / ' " ) ] } >
    • or a similar non-ASCII punctuation character. [14]

[11] Unicode categories Ps (Open), Pi (Initial quote), or Pf (Final quote). [15]
[12] Unicode categories Pe (Close), Pi (Initial quote), or Pf (Final quote). [15]
[13] Unicode categories Ps (Open), Pi (Initial quote), Pf (Final quote), Pd (Dash), or Po (Other). [15]
[14] Unicode categories Pe (Close), Pi (Initial quote), Pf (Final quote), Pd (Dash), or Po (Other). [15]
[15] The category of some characters changed with the development of the Unicode standard. Docutils 0.13
uses Unicode version 5.2.0.

The inline markup recognition rules were devised to allow 90% of non-markup uses of “*”, “`”, “_”, and “|”
without escaping. For example, none of the following terms are recognized as containing inline markup
strings:

  • 2 * x a ** b (* BOM32_* ` `` _ __ | (breaks rule 1)
  • || (breaks rule 3)
  • “*” ‘|’ (*) [*] {*} <*> ‘*’ ‚*‘ ‘*‚ ’*’ ‚*’ “*” „*“ “*„ ”*” „*” »*« ›*‹ «*» »*» ›*› (breaks rule 5)
  • 2*x a**b O(N**2) e**(x*y) f(x)*f(y) a|b file*.* __init__ __init__() (breaks rule 6)

No escaping is required inside the following inline markup examples:

  • *2 * x  *a **b *.txt* (breaks rule 2; renders as “2 * x *a **b *.txt”)
  • *2*x a**b O(N**2) e**(x*y) f(x)*f(y) a*(1+2)* (breaks rule 7; renders as “2*x a**b O(N**2) e**(x*y)
    f(x)*f(y) a*(1+2)”)

It may be desirable to use inline literals for
some of these anyhow, especially if they represent code snippets. It’s a judgment call.

The following terms do require either literal-quoting or escaping to avoid misinterpretation:

*4, class_, *args, **kwargs, `TeX-quoted', *ML, *.txt

In most use cases, inline literals or literal blocks are the best choice (by default,
this also selects a monospaced font). Alternatively, the inline markup characters can be escaped:

\*4, class\_, \*args, \**kwargs, \`TeX-quoted', \*ML, \*.txt

For languages that don’t use whitespace between words (e.g. Japanese or Chinese) it is recommended to set
simple-inline-markup to True and to escape inline markup characters where necessary. The examples breaking
rules 6 and 7 above show which constructs may need special attention.

Recognition order

Inline markup delimiter characters are used for multiple constructs, so to avoid ambiguity there must be a
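specific recognition order for each character. The inline markup recognition order is as follows:

  • Asterisks: strong emphasis (“**”) is recognized before emphasis (“*”).
  • Backquotes: inline literals (“``”) and inline internal targets (leading “_`”, trailing “`”) are mutually
    independent, and are recognized before phrase hyperlink references (leading “`”, trailing “`_”) and
    interpreted text (“`”).
  • Trailing underscores: footnote references (“[” + label + “]_”) and simple hyperlink references (name +
    trailing “_”) are mutually independent.
  • Vertical bars: substitution references (“|”) are independently recognized.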

Character-Level Inline Markup

It is possible to mark up individual characters within a word with backslash escapes (see Escaping Mechanism above). Backslash escapes
can be used to allow arbitrary text to immediately follow inline markup:

Python ``list``\s use square bracket syntax.

The backslash will disappear from the processed document. The word “list” will appear as inline literal
text, and the letter “s” will immediately follow it as normal text, with no space in-between.

Arbitrary text may immediately precede inline markup using backslash-escaped whitespace:

Possible in *re*\ ``Structured``\ *Text*, though not encouraged.

The backslashes and spaces separating “re”, “Structured”, and “Text” above will disappear from the processed
document.

Caution!
The use of backslash-escapes for character-level inline markup is not encouraged. Such
use is ugly and detrimental to the unprocessed document’s readability. Please use this feature
sparingly and only where absolutely necessary.
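
How the backslashes “disappear” in the two examples above can be shown with a short Python sketch. It assumes the behaviour described in this section (a backslash followed by whitespace removes both characters; a backslash followed by any other character leaves just that character). The function name ``remove_escapes`` is made up for this example, and the sketch ignores the fact that backslashes inside inline literals and literal blocks are kept as-is.

```python
import re


def remove_escapes(text):
    """Apply backslash-escape removal to a piece of running text."""
    def replace(match):
        char = match.group(1)
        # Escaped whitespace vanishes entirely; any other escaped
        # character stays, minus its backslash.
        return "" if char.isspace() else char

    return re.sub(r"\\(.)", replace, text, flags=re.DOTALL)


if __name__ == "__main__":
    print(remove_escapes(r"Python ``list``\s use square bracket syntax."))
    print(remove_escapes(r"Possible in *re*\ ``Structured``\ *Text*."))
```

The result still contains the inline markup delimiters; a real parser recognizes the markup first and removes the escapes as part of the same pass.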

Emphasis

Doctree element: emphasis.

Start-string = end-string = “*”.

Text enclosed by single asterisk characters is emphasized:

This is *emphasized text*.

Emphasized text is typically displayed in italics.

Strong Emphasis

Doctree element: strong.

Start-string = end-string = “**”.

Text enclosed by double-asterisks is emphasized strongly:

This is **strong text**.

Strongly emphasized text is typically displayed in boldface.

Interpreted Text

Doctree element: depends on the explicit or implicit role and processing.

Start-string = end-string = “`”.

Interpreted text is text that is meant to be related, indexed, linked, summarized, or otherwise processed,
but the text itself is typically left alone. Interpreted text is enclosed by single backquote characters:

This is `interpreted text`.

The “role” of the interpreted text determines how the text is interpreted. The role may be inferred
implicitly (as above; the “default role” is used) or indicated explicitly, using a role marker. A role
marker consists of a colon, the role name, and another colon. A role name is a single word consisting of
alphanumerics plus isolated internal hyphens, underscores, plus signs, colons, and periods; no whitespace or
other characters are allowed. A role marker is either a prefix or a suffix to the interpreted text,
whichever reads better; it’s up to the author:

:role:`interpreted text`

`interpreted text`:role:

Interpreted text allows extensions to the available inline descriptive markup constructs. To emphasis, strong
emphasis, inline literals, and hyperlink references, we can add “title reference”, “index entry”, “acronym”,
“class”, “red”, “blinking” or anything else we want. Only pre-determined roles are recognized; unknown roles
will generate errors. A core set of standard roles is implemented in the reference parser; see
reStructuredText Interpreted Text Roles for individual descriptions. The role directive can be used to define
custom interpreted text roles. In addition, applications may support specialized roles.
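
The role marker syntax (prefix or suffix form) can be illustrated with a small Python sketch. The regular expression follows the wording above for role names; it is an assumption made for this example rather than the pattern the reference parser uses, and ``parse_interpreted`` is a hypothetical helper that looks at one isolated piece of markup.

```python
import re

# Alphanumerics plus isolated internal hyphens, underscores, plus signs,
# colons and periods (assumed reading of the rule stated above).
ROLE_NAME = r"[A-Za-z0-9]+(?:[-_+:.][A-Za-z0-9]+)*"
INTERPRETED = re.compile(
    rf"(?::(?P<prefix>{ROLE_NAME}):)?`(?P<text>[^`]+)`(?::(?P<suffix>{ROLE_NAME}):)?"
)


def parse_interpreted(markup):
    """Return (role, text) for interpreted text, or None if it is not valid.

    `role` is None when no role marker is given (the default role applies).
    """
    match = INTERPRETED.fullmatch(markup)
    if match is None:
        return None
    prefix, suffix, text = match.group("prefix", "suffix", "text")
    if prefix and suffix:
        return None          # a role marker is either a prefix or a suffix
    if text != text.strip():
        return None          # text may not begin or end with whitespace
    return (prefix or suffix, text)


if __name__ == "__main__":
    print(parse_interpreted(":role:`interpreted text`"))
    print(parse_interpreted("`interpreted text`:role:"))
    print(parse_interpreted("`interpreted text`"))
```

Whether a parsed role name is actually known is a separate question; this sketch only separates the marker from the text.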

In field lists, care must be taken when using
interpreted text with explicit roles in field names: the role must be a suffix to the interpreted text. The
following are recognized as field list items:

:`field name`:code:: interpreted text with explicit role as suffix

:a `complex`:code:\  field name: a backslash-escaped space
                                 is necessary

The following are not recognized as field list items:

::code:`not a field name`: paragraph with interpreted text

:code:`not a field name`: paragraph with interpreted text

Edge cases:

:field\:`name`: interpreted text (standard role) requires
                escaping the leading colon in a field name

:field:\`name`: not interpreted text

Inline Literals

Doctree element: literal.

Start-string = end-string = “``”.

Text enclosed by double-backquotes is treated as inline literals:

This text is an example of ``inline literals``.

Inline literals may contain any characters except two adjacent backquotes in an end-string context
(according to the recognition rules above). No markup interpretation (including backslash-escape
interpretation) is done within inline literals.

Line breaks are not preserved in inline literals. Although a reStructuredText parser will preserve
runs of spaces in its output, the final representation of the processed document is dependent on the output
formatter, thus the preservation of whitespace cannot be guaranteed. If the preservation of line breaks
and/or other whitespace is important, literal blocks should be used.

Inline literals are useful for short code snippets. For example:

The regular expression ``[+-]?(\d+(\.\d*)?|\.\d+)`` matches
floating-point numbers (without exponents).
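
As a quick check of the regular expression quoted in that example, the following Python sketch applies it to a few strings. The wrapper function ``is_float_literal`` is only an illustration and uses ``re.fullmatch`` so that the whole string has to match.

```python
import re

# The expression from the inline literal example above, unchanged.
FLOAT_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def is_float_literal(text):
    """True if the entire string is a float without an exponent."""
    return FLOAT_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["3.14", "-.5", "10.", "+7", "1e5", ".", "abc"]:
        print(repr(sample), is_float_literal(sample))
```

Note that the backslashes in the expression reach the regular expression engine intact precisely because no backslash-escape interpretation happens inside inline literals.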

Inline Internal Targets

Doctree element: target.

Start-string = “_`”, end-string = “`”.

Inline internal targets are the equivalent of explicit internal hyperlink targets, but may appear within
running text. The syntax begins with an underscore and a backquote, is followed by a hyperlink name or phrase,
and ends with a backquote. Inline internal targets may not be anonymous.

For example, the following paragraph contains a hyperlink target named “Norwegian Blue”:

Oh yes, the _`Norwegian Blue`.  What's, um, what's wrong with it?

See Implicit Hyperlink Targets for the
resolution of duplicate reference names.

Footnote References

See also: Footnotes

Doctree element: footnote_reference.

Configuration settings: footnote_references, trim_footnote_reference_space.

Start-string = “[”, end-string = “]_”.

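Each footnote reference consists of a square-bracketed label followed by a trailing underscore. Footnote
labels are one of:

  • one or more digits (i.e. a number),
  • a single “#” (denoting auto-numbered footnotes),
  • a “#” followed by a simple reference name (an autonumber label), or
  • a single “*” (denoting auto-symbol footnotes).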
For example:

Please RTFM [1]_.

.. [1] Read The Fine Manual

Inline markup recognition rules may require whitespace in front of the footnote reference. To remove the
whitespace from the output, use an escaped whitespace character (see Escaping Mechanism) or set the
trim_footnote_reference_space configuration setting. Leading whitespace is removed by default, if the
footnote_references setting is “superscript”.

Citation References

See also: Citations

Doctree element: citation_reference.

Start-string = “[”, end-string = “]_”.

Each citation reference consists of a square-bracketed label followed by a trailing underscore. Citation
labels are simple reference names (case-insensitive
single words, consisting of alphanumerics plus internal hyphens, underscores, and periods; no whitespace).

For example:

Here is a citation reference: [CIT2002]_.
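
Footnote and citation references share the same delimiters and differ only in the label. The Python sketch below tells them apart under these assumptions: a footnote label is a number, “#”, “#” plus a simple reference name, or “*”, and anything else that is a simple reference name is a citation label. The names ``classify_label`` and ``find_references`` are invented for this example, and the surrounding inline markup recognition rules are not checked.

```python
import re

# Simple reference name as described above: alphanumerics plus internal
# hyphens, underscores and periods (assumed pattern for this sketch).
SIMPLE_NAME = r"[A-Za-z0-9]+(?:[-_.][A-Za-z0-9]+)*"
FOOTNOTE_LABEL = re.compile(rf"[0-9]+|\*|#(?:{SIMPLE_NAME})?")
CITATION_LABEL = re.compile(SIMPLE_NAME)
REFERENCE = re.compile(r"\[([^\]\s]+)\]_")


def classify_label(label):
    """Return 'footnote', 'citation', or None for an unusable label."""
    if FOOTNOTE_LABEL.fullmatch(label):
        return "footnote"
    if CITATION_LABEL.fullmatch(label):
        return "citation"
    return None


def find_references(text):
    """List (label, kind) pairs for bracketed references found in `text`."""
    found = []
    for match in REFERENCE.finditer(text):
        kind = classify_label(match.group(1))
        if kind is not None:
            found.append((match.group(1), kind))
    return found


if __name__ == "__main__":
    print(find_references("Please RTFM [1]_."))
    print(find_references("Here is a citation reference: [CIT2002]_."))
```

The footnote check comes first because a purely numeric label would otherwise also pass as a simple reference name.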

Substitution References

Doctree element: substitution_reference, reference.

Start-string = “|”, end-string = “|” (optionally followed by “_” or “__”).

Vertical bars are used to bracket the substitution reference text. A substitution reference may also be a
hyperlink reference by appending a “_” (named) or “__” (anonymous) suffix; the substitution text is used for
the reference text in the named case.

The processing system replaces substitution references with the processed contents of the corresponding substitution definitions (which see for
the definition of “correspond”). Substitution definitions produce inline-compatible elements.

Examples:

This is a simple |substitution reference|.  It will be replaced by
the processing system.

This is a combination |substitution and hyperlink reference|_.  In
addition to being replaced, the replacement text or element will
refer to the "substitution and hyperlink reference" target.

Units

(New in Docutils 0.3.10.)

All measures consist of a positive floating point number in standard (non-scientific) notation and a unit,
possibly separated by one or more spaces.

Units are only supported where explicitly mentioned in the reference manuals.

Length Units

The following length units are supported by the reStructuredText parser:

  • em (ems, the height of the element’s font)
  • ex (x-height, the height of the letter “x”)
  • px (pixels, relative to the canvas resolution)
  • in (inches; 1in=2.54cm)
  • cm (centimeters; 1cm=10mm)
  • mm (millimeters)
  • pt (points; 1pt=1/72in)
  • pc (picas; 1pc=12pt)

This set corresponds to the length units in CSS.

(List and explanations taken from http://www.htmlhelp.com/reference/css/units.html#length.)

The following are all valid length values: “1.5em”, “20 mm”, “.5in”.

Length values without unit are completed with a writer-dependent default (e.g. px
with html4css1, pt with latex2e). See the writer specific documentation in the user doc for details.
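
A measure as described in this section (a positive number in non-scientific notation, optional spaces, then a unit) can be parsed with a short Python sketch. The names ``parse_length`` and ``to_points`` are hypothetical; the conversion factors come from the unit list above (1in=2.54cm, 1cm=10mm, 1pt=1/72in, 1pc=12pt), and the relative units em, ex and px are deliberately not converted because they depend on the output context.

```python
import re

LENGTH_UNITS = ("em", "ex", "px", "in", "cm", "mm", "pt", "pc")
MEASURE = re.compile(r"(\d+(?:\.\d*)?|\.\d+)\s*(%s)?" % "|".join(LENGTH_UNITS))

# Points per absolute unit, derived from the relations listed above.
POINTS_PER_UNIT = {
    "pt": 1.0,
    "pc": 12.0,
    "in": 72.0,
    "cm": 72.0 / 2.54,
    "mm": 72.0 / 25.4,
}


def parse_length(value):
    """Split a length such as '20 mm' into (20.0, 'mm').

    The unit is None when omitted; the writer then supplies a default.
    """
    match = MEASURE.fullmatch(value.strip())
    if match is None:
        raise ValueError("not a valid length: %r" % value)
    return float(match.group(1)), match.group(2)


def to_points(number, unit):
    """Convert an absolute length to points; relative units are rejected."""
    if unit not in POINTS_PER_UNIT:
        raise ValueError("cannot convert unit %r without context" % unit)
    return number * POINTS_PER_UNIT[unit]


if __name__ == "__main__":
    for sample in ["1.5em", "20 mm", ".5in"]:
        print(sample, "->", parse_length(sample))
```

Scientific notation such as “1e3px” and negative values are rejected, in line with the requirement for a positive number in standard notation.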

Percentage Units

Percentage values have a percent sign (“%”) as unit. Percentage values are relative to other values,
depending on the context in which they occur.

Error Handling

Doctree element: system_message, problematic.

Markup errors are handled according to the specification in PEP 258.


Generated on: 2017-09-08 07:20 UTC. Generated by Docutils from reStructuredText source.