lib/rdoc/markdown.kpeg

Summary

Maintainability
Test Coverage
%% name = RDoc::Markdown

%% header {
# coding: UTF-8
# frozen_string_literal: true
# :markup: markdown

##
# RDoc::Markdown as described by the [markdown syntax][syntax].
#
# To choose Markdown as your only default format see
# RDoc::Options@Saved+Options for instructions on setting up a `.rdoc_options`
# file to store your project default.
#
# ## Usage
#
# Here is a brief example of using this parse to read a markdown file by hand.
#
#     data = File.read("README.md")
#     formatter = RDoc::Markup::ToHtml.new(RDoc::Options.new, nil)
#     html = RDoc::Markdown.parse(data).accept(formatter)
#
#     # do something with html
#
# ## Extensions
#
# The following markdown extensions are supported by the parser, but not all
# are used in RDoc output by default.
#
# ### RDoc
#
# The RDoc Markdown parser has the following built-in behaviors that cannot be
# disabled.
#
# Underscores embedded in words are never interpreted as emphasis.  (While the
# [markdown dingus][dingus] emphasizes in-word underscores, neither the
# Markdown syntax nor MarkdownTest mention this behavior.)
#
# For HTML output, RDoc always auto-links bare URLs.
#
# ### Break on Newline
#
# The break_on_newline extension converts all newlines into hard line breaks
# as in [Github Flavored Markdown][GFM].  This extension is disabled by
# default.
#
# ### CSS
#
# The #css extension enables CSS blocks to be included in the output, but they
# are not used for any built-in RDoc output format.  This extension is disabled
# by default.
#
# Example:
#
#     <style type="text/css">
#     h1 { font-size: 3em }
#     </style>
#
# ### Definition Lists
#
# The definition_lists extension allows definition lists using the [PHP
# Markdown Extra syntax][PHPE], but only one label and definition are supported
# at this time.  This extension is enabled by default.
#
# Example:
#
# ```
# cat
# :   A small furry mammal
# that seems to sleep a lot
#
# ant
# :   A little insect that is known
# to enjoy picnics
#
# ```
#
# Produces:
#
# cat
# :   A small furry mammal
# that seems to sleep a lot
#
# ant
# :   A little insect that is known
# to enjoy picnics
#
# ### Strike
#
# Example:
#
# ```
# This is ~~striked~~.
# ```
#
# Produces:
#
# This is ~~striked~~.
#
# ### Github
#
# The #github extension enables a partial set of [Github Flavored Markdown]
# [GFM].  This extension is enabled by default.
#
# Supported github extensions include:
#
# #### Fenced code blocks
#
# Use ` ``` ` around a block of code instead of indenting it four spaces.
#
# #### Syntax highlighting
#
# Use ` ``` ruby ` as the start of a code fence to add syntax highlighting.
# (Currently only `ruby` syntax is supported).
#
# ### HTML
#
# Enables raw HTML to be included in the output.  This extension is enabled by
# default.
#
# Example:
#
#     <table>
#     ...
#     </table>
#
# ### Notes
#
# The #notes extension enables footnote support.  This extension is enabled by
# default.
#
# Example:
#
#     Here is some text[^1] including an inline footnote ^[for short footnotes]
#
#     ...
#
#     [^1]: With the footnote text down at the bottom
#
# Produces:
#
# Here is some text[^1] including an inline footnote ^[for short footnotes]
#
# [^1]: With the footnote text down at the bottom
#
# ## Limitations
#
# * Link titles are not used
# * Footnotes are collapsed into a single paragraph
#
# ## Author
#
# This markdown parser is a port to kpeg from [peg-markdown][pegmarkdown] by
# John MacFarlane.
#
# It is used under the MIT license:
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
# The port to kpeg was performed by Eric Hodel and Evan Phoenix
#
# [dingus]: http://daringfireball.net/projects/markdown/dingus
# [GFM]: https://github.github.com/gfm/
# [pegmarkdown]: https://github.com/jgm/peg-markdown
# [PHPE]: https://michelf.ca/projects/php-markdown/extra/#def-list
# [syntax]: http://daringfireball.net/projects/markdown/syntax
#--
# Last updated to jgm/peg-markdown commit 8f8fc22ef0

}

%% {

  require_relative '../rdoc'
  require_relative 'markup/to_joined_paragraph'
  require_relative 'markdown/entities'

  require_relative 'markdown/literals'

  ##
  # Supported extensions

  EXTENSIONS = []

  ##
  # Extensions enabled by default

  DEFAULT_EXTENSIONS = [
    :definition_lists,
    :github,
    :html,
    :notes,
    :strike,
  ]

  # :section: Extensions

  ##
  # Creates extension methods for the `name` extension to enable and disable
  # the extension and to query if they are active.

  def self.extension name
    EXTENSIONS << name

    define_method "#{name}?" do
      extension? name
    end

    define_method "#{name}=" do |enable|
      extension name, enable
    end
  end

  ##
  # Converts all newlines into hard breaks

  extension :break_on_newline

  ##
  # Allow style blocks

  extension :css

  ##
  # Allow PHP Markdown Extras style definition lists

  extension :definition_lists

  ##
  # Allow Github Flavored Markdown

  extension :github

  ##
  # Allow HTML

  extension :html

  ##
  # Enables the notes extension

  extension :notes

  ##
  # Enables the strike extension

  extension :strike

  # :section:

  ##
  # Parses the `markdown` document into an RDoc::Document using the default
  # extensions.

  def self.parse markdown
    parser = new

    parser.parse markdown
  end

  # TODO remove when kpeg 0.10 is released
  alias orig_initialize initialize # :nodoc:

  ##
  # Creates a new markdown parser that enables the given +extensions+.

  def initialize extensions = DEFAULT_EXTENSIONS, debug = false
    @debug      = debug
    @formatter  = RDoc::Markup::ToJoinedParagraph.new
    @extensions = extensions

    @references          = nil
    @unlinked_references = nil

    @footnotes       = nil
    @note_order      = nil
  end

  ##
  # Wraps `text` in emphasis for rdoc inline formatting

  def emphasis text
    if text =~ /\A[a-z\d.\/]+\z/i then
      "_#{text}_"
    else
      "<em>#{text}</em>"
    end
  end

  ##
  # :category: Extensions
  #
  # Is the extension `name` enabled?

  def extension? name
    @extensions.include? name
  end

  ##
  # :category: Extensions
  #
  # Enables or disables the extension with `name`

  def extension name, enable
    if enable then
      @extensions |= [name]
    else
      @extensions -= [name]
    end
  end

  ##
  # Parses `text` in a clone of this parser.  This is used for handling nested
  # lists the same way as markdown_parser.

  def inner_parse text # :nodoc:
    parser = clone

    parser.setup_parser text, @debug

    parser.peg_parse

    doc = parser.result

    doc.accept @formatter

    doc.parts
  end

  ##
  # Finds a link reference for `label` and creates a new link to it with
  # `content` as the link text.  If `label` was not encountered in the
  # reference-gathering parser pass the label and content are reconstructed
  # with the linking `text` (usually whitespace).

  def link_to content, label = content, text = nil
    raise ParseError, 'enable notes extension' if
      content.start_with? '^' and label.equal? content

    if ref = @references[label] then
      "{#{content}}[#{ref}]"
    elsif label.equal? content then
      "[#{content}]#{text}"
    else
      "[#{content}]#{text}[#{label}]"
    end
  end

  ##
  # Creates an RDoc::Markup::ListItem by parsing the `unparsed` content from
  # the first parsing pass.

  def list_item_from unparsed
    parsed = inner_parse unparsed.join
    RDoc::Markup::ListItem.new nil, *parsed
  end

  ##
  # Stores `label` as a note and fills in previously unknown note references.

  def note label
    #foottext = "rdoc-label:foottext-#{label}:footmark-#{label}"

    #ref.replace foottext if ref = @unlinked_notes.delete(label)

    @notes[label] = foottext

    #"{^1}[rdoc-label:footmark-#{label}:foottext-#{label}] "
  end

  ##
  # Creates a new link for the footnote `reference` and adds the reference to
  # the note order list for proper display at the end of the document.

  def note_for ref
    @note_order << ref

    label = @note_order.length

    "{*#{label}}[rdoc-label:foottext-#{label}:footmark-#{label}]"
  end

  ##
  # The internal kpeg parse method

  alias peg_parse parse # :nodoc:

  ##
  # Creates an RDoc::Markup::Paragraph from `parts` and including
  # extension-specific behavior

  def paragraph parts
    parts = parts.map do |part|
      if "\n" == part then
        RDoc::Markup::HardBreak.new
      else
        part
      end
    end if break_on_newline?

    RDoc::Markup::Paragraph.new(*parts)
  end

  ##
  # Parses `markdown` into an RDoc::Document

  def parse markdown
    @references          = {}
    @unlinked_references = {}

    markdown += "\n\n"

    setup_parser markdown, @debug
    peg_parse 'References'

    if notes? then
      @footnotes       = {}

      setup_parser markdown, @debug
      peg_parse 'Notes'

      # using note_order on the first pass would be a bug
      @note_order      = []
    end

    setup_parser markdown, @debug
    peg_parse

    doc = result

    if notes? and not @footnotes.empty? then
      doc << RDoc::Markup::Rule.new(1)

      @note_order.each_with_index do |ref, index|
        label = index + 1
        note = @footnotes[ref] or raise ParseError, "footnote [^#{ref}] not found"

        link = "{^#{label}}[rdoc-label:footmark-#{label}:foottext-#{label}] "
        note.parts.unshift link

        doc << note
      end
    end

    doc.accept @formatter

    doc
  end

  ##
  # Stores `label` as a reference to `link` and fills in previously unknown
  # link references.

  def reference label, link
    if ref = @unlinked_references.delete(label) then
      ref.replace link
    end

    @references[label] = link
  end

  ##
  # Wraps `text` in strong markup for rdoc inline formatting

  def strong text
    if text =~ /\A[a-z\d.\/-]+\z/i then
      "*#{text}*"
    else
      "<b>#{text}</b>"
    end
  end

  ##
  # Wraps `text` in strike markup for rdoc inline formatting

  def strike text
    if text =~ /\A[a-z\d.\/-]+\z/i then
      "~#{text}~"
    else
      "<s>#{text}</s>"
    end
  end
}

root = Doc

Doc =       BOM? Block*:a { RDoc::Markup::Document.new(*a.compact) }

Block =     @BlankLine*
            ( BlockQuote
            | Verbatim
            | CodeFence
            | Table
            | Note
            | Reference
            | HorizontalRule
            | Heading
            | OrderedList
            | BulletList
            | DefinitionList
            | HtmlBlock
            | StyleBlock
            | Para
            | Plain )

Para =      @NonindentSpace Inlines:a @BlankLine+
            { paragraph a }

Plain =     Inlines:a
            { paragraph a }

AtxInline = !@Newline !(@Sp /#*/ @Sp @Newline) Inline

AtxStart =  < /\#{1,6}/ >
            { text.length }

AtxHeading = AtxStart:s @Sp AtxInline+:a (@Sp /#*/ @Sp)?  @Newline
            { RDoc::Markup::Heading.new(s, a.join) }

SetextHeading = SetextHeading1 | SetextHeading2

SetextBottom1 = /={1,}/ @Newline

SetextBottom2 = /-{1,}/ @Newline

SetextHeading1 =  &(@RawLine SetextBottom1)
                  @StartList:a ( !@Endline Inline:b { a << b } )+ @Sp @Newline
                  SetextBottom1
                  { RDoc::Markup::Heading.new(1, a.join) }

SetextHeading2 =  &(@RawLine SetextBottom2)
                  @StartList:a ( !@Endline Inline:b { a << b })+ @Sp @Newline
                  SetextBottom2
                  { RDoc::Markup::Heading.new(2, a.join) }

Heading = SetextHeading | AtxHeading

BlockQuote = BlockQuoteRaw:a
             { RDoc::Markup::BlockQuote.new(*a) }

BlockQuoteRaw =  @StartList:a
                 (( ">" " "? Line:l { a << l } )
                  ( !">" !@BlankLine Line:c { a << c } )*
                  ( @BlankLine:n { a << n } )*
                 )+
                 { inner_parse a.join }

NonblankIndentedLine = !@BlankLine IndentedLine

VerbatimChunk = @BlankLine*:a
                NonblankIndentedLine+:b
                { a.concat b }

Verbatim =     VerbatimChunk+:a
               { RDoc::Markup::Verbatim.new(*a.flatten) }

HorizontalRule = @NonindentSpace
                 ( "*" @Sp "*" @Sp "*" (@Sp "*")*
                 | "-" @Sp "-" @Sp "-" (@Sp "-")*
                 | "_" @Sp "_" @Sp "_" (@Sp "_")*)
                 @Sp @Newline @BlankLine+
                 { RDoc::Markup::Rule.new 1 }

Bullet = !HorizontalRule @NonindentSpace /[+*-]/ @Spacechar+

BulletList = &Bullet (ListTight | ListLoose):a
             { RDoc::Markup::List.new(:BULLET, *a) }

ListTight = ListItemTight+:a
            @BlankLine* !(Bullet | Enumerator)
            { a }

ListLoose = @StartList:a
            ( ListItem:b @BlankLine* { a << b } )+
            { a }

ListItem =  ( Bullet | Enumerator )
            @StartList:a
            ListBlock:b { a << b }
            ( ListContinuationBlock:c { a.push(*c) } )*
            { list_item_from a }

ListItemTight =
            ( Bullet | Enumerator )
            ListBlock:a
            ( !@BlankLine
              ListContinuationBlock:b { a.push(*b) } )*
            !ListContinuationBlock
            { list_item_from a }

ListBlock = !@BlankLine Line:a
            ListBlockLine*:c
            { [a, *c] }

ListContinuationBlock = @StartList:a
                        ( @BlankLine*
                          { a << "\n" } )
                        ( Indent
                          ListBlock:b { a.concat b } )+
                        { a }

Enumerator = @NonindentSpace [0-9]+ "." @Spacechar+

OrderedList = &Enumerator (ListTight | ListLoose):a
             { RDoc::Markup::List.new(:NUMBER, *a) }

ListBlockLine = !@BlankLine
                !( Indent? (Bullet | Enumerator) )
                !HorizontalRule
                OptionallyIndentedLine

# Parsers for different kinds of block-level HTML content.
# This is repetitive due to constraints of PEG grammar.

HtmlOpenAnchor = "<" Spnl ("a" | "A") Spnl HtmlAttribute* ">"
HtmlCloseAnchor = "<" Spnl "/" ("a" | "A") Spnl ">"
HtmlAnchor = HtmlOpenAnchor (HtmlAnchor | !HtmlCloseAnchor .)* HtmlCloseAnchor

HtmlBlockOpenAddress = "<" Spnl ("address" | "ADDRESS") Spnl HtmlAttribute* ">"
HtmlBlockCloseAddress = "<" Spnl "/" ("address" | "ADDRESS") Spnl ">"
HtmlBlockAddress = HtmlBlockOpenAddress (HtmlBlockAddress | !HtmlBlockCloseAddress .)* HtmlBlockCloseAddress

HtmlBlockOpenBlockquote = "<" Spnl ("blockquote" | "BLOCKQUOTE") Spnl HtmlAttribute* ">"
HtmlBlockCloseBlockquote = "<" Spnl "/" ("blockquote" | "BLOCKQUOTE") Spnl ">"
HtmlBlockBlockquote = HtmlBlockOpenBlockquote (HtmlBlockBlockquote | !HtmlBlockCloseBlockquote .)* HtmlBlockCloseBlockquote

HtmlBlockOpenCenter = "<" Spnl ("center" | "CENTER") Spnl HtmlAttribute* ">"
HtmlBlockCloseCenter = "<" Spnl "/" ("center" | "CENTER") Spnl ">"
HtmlBlockCenter = HtmlBlockOpenCenter (HtmlBlockCenter | !HtmlBlockCloseCenter .)* HtmlBlockCloseCenter

HtmlBlockOpenDir = "<" Spnl ("dir" | "DIR") Spnl HtmlAttribute* ">"
HtmlBlockCloseDir = "<" Spnl "/" ("dir" | "DIR") Spnl ">"
HtmlBlockDir = HtmlBlockOpenDir (HtmlBlockDir | !HtmlBlockCloseDir .)* HtmlBlockCloseDir

HtmlBlockOpenDiv = "<" Spnl ("div" | "DIV") Spnl HtmlAttribute* ">"
HtmlBlockCloseDiv = "<" Spnl "/" ("div" | "DIV") Spnl ">"
HtmlBlockDiv = HtmlBlockOpenDiv (HtmlBlockDiv | !HtmlBlockCloseDiv .)* HtmlBlockCloseDiv

HtmlBlockOpenDl = "<" Spnl ("dl" | "DL") Spnl HtmlAttribute* ">"
HtmlBlockCloseDl = "<" Spnl "/" ("dl" | "DL") Spnl ">"
HtmlBlockDl = HtmlBlockOpenDl (HtmlBlockDl | !HtmlBlockCloseDl .)* HtmlBlockCloseDl

HtmlBlockOpenFieldset = "<" Spnl ("fieldset" | "FIELDSET") Spnl HtmlAttribute* ">"
HtmlBlockCloseFieldset = "<" Spnl "/" ("fieldset" | "FIELDSET") Spnl ">"
HtmlBlockFieldset = HtmlBlockOpenFieldset (HtmlBlockFieldset | !HtmlBlockCloseFieldset .)* HtmlBlockCloseFieldset

HtmlBlockOpenForm = "<" Spnl ("form" | "FORM") Spnl HtmlAttribute* ">"
HtmlBlockCloseForm = "<" Spnl "/" ("form" | "FORM") Spnl ">"
HtmlBlockForm = HtmlBlockOpenForm (HtmlBlockForm | !HtmlBlockCloseForm .)* HtmlBlockCloseForm

HtmlBlockOpenH1 = "<" Spnl ("h1" | "H1") Spnl HtmlAttribute* ">"
HtmlBlockCloseH1 = "<" Spnl "/" ("h1" | "H1") Spnl ">"
HtmlBlockH1 = HtmlBlockOpenH1 (HtmlBlockH1 | !HtmlBlockCloseH1 .)* HtmlBlockCloseH1

HtmlBlockOpenH2 = "<" Spnl ("h2" | "H2") Spnl HtmlAttribute* ">"
HtmlBlockCloseH2 = "<" Spnl "/" ("h2" | "H2") Spnl ">"
HtmlBlockH2 = HtmlBlockOpenH2 (HtmlBlockH2 | !HtmlBlockCloseH2 .)* HtmlBlockCloseH2

HtmlBlockOpenH3 = "<" Spnl ("h3" | "H3") Spnl HtmlAttribute* ">"
HtmlBlockCloseH3 = "<" Spnl "/" ("h3" | "H3") Spnl ">"
HtmlBlockH3 = HtmlBlockOpenH3 (HtmlBlockH3 | !HtmlBlockCloseH3 .)* HtmlBlockCloseH3

HtmlBlockOpenH4 = "<" Spnl ("h4" | "H4") Spnl HtmlAttribute* ">"
HtmlBlockCloseH4 = "<" Spnl "/" ("h4" | "H4") Spnl ">"
HtmlBlockH4 = HtmlBlockOpenH4 (HtmlBlockH4 | !HtmlBlockCloseH4 .)* HtmlBlockCloseH4

HtmlBlockOpenH5 = "<" Spnl ("h5" | "H5") Spnl HtmlAttribute* ">"
HtmlBlockCloseH5 = "<" Spnl "/" ("h5" | "H5") Spnl ">"
HtmlBlockH5 = HtmlBlockOpenH5 (HtmlBlockH5 | !HtmlBlockCloseH5 .)* HtmlBlockCloseH5

HtmlBlockOpenH6 = "<" Spnl ("h6" | "H6") Spnl HtmlAttribute* ">"
HtmlBlockCloseH6 = "<" Spnl "/" ("h6" | "H6") Spnl ">"
HtmlBlockH6 = HtmlBlockOpenH6 (HtmlBlockH6 | !HtmlBlockCloseH6 .)* HtmlBlockCloseH6

HtmlBlockOpenMenu = "<" Spnl ("menu" | "MENU") Spnl HtmlAttribute* ">"
HtmlBlockCloseMenu = "<" Spnl "/" ("menu" | "MENU") Spnl ">"
HtmlBlockMenu = HtmlBlockOpenMenu (HtmlBlockMenu | !HtmlBlockCloseMenu .)* HtmlBlockCloseMenu

HtmlBlockOpenNoframes = "<" Spnl ("noframes" | "NOFRAMES") Spnl HtmlAttribute* ">"
HtmlBlockCloseNoframes = "<" Spnl "/" ("noframes" | "NOFRAMES") Spnl ">"
HtmlBlockNoframes = HtmlBlockOpenNoframes (HtmlBlockNoframes | !HtmlBlockCloseNoframes .)* HtmlBlockCloseNoframes

HtmlBlockOpenNoscript = "<" Spnl ("noscript" | "NOSCRIPT") Spnl HtmlAttribute* ">"
HtmlBlockCloseNoscript = "<" Spnl "/" ("noscript" | "NOSCRIPT") Spnl ">"
HtmlBlockNoscript = HtmlBlockOpenNoscript (HtmlBlockNoscript | !HtmlBlockCloseNoscript .)* HtmlBlockCloseNoscript

HtmlBlockOpenOl = "<" Spnl ("ol" | "OL") Spnl HtmlAttribute* ">"
HtmlBlockCloseOl = "<" Spnl "/" ("ol" | "OL") Spnl ">"
HtmlBlockOl = HtmlBlockOpenOl (HtmlBlockOl | !HtmlBlockCloseOl .)* HtmlBlockCloseOl

HtmlBlockOpenP = "<" Spnl ("p" | "P") Spnl HtmlAttribute* ">"
HtmlBlockCloseP = "<" Spnl "/" ("p" | "P") Spnl ">"
HtmlBlockP = HtmlBlockOpenP (HtmlBlockP | !HtmlBlockCloseP .)* HtmlBlockCloseP

HtmlBlockOpenPre = "<" Spnl ("pre" | "PRE") Spnl HtmlAttribute* ">"
HtmlBlockClosePre = "<" Spnl "/" ("pre" | "PRE") Spnl ">"
HtmlBlockPre = HtmlBlockOpenPre (HtmlBlockPre | !HtmlBlockClosePre .)* HtmlBlockClosePre

HtmlBlockOpenTable = "<" Spnl ("table" | "TABLE") Spnl HtmlAttribute* ">"
HtmlBlockCloseTable = "<" Spnl "/" ("table" | "TABLE") Spnl ">"
HtmlBlockTable = HtmlBlockOpenTable (HtmlBlockTable | !HtmlBlockCloseTable .)* HtmlBlockCloseTable

HtmlBlockOpenUl = "<" Spnl ("ul" | "UL") Spnl HtmlAttribute* ">"
HtmlBlockCloseUl = "<" Spnl "/" ("ul" | "UL") Spnl ">"
HtmlBlockUl = HtmlBlockOpenUl (HtmlBlockUl | !HtmlBlockCloseUl .)* HtmlBlockCloseUl

HtmlBlockOpenDd = "<" Spnl ("dd" | "DD") Spnl HtmlAttribute* ">"
HtmlBlockCloseDd = "<" Spnl "/" ("dd" | "DD") Spnl ">"
HtmlBlockDd = HtmlBlockOpenDd (HtmlBlockDd | !HtmlBlockCloseDd .)* HtmlBlockCloseDd

HtmlBlockOpenDt = "<" Spnl ("dt" | "DT") Spnl HtmlAttribute* ">"
HtmlBlockCloseDt = "<" Spnl "/" ("dt" | "DT") Spnl ">"
HtmlBlockDt = HtmlBlockOpenDt (HtmlBlockDt | !HtmlBlockCloseDt .)* HtmlBlockCloseDt

HtmlBlockOpenFrameset = "<" Spnl ("frameset" | "FRAMESET") Spnl HtmlAttribute* ">"
HtmlBlockCloseFrameset = "<" Spnl "/" ("frameset" | "FRAMESET") Spnl ">"
HtmlBlockFrameset = HtmlBlockOpenFrameset (HtmlBlockFrameset | !HtmlBlockCloseFrameset .)* HtmlBlockCloseFrameset

HtmlBlockOpenLi = "<" Spnl ("li" | "LI") Spnl HtmlAttribute* ">"
HtmlBlockCloseLi = "<" Spnl "/" ("li" | "LI") Spnl ">"
HtmlBlockLi = HtmlBlockOpenLi (HtmlBlockLi | !HtmlBlockCloseLi .)* HtmlBlockCloseLi

HtmlBlockOpenTbody = "<" Spnl ("tbody" | "TBODY") Spnl HtmlAttribute* ">"
HtmlBlockCloseTbody = "<" Spnl "/" ("tbody" | "TBODY") Spnl ">"
HtmlBlockTbody = HtmlBlockOpenTbody (HtmlBlockTbody | !HtmlBlockCloseTbody .)* HtmlBlockCloseTbody

HtmlBlockOpenTd = "<" Spnl ("td" | "TD") Spnl HtmlAttribute* ">"
HtmlBlockCloseTd = "<" Spnl "/" ("td" | "TD") Spnl ">"
HtmlBlockTd = HtmlBlockOpenTd (HtmlBlockTd | !HtmlBlockCloseTd .)* HtmlBlockCloseTd

HtmlBlockOpenTfoot = "<" Spnl ("tfoot" | "TFOOT") Spnl HtmlAttribute* ">"
HtmlBlockCloseTfoot = "<" Spnl "/" ("tfoot" | "TFOOT") Spnl ">"
HtmlBlockTfoot = HtmlBlockOpenTfoot (HtmlBlockTfoot | !HtmlBlockCloseTfoot .)* HtmlBlockCloseTfoot

HtmlBlockOpenTh = "<" Spnl ("th" | "TH") Spnl HtmlAttribute* ">"
HtmlBlockCloseTh = "<" Spnl "/" ("th" | "TH") Spnl ">"
HtmlBlockTh = HtmlBlockOpenTh (HtmlBlockTh | !HtmlBlockCloseTh .)* HtmlBlockCloseTh

HtmlBlockOpenThead = "<" Spnl ("thead" | "THEAD") Spnl HtmlAttribute* ">"
HtmlBlockCloseThead = "<" Spnl "/" ("thead" | "THEAD") Spnl ">"
HtmlBlockThead = HtmlBlockOpenThead (HtmlBlockThead | !HtmlBlockCloseThead .)* HtmlBlockCloseThead

HtmlBlockOpenTr = "<" Spnl ("tr" | "TR") Spnl HtmlAttribute* ">"
HtmlBlockCloseTr = "<" Spnl "/" ("tr" | "TR") Spnl ">"
HtmlBlockTr = HtmlBlockOpenTr (HtmlBlockTr | !HtmlBlockCloseTr .)* HtmlBlockCloseTr

HtmlBlockOpenScript = "<" Spnl ("script" | "SCRIPT") Spnl HtmlAttribute* ">"
HtmlBlockCloseScript = "<" Spnl "/" ("script" | "SCRIPT") Spnl ">"
HtmlBlockScript = HtmlBlockOpenScript (!HtmlBlockCloseScript .)* HtmlBlockCloseScript

HtmlBlockOpenHead = "<" Spnl ("head" | "HEAD") Spnl HtmlAttribute* ">"
HtmlBlockCloseHead = "<" Spnl "/" ("head" | "HEAD") Spnl ">"
HtmlBlockHead = HtmlBlockOpenHead (!HtmlBlockCloseHead .)* HtmlBlockCloseHead

HtmlBlockInTags = HtmlAnchor
                | HtmlBlockAddress
                | HtmlBlockBlockquote
                | HtmlBlockCenter
                | HtmlBlockDir
                | HtmlBlockDiv
                | HtmlBlockDl
                | HtmlBlockFieldset
                | HtmlBlockForm
                | HtmlBlockH1
                | HtmlBlockH2
                | HtmlBlockH3
                | HtmlBlockH4
                | HtmlBlockH5
                | HtmlBlockH6
                | HtmlBlockMenu
                | HtmlBlockNoframes
                | HtmlBlockNoscript
                | HtmlBlockOl
                | HtmlBlockP
                | HtmlBlockPre
                | HtmlBlockTable
                | HtmlBlockUl
                | HtmlBlockDd
                | HtmlBlockDt
                | HtmlBlockFrameset
                | HtmlBlockLi
                | HtmlBlockTbody
                | HtmlBlockTd
                | HtmlBlockTfoot
                | HtmlBlockTh
                | HtmlBlockThead
                | HtmlBlockTr
                | HtmlBlockScript
                | HtmlBlockHead

HtmlBlock = < ( HtmlBlockInTags | HtmlComment | HtmlBlockSelfClosing | HtmlUnclosed) >
            @BlankLine+
            { if html? then
                RDoc::Markup::Raw.new text
              end }

HtmlUnclosed = "<" Spnl HtmlUnclosedType Spnl HtmlAttribute* Spnl ">"

HtmlUnclosedType = "HR" | "hr"

HtmlBlockSelfClosing = "<" Spnl HtmlBlockType Spnl HtmlAttribute* "/" Spnl ">"

HtmlBlockType = "ADDRESS" |
                "BLOCKQUOTE" |
                "CENTER" |
                "DD" |
                "DIR" |
                "DIV" |
                "DL" |
                "DT" |
                "FIELDSET" |
                "FORM" |
                "FRAMESET" |
                "H1" |
                "H2" |
                "H3" |
                "H4" |
                "H5" |
                "H6" |
                "HR" |
                "ISINDEX" |
                "LI" |
                "MENU" |
                "NOFRAMES" |
                "NOSCRIPT" |
                "OL" |
                "P" |
                "PRE" |
                "SCRIPT" |
                "TABLE" |
                "TBODY" |
                "TD" |
                "TFOOT" |
                "TH" |
                "THEAD" |
                "TR" |
                "UL" |
                "address" |
                "blockquote" |
                "center" |
                "dd" |
                "dir" |
                "div" |
                "dl" |
                "dt" |
                "fieldset" |
                "form" |
                "frameset" |
                "h1" |
                "h2" |
                "h3" |
                "h4" |
                "h5" |
                "h6" |
                "hr" |
                "isindex" |
                "li" |
                "menu" |
                "noframes" |
                "noscript" |
                "ol" |
                "p" |
                "pre" |
                "script" |
                "table" |
                "tbody" |
                "td" |
                "tfoot" |
                "th" |
                "thead" |
                "tr" |
                "ul"

StyleOpen =     "<" Spnl ("style" | "STYLE") Spnl HtmlAttribute* ">"
StyleClose =    "<" Spnl "/" ("style" | "STYLE") Spnl ">"
InStyleTags =   StyleOpen (!StyleClose .)* StyleClose
StyleBlock =    < InStyleTags >
                @BlankLine*
                { if css? then
                    RDoc::Markup::Raw.new text
                  end }

Inlines  =  ( !@Endline Inline:i { i }
            | @Endline:c !( &{ github? } Ticks3 /[^`\n]*$/ )
          &Inline { c } )+:chunks @Endline?
            { chunks }

Inline  = Str
        | @Endline
        | UlOrStarLine
        | @Space
        | Strong
        | Emph
        | Strike
        | Image
        | Link
        | NoteReference
        | InlineNote
        | Code
        | RawHtml
        | Entity
        | EscapedChar
        | Symbol

Space = @Spacechar+ { " " }

Str = @StartList:a
      < @NormalChar+ > { a = text }
      ( StrChunk:c { a << c } )* { a }

StrChunk = < (@NormalChar | /_+/ &Alphanumeric)+ > { text }

EscapedChar =   "\\" !@Newline < /[:\\`|*_{}\[\]()#+.!><-]/ > { text }

Entity =    ( HexEntity | DecEntity | CharEntity ):a { a }

Endline =   @LineBreak | @TerminalEndline | @NormalEndline

NormalEndline =   @Sp @Newline !@BlankLine !">" !AtxStart
                  !(Line /={1,}|-{1,}/ @Newline)
                  { "\n" }

TerminalEndline = @Sp @Newline @Eof

LineBreak = "  " @NormalEndline { RDoc::Markup::HardBreak.new }

Symbol = < @SpecialChar >
         { text }

# This keeps the parser from getting bogged down on long strings of '*' or '_',
# or strings of '*' or '_' with space on each side:
UlOrStarLine =  (UlLine | StarLine):a { a }
StarLine = < /\*{4,}/ > { text } |
           < @Spacechar /\*+/ &@Spacechar > { text }
UlLine   = < /_{4,}/ > { text } |
           < @Spacechar /_+/ &@Spacechar > { text }

Emph =      EmphStar | EmphUl

Whitespace = @Spacechar | @Newline

EmphStar =  "*" !@Whitespace
            @StartList:a
            ( !"*" Inline:b { a << b }
            | StrongStar:b  { a << b }
            )+
            "*"
            { emphasis a.join }

EmphUl =    "_" !@Whitespace
            @StartList:a
            ( !"_" Inline:b { a << b }
            | StrongUl:b  { a << b }
            )+
            "_"
            { emphasis a.join }

Strong = StrongStar | StrongUl

StrongStar =    "**" !@Whitespace
                @StartList:a
                ( !"**" Inline:b { a << b } )+
                "**"
                { strong a.join }

StrongUl   =    "__" !@Whitespace
                @StartList:a
                ( !"__" Inline:b { a << b } )+
                "__"
                { strong a.join }

Strike = &{ strike? }
         "~~" !@Whitespace
         @StartList:a
         ( !"~~" Inline:b { a << b } )+
         "~~"
         { strike a.join }

# TODO alt text support
Image = "!" ( ExplicitLink | ReferenceLink ):a
        { "rdoc-image:#{a[/\[(.*)\]/, 1]}" }

Link =  ExplicitLink | ReferenceLink | AutoLink

ReferenceLink = ReferenceLinkDouble | ReferenceLinkSingle

ReferenceLinkDouble = Label:content < Spnl > !"[]" Label:label
                      { link_to content, label, text }

ReferenceLinkSingle = Label:content < (Spnl "[]")? >
                      { link_to content, content, text }

ExplicitLink =  Label:l "(" @Sp Source:s Spnl Title @Sp ")"
                { "{#{l}}[#{s}]" }

Source  = ( "<" < SourceContents > ">" | < SourceContents > )
          { text }

SourceContents = ( ( !"(" !")" !">" Nonspacechar )+ | "(" SourceContents ")")*

Title = ( TitleSingle | TitleDouble | "" ):a
        { a }

TitleSingle = "'" ( !( "'" @Sp ( ")" | @Newline ) ) . )* "'"

TitleDouble = "\"" ( !( "\"" @Sp ( ")" | @Newline ) ) . )* "\""

AutoLink = AutoLinkUrl | AutoLinkEmail

AutoLinkUrl =   "<" < /[A-Za-z]+/ "://" ( !@Newline !">" . )+ > ">"
                { text }

AutoLinkEmail = "<" ("mailto:")? < /[\w+.\/!%~$-]+/i "@" ( !@Newline !">" . )+ > ">"
                { "mailto:#{text}" }

Reference = @NonindentSpace !"[]"
              Label:label ":" Spnl RefSrc:link RefTitle @BlankLine+
            { # TODO use title
              reference label, link
              nil
            }

Label = "[" ( !"^" &{ notes? } | &. &{ !notes? } )
        @StartList:a
        ( !"]" Inline:l { a << l } )*
        "]"
        { a.join.gsub(/\s+/, ' ') }

RefSrc = < Nonspacechar+ > { text }

RefTitle = ( RefTitleSingle | RefTitleDouble | RefTitleParens | EmptyTitle )

EmptyTitle = ""

RefTitleSingle = Spnl "'" < ( !( "'" @Sp @Newline | @Newline ) . )* > "'" { text }

RefTitleDouble = Spnl "\"" < ( !("\"" @Sp @Newline | @Newline) . )* > "\"" { text }

RefTitleParens = Spnl "(" < ( !(")" @Sp @Newline | @Newline) . )* > ")" { text }

References = ( Reference | SkipBlock )*

Ticks1 = "`" !"`"
Ticks2 = "``" !"`"
Ticks3 = "```" !"`"
Ticks4 = "````" !"`"
Ticks5 = "`````" !"`"

Code = ( Ticks1 @Sp < (
           ( !"`" Nonspacechar )+ | !Ticks1 /`+/ |
           !( @Sp Ticks1 ) ( @Spacechar | @Newline !@BlankLine )
         )+ > @Sp Ticks1 |
         Ticks2 @Sp < (
           ( !"`" Nonspacechar )+ |
           !Ticks2 /`+/ |
           !( @Sp Ticks2 ) ( @Spacechar | @Newline !@BlankLine )
         )+ > @Sp Ticks2 |
         Ticks3 @Sp < (
           ( !"`" Nonspacechar )+ |
           !Ticks3 /`+/ |
           !( @Sp Ticks3 ) ( @Spacechar | @Newline !@BlankLine )
         )+ > @Sp Ticks3 |
         Ticks4 @Sp < (
           ( !"`" Nonspacechar )+ |
           !Ticks4 /`+/ |
           !( @Sp Ticks4 ) ( @Spacechar | @Newline !@BlankLine )
         )+ > @Sp Ticks4 |
         Ticks5 @Sp < (
           ( !"`" Nonspacechar )+ |
           !Ticks5 /`+/ |
           !( @Sp Ticks5 ) ( @Spacechar | @Newline !@BlankLine )
         )+ > @Sp Ticks5
       )
       { "<code>#{text}</code>" }

RawHtml = < (HtmlComment | HtmlBlockScript | HtmlTag) >
          { if html? then text else '' end }

BlankLine =     @Sp @Newline { "\n" }

Quoted =        "\"" (!"\"" .)* "\"" | "'" (!"'" .)* "'"
HtmlAttribute = (AlphanumericAscii | "-")+ Spnl ("=" Spnl (Quoted | (!">" Nonspacechar)+))? Spnl
HtmlComment =   "<!--" (!"-->" .)* "-->"
HtmlTag =       "<" Spnl "/"? AlphanumericAscii+ Spnl HtmlAttribute* "/"? Spnl ">"
Eof =           !.
Nonspacechar =  !@Spacechar !@Newline .
Sp =            @Spacechar*
Spnl =          @Sp (@Newline @Sp)?
SpecialChar =   /[~*_`&\[\]()<!#\\'"]/ | @ExtendedSpecialChar
NormalChar =    !( @SpecialChar | @Spacechar | @Newline ) .
Digit = [0-9]

%literals         = RDoc::Markdown::Literals
Alphanumeric      = %literals.Alphanumeric
AlphanumericAscii = %literals.AlphanumericAscii
BOM               = %literals.BOM
Newline           = %literals.Newline
Spacechar         = %literals.Spacechar

HexEntity  = /&#x/i < /[0-9a-fA-F]+/ > ";"
             { [text.to_i(16)].pack 'U' }
DecEntity  = "&#"   < /[0-9]+/       > ";"
             { [text.to_i].pack 'U' }
CharEntity = "&"    </[A-Za-z0-9]+/  > ";"
             { if entity = HTML_ENTITIES[text] then
                 entity.pack 'U*'
               else
                 "&#{text};"
               end
             }

NonindentSpace =    / {0,3}/
Indent =            /\t|    /
IndentedLine =      Indent Line
OptionallyIndentedLine = Indent? Line

# StartList starts a list data structure that can be added to with cons:
StartList = &.
            { [] }

Line =  @RawLine:a { a }
RawLine = ( < /[^\r\n]*/ @Newline >
        | < .+ > @Eof ) { text }

SkipBlock = HtmlBlock
          | ( !"#" !SetextBottom1 !SetextBottom2 !@BlankLine @RawLine )+
            @BlankLine*
          | @BlankLine+
          | @RawLine

# Syntax extensions

ExtendedSpecialChar = &{ notes? } ( "^" )

NoteReference = &{ notes? }
                RawNoteReference:ref
                { note_for ref }

RawNoteReference = "[^" < ( !@Newline !"]" . )+ > "]" { text }

# TODO multiple paragraphs for a footnote
Note =          &{ notes? }
                @NonindentSpace RawNoteReference:ref ":" @Sp
                @StartList:a
                RawNoteBlock:i { a.concat i }
                ( &Indent RawNoteBlock:i { a.concat i } )*
                { @footnotes[ref] = paragraph a

                  nil
                }

InlineNote = &{ notes? }
             "^["
             @StartList:a
             ( !"]" Inline:l { a << l } )+
             "]"
             { ref = [:inline, @note_order.length]
               @footnotes[ref] = paragraph a

               note_for ref
             }

Notes = ( Note | SkipBlock )*

RawNoteBlock = @StartList:a
               ( !@BlankLine !RawNoteReference OptionallyIndentedLine:l { a << l } )+
               ( < @BlankLine* > { a << text } )
               { a }

# Markdown extensions added by RDoc follow

CodeFence = &{ github? }
            Ticks3 (@Sp StrChunk:format)? Spnl < (
              ( !"`" Nonspacechar )+ |
              !Ticks3 /`+/ |
              Spacechar |
              @Newline
            )+ > Ticks3 @Sp @Newline*
            { verbatim = RDoc::Markup::Verbatim.new text
              verbatim.format = format.intern if format.instance_of?(String)
              verbatim
            }

Table = &{ github? }
        TableHead:header TableLine:line TableRow+:body
        { table = RDoc::Markup::Table.new(header, line, body) }

TableHead = TableItem2+:items "|"? @Newline
            { items }

TableRow = ( ( TableItem:item1 TableItem2*:items { [item1, *items] } ):row | TableItem2+:row ) "|"? @Newline
       { row }
TableItem2 = "|" TableItem
TableItem = < /(?:\\.|[^|\n])+/ >
        { text.strip.gsub(/\\(.)/, '\1')  }

TableLine = ( ( TableAlign:align1 TableAlign2*:aligns {[align1, *aligns] } ):line | TableAlign2+:line ) "|"? @Newline
        { line }
TableAlign2 = "|" @Sp TableAlign
TableAlign = < /:?-+:?/ > @Sp
              {
                text.start_with?(":") ?
                (text.end_with?(":") ? :center : :left) :
                (text.end_with?(":") ? :right : nil)
              }

DefinitionList = &{ definition_lists? }
                 ( DefinitionListItem+:list )
                 { RDoc::Markup::List.new :NOTE, *list.flatten }

DefinitionListItem = ( DefinitionListLabel+ ):label
                     ( DefinitionListDefinition+ ):defns
                     { list_items = []
                       list_items <<
                         RDoc::Markup::ListItem.new(label, defns.shift)

                       list_items.concat defns.map { |defn|
                         RDoc::Markup::ListItem.new nil, defn
                       } unless list_items.empty?

                       list_items
                     }

DefinitionListLabel = Inline:label @Sp @Newline
                      { label }

DefinitionListDefinition = @NonindentSpace ":" @Space Inlines:a @BlankLine+
                           { paragraph a }