Parses (R) Markdown file content according to the CommonMark specification and returns it as an XML parse tree.

Usage

md_to_xml(
  md,
  smart_punctuation = FALSE,
  hardbreaks = FALSE,
  normalize = TRUE,
  sourcepos = FALSE,
  footnotes = TRUE,
  extensions = c("strikethrough", "table", "tasklist"),
  eol = c("LF", "CRLF", "CR", "LFCR"),
  strip_xml_ns = TRUE
)

Arguments

md

(R) Markdown file content as a character scalar.

smart_punctuation

Whether or not to enable Pandoc's smart extension which converts straight quotes to curly quotes, --- to an em-dash (—), -- to an en-dash (–), and ... to ellipses (…). It also replaces regular spaces after certain abbreviations such as Mr. with non-breaking spaces.

hardbreaks

Whether or not to treat newlines as hard line breaks.

normalize

Whether or not to consolidate adjacent text nodes.

sourcepos

Whether or not to include source position attributes in the output.

footnotes

Whether or not to parse footnotes.

extensions

Enables GitHub extensions. Can be TRUE (all), FALSE (none), or a character vector with a subset of the available extensions.

eol

End of line (EOL) control character sequence. One of:

  • "LF" for the line feed (LF) character ("\n"). The standard on Unix and Unix-like systems (Linux, macOS, *BSD, etc.) and the default.

  • "CRLF" for the carriage return + line feed (CR+LF) character sequence ("\r\n"). The standard on Microsoft Windows, DOS and some other systems.

  • "CR" for the carriage return (CR) character ("\r"). The standard on classic Mac OS and some other antiquated systems.

  • "LFCR" for the line feed + carriage return (LF+CR) character sequence ("\n\r"). The standard on RISC OS and some other exotic systems.

strip_xml_ns

Whether or not to remove the default XML namespace (d1) assigned by commonmark::markdown_xml().
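
To illustrate why stripping the default namespace can be convenient, the following is a minimal sketch that calls commonmark and xml2 directly. It is not the implementation of md_to_xml(); the input string and the variable names are made up for illustration, and it assumes both packages are installed.

```r
# Hypothetical input, for illustration only
md <- "# A title\n\nSome prose."

# Render to CommonMark XML text, then parse it with xml2
xml_txt <- commonmark::markdown_xml(md)
doc <- xml2::read_xml(xml_txt)

# xml2 registers the default namespace under the prefix "d1"
ns <- xml2::xml_ns(doc)

# With the namespace in place, an unprefixed XPath query matches nothing,
# while the prefixed query finds the heading
n_unprefixed <- length(xml2::xml_find_all(doc, "//heading"))
n_prefixed <- length(xml2::xml_find_all(doc, "//d1:heading", ns = ns))

# xml_ns_strip() removes the default namespace in place,
# after which the unprefixed query matches
xml2::xml_ns_strip(doc)
n_stripped <- length(xml2::xml_find_all(doc, "//heading"))
```

With the namespace stripped, XPath expressions can be written without the d1: prefix, which is presumably the motivation for the strip_xml_ns = TRUE default.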

Value

An xml_document.

See also

Other CommonMark parsing functions: md_xml_subnode_ix(), xml_to_md()

Examples

"# A title

Some prose.

## A subtitle

More prose.

## Another subtitle

Out of prose here.

### A sub-subtitle

I'm dug in.

# Another title

A last word." |> pal::md_to_xml()
#> {xml_document}
#> <document>
#>  [1] <heading level="1">\n  <text xml:space="preserve">A title</text>\n</heading>
#>  [2] <paragraph>\n  <text xml:space="preserve">Some prose.</text>\n</paragraph>
#>  [3] <heading level="2">\n  <text xml:space="preserve">A subtitle</text>\n</heading>
#>  [4] <paragraph>\n  <text xml:space="preserve">More prose.</text>\n</paragraph>
#>  [5] <heading level="2">\n  <text xml:space="preserve">Another subtitle</text>\n</heading>
#>  [6] <paragraph>\n  <text xml:space="preserve">Out of prose here.</text>\n</paragraph>
#>  [7] <heading level="3">\n  <text xml:space="preserve">A sub-subtitle</text>\n</heading>
#>  [8] <paragraph>\n  <text xml:space="preserve">I'm dug in.</text>\n</paragraph>
#>  [9] <heading level="1">\n  <text xml:space="preserve">Another title</text>\n</heading>
#> [10] <paragraph>\n  <text xml:space="preserve">A last word.</text>\n</paragraph>
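
The returned xml_document can be queried with the xml2 package. As a hedged sketch, assuming the default strip_xml_ns = TRUE so that no namespace prefix is needed, the heading outline of a document could be collected as follows. The input string and the names headings and outline are illustrative only.

```r
# Parse a small document with the default arguments
doc <- pal::md_to_xml("# A title\n\nSome prose.\n\n## A subtitle\n\nMore prose.")

# Select all heading nodes anywhere in the tree
headings <- xml2::xml_find_all(doc, "//heading")

# Collect the level attribute and the text of each heading
outline <- data.frame(
  level = as.integer(xml2::xml_attr(headings, "level")),
  title = xml2::xml_text(headings, trim = TRUE)
)
```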