Misaka

Misaka is a CFFI-based binding for Hoedown, a fast markdown processing library written in C. It features a fast HTML renderer and functionality to make custom renderers (e.g. man pages or LaTeX).

See the Changelog for all changes.

Installation

Misaka has been tested on CPython 2.7, 3.2, 3.3, 3.4, 3.5 and PyPy 2.6. CFFI 1.0 or newer is required. This means Misaka will not work on PyPy 2.5 and older versions.

Install with pip:

pip install misaka

Or grab the source from GitHub:

git clone https://github.com/FSX/misaka.git
cd misaka
python setup.py install

Consult the CFFI documentation if you experience problems installing CFFI.

Usage

Very simple example:

import misaka as m
print(m.html('some other text'))

Or:

from misaka import Markdown, HtmlRenderer

rndr = HtmlRenderer()
md = Markdown(rndr)

print(md('some text'))

Here’s a simple example that uses Pygments to highlight code (houdini is used to escape the HTML):

import houdini as h
import misaka as m
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name

class HighlighterRenderer(m.HtmlRenderer):
    def blockcode(self, text, lang):
        # No language given: emit a plain, HTML-escaped code block.
        if not lang:
            return '\n<pre><code>{}</code></pre>\n'.format(
                h.escape_html(text.strip()))

        # Otherwise let Pygments highlight the code for that language.
        lexer = get_lexer_by_name(lang, stripall=True)
        formatter = HtmlFormatter()

        return highlight(text, lexer, formatter)

renderer = HighlighterRenderer()
md = m.Markdown(renderer, extensions=('fenced-code',))

print(md("""
Here is some code:

```python
print(123)
```

More code:

    print(123)
"""))

The listing above subclasses HtmlRenderer and overrides its blockcode() method, which is documented under BaseRenderer. See tests/test_renderer.py for a renderer with all its methods implemented.

Tests

tidy is needed to run the tests. tox can be used to run the tests on all supported Python versions with one command.

Run one of the following commands to install tidy:

apt-get install tidy  # Debian and derivatives
pacman -S tidyhtml    # Arch Linux

And run the tests with:

python setup.py test

It’s also possible to include or exclude tests. -i and -e each accept a comma-separated list of test cases:

# Only run MarkdownConformanceTest_10
python setup.py test -i MarkdownConformanceTest_10

# Or everything except MarkdownConformanceTest_10
python setup.py test -e MarkdownConformanceTest_10

# Or everything except MarkdownConformanceTest_10 and MarkdownConformanceTest_103
python setup.py test -e MarkdownConformanceTest_10,MarkdownConformanceTest_103

-l prints a list of all testcases:

$ python setup.py test -l
[... build output ...]
MarkdownConformanceTest_10
MarkdownConformanceTest_103
BenchmarkLibraries
ArgsToIntTest
CustomRendererTest
SmartypantsTest

And -b runs benchmarks (-i and -e can also be used in combination with -b):

$ python setup.py test -b
[... build output ...]
>> BenchmarkLibraries
test_hoep                     3270         1.00 s/t     305.91 us/op
test_markdown                   20         1.23 s/t      61.44 ms/op
test_markdown2                  10         3.29 s/t     329.34 ms/op
test_misaka                   3580         1.00 s/t     280.01 us/op
test_misaka_classes           3190         1.00 s/t     314.00 us/op
test_mistune                    70         1.04 s/t      14.91 ms/op

Each row shows the benchmark name, the number of repetitions, the total time in seconds (s/t) and the time per operation, that is, per repetition (us/op or ms/op). A benchmark aims to stay within one second: it runs a test for at least ten repetitions and keeps adding ten more while there is time left.
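The timing strategy described above can be sketched roughly as follows. This is an illustration only, not the code used by the test suite: run_benchmark, its budget and batch parameters, and the output format are assumptions made for this example, and time.perf_counter requires Python 3.3 or newer.

```python
import time


def run_benchmark(func, budget=1.0, batch=10):
    """Call func in batches of `batch` until the time budget is used up.

    Hypothetical helper: at least one batch always runs, so the
    repetition count is a multiple of `batch` and never below it.
    """
    repetitions = 0
    start = time.perf_counter()
    while True:
        for _ in range(batch):
            func()
        repetitions += batch
        elapsed = time.perf_counter() - start
        if elapsed >= budget:
            break
    # Total time and time per operation (one repetition), in seconds.
    return repetitions, elapsed, elapsed / repetitions


reps, total, per_op = run_benchmark(lambda: sum(range(1000)), budget=0.05)
print('{:<24}{:>8}{:>10.2f} s/t{:>12.2f} us/op'.format(
    'sum_range', reps, total, per_op * 1e6))
```

Because a full batch always completes before the clock is checked, a slow test can overshoot the budget, which would be consistent with totals above one second in the output shown earlier.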

API

Extensions

Name                     Constant
tables                   EXT_TABLES
fenced-code              EXT_FENCED_CODE
footnotes                EXT_FOOTNOTES
autolink                 EXT_AUTOLINK
strikethrough            EXT_STRIKETHROUGH
underline                EXT_UNDERLINE
highlight                EXT_HIGHLIGHT
quote                    EXT_QUOTE
superscript              EXT_SUPERSCRIPT
math                     EXT_MATH
no-intra-emphasis        EXT_NO_INTRA_EMPHASIS
space-headers            EXT_SPACE_HEADERS
math-explicit            EXT_MATH_EXPLICIT
disable-indented-code    EXT_DISABLE_INDENTED_CODE

HTML render flags

Name         Constant
skip-html    HTML_SKIP_HTML
escape       HTML_ESCAPE
hard-wrap    HTML_HARD_WRAP
use-xhtml    HTML_USE_XHTML

Functions

misaka.html(text, extensions=0, render_flags=0)

Convert markdown text to HTML.

extensions can be a list or tuple of extensions (e.g. ('fenced-code', 'footnotes', 'strikethrough')) or an integer (e.g. EXT_FENCED_CODE | EXT_FOOTNOTES | EXT_STRIKETHROUGH).

render_flags can be a list or tuple of flags (e.g. ('skip-html', 'hard-wrap')) or an integer (e.g. HTML_SKIP_HTML | HTML_HARD_WRAP).
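To show why the two argument forms are interchangeable, here is a minimal sketch of how a tuple of names can be reduced to the same integer as OR-ing the constants. The helper to_flags, the EXTENSION_NAMES mapping and the bit values are placeholders invented for this example; they are not Misaka's internals, and real code should use the constants exported by misaka.

```python
# Placeholder bit values for illustration only; they are NOT guaranteed
# to match the values of the constants that misaka exports.
EXT_TABLES = 1 << 0
EXT_FENCED_CODE = 1 << 1
EXT_FOOTNOTES = 1 << 2
EXT_STRIKETHROUGH = 1 << 4

# Hypothetical name-to-constant mapping, mirroring the Extensions table.
EXTENSION_NAMES = {
    'tables': EXT_TABLES,
    'fenced-code': EXT_FENCED_CODE,
    'footnotes': EXT_FOOTNOTES,
    'strikethrough': EXT_STRIKETHROUGH,
}


def to_flags(value, names):
    """Accept an integer bitmask or an iterable of names; return an integer."""
    if isinstance(value, int):
        return value
    flags = 0
    for name in value:
        flags |= names[name]  # an unknown name raises KeyError
    return flags


by_name = to_flags(('fenced-code', 'footnotes', 'strikethrough'), EXTENSION_NAMES)
by_constant = to_flags(EXT_FENCED_CODE | EXT_FOOTNOTES | EXT_STRIKETHROUGH,
                       EXTENSION_NAMES)
print(by_name == by_constant)  # True
```

The same idea applies to render_flags, with the names and constants from the HTML render flags table.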

misaka.smartypants(text)

Transforms sequences of characters into HTML entities.

Markdown                       HTML                    Result
's (s, t, m, d, re, ll, ve)    &rsquo;s                ’s
"Quotes"                       &ldquo;Quotes&rdquo;    “Quotes”
---                            &mdash;                 —
--                             &ndash;                 –
...                            &hellip;                …
. . .                          &hellip;                …
(c)                            &copy;                  ©
(r)                            &reg;                   ®
(tm)                           &trade;                 ™
3/4                            &frac34;                ¾
1/2                            &frac12;                ½
1/4                            &frac14;                ¼
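The substitutions in the table can be approximated in plain Python to make the mapping concrete. This is a simplified sketch and not how Hoedown implements smartypants: simple_smartypants and RULES are names invented for this example, and edge cases such as nested quotes or text inside HTML tags are ignored.

```python
import re

# Rules are applied in order. '---' is listed before '--' so that an
# em dash is not consumed as an en dash followed by a hyphen.
RULES = [
    (re.compile(r'"([^"]*)"'), r'&ldquo;\1&rdquo;'),
    (re.compile(r"'(s|t|m|d|re|ll|ve)\b"), r'&rsquo;\1'),
    (re.compile(r'---'), '&mdash;'),
    (re.compile(r'--'), '&ndash;'),
    (re.compile(r'\.\.\.|\. \. \.'), '&hellip;'),
    (re.compile(r'\(c\)'), '&copy;'),
    (re.compile(r'\(r\)'), '&reg;'),
    (re.compile(r'\(tm\)'), '&trade;'),
    (re.compile(r'\b3/4\b'), '&frac34;'),
    (re.compile(r'\b1/2\b'), '&frac12;'),
    (re.compile(r'\b1/4\b'), '&frac14;'),
]


def simple_smartypants(text):
    """Apply each substitution rule in turn (illustrative approximation)."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(simple_smartypants('"Quotes" -- it\'s 1/2 done... (c)'))
# &ldquo;Quotes&rdquo; &ndash; it&rsquo;s &frac12; done&hellip; &copy;
```

In real use, call misaka.smartypants(text) instead; the sketch only mirrors the rows of the table.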

Classes

class misaka.Markdown(renderer, extensions=0)

Parses markdown text and renders it using the given renderer.

extensions can be a list or tuple of extensions (e.g. ('fenced-code', 'footnotes', 'strikethrough')) or an integer (e.g. EXT_FENCED_CODE | EXT_FOOTNOTES | EXT_STRIKETHROUGH).

__call__(text)

Parses and renders markdown text.

class misaka.HtmlRenderer(flags=0, nesting_level=0)

A wrapper for the HTML renderer that’s included in Hoedown.

flags can be a list or tuple of flags (e.g. ('skip-html', 'hard-wrap')) or an integer (e.g. HTML_SKIP_HTML | HTML_HARD_WRAP).

nesting_level limits what’s included in the table of contents. The default value is 0, no headers.

An instance of HtmlRenderer cannot be shared between multiple Markdown instances, because it carries state that’s changed by the Markdown instance.

class misaka.HtmlTocRenderer(nesting_level=6)

A wrapper for the HTML table of contents renderer that’s included in Hoedown.

nesting_level limits what’s included in the table of contents. The default value is 6, all headers.

An instance of HtmlTocRenderer cannot be shared between multiple Markdown instances, because it carries state that’s changed by the Markdown instance.

class misaka.BaseRenderer
blockcode(text, lang='')

lang contains the language when fenced code blocks are enabled and a language is defined in the code block.

blockquote(content)
header(content, level)

level can be a number from 1 to 6.

hrule()
list(content, is_ordered, is_block)
listitem(content, is_ordered, is_block)
paragraph(content)
table(content)

Depends on the tables extension.

table_header(content)

Depends on the tables extension.

table_body(content)

Depends on the tables extension.

table_row(content)

Depends on the tables extension.

table_cell(content, align, is_header)

Depends on the tables extension.

align can be empty, center, left or right.

footnotes(content)

Depends on the footnotes extension.

footnote_def(content, num)

Depends on the footnotes extension.

footnote_ref(num)

Depends on the footnotes extension.

blockhtml(text)
autolink(link, is_email)

Depends on the autolink extension.

codespan(text)
double_emphasis(content)
emphasis(content)
underline(content)

Depends on the underline extension.

highlight(content)

Depends on the highlight extension.

quote(content)

Depends on the quote extension.

image(link, title='', alt='')
linebreak()
triple_emphasis(content)
strikethrough(content)

Depends on the strikethrough extension.

superscript(content)

Depends on the superscript extension.

math(text, displaymode)

Depends on the math extension.

displaymode can be 0 or 1. This is how HtmlRenderer handles it:

if displaymode == 1:
    return '\\[{}\\]'.format(text)
else:  # displaymode == 0
    return '\\({}\\)'.format(text)

raw_html(text)
entity(text)
normal_text(text)
doc_header(inline_render)
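As a rough illustration of what a few of these callbacks might return in a non-HTML renderer, here is a sketch that produces LaTeX. It is written as a plain class so it can be run without Misaka installed; in real use the methods would live on a subclass of misaka.BaseRenderer. The class name LatexCallbacks and the SECTION_COMMANDS mapping are assumptions for this example, LaTeX special characters are not escaped, and a complete renderer would need more callbacks than shown.

```python
# Hypothetical mapping from header level to a LaTeX sectioning command.
SECTION_COMMANDS = ('section', 'subsection', 'subsubsection',
                    'paragraph', 'subparagraph')


class LatexCallbacks(object):
    """Callback bodies only; signatures follow the BaseRenderer list above."""

    def header(self, content, level):
        # level runs from 1 to 6; levels beyond the last command reuse it.
        index = min(level, len(SECTION_COMMANDS)) - 1
        return '\\{}{{{}}}\n\n'.format(SECTION_COMMANDS[index], content)

    def paragraph(self, content):
        return '{}\n\n'.format(content)

    def emphasis(self, content):
        return '\\emph{{{}}}'.format(content)

    def double_emphasis(self, content):
        return '\\textbf{{{}}}'.format(content)

    def list(self, content, is_ordered, is_block):
        env = 'enumerate' if is_ordered else 'itemize'
        return '\\begin{{{0}}}\n{1}\\end{{{0}}}\n\n'.format(env, content)

    def listitem(self, content, is_ordered, is_block):
        return '\\item {}\n'.format(content.strip())


r = LatexCallbacks()
items = r.listitem('one', False, False) + r.listitem('two', False, False)
print(r.header('Title', 1) + r.list(items, False, False))
```

Each callback receives already-rendered child content as a string and returns a string, which is why the list callback can simply wrap the concatenated output of listitem.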