Because sometimes you don't need HTML.
I needed a small, reasonably performant, extensible and reliable tool to convert Markdown (GFM) to JSON. Every small Markdown parser I found couldn't generate JSON, and every module capable of generating JSON was insanely bloated. Therefore I decided to make my own. MDJ is mostly based on marked, with many improvements.
import MDJ from 'mdj'
import source from 'source.md'
const mdj = MDJ()
const parsedSource = mdj.parse(source)
// OR
import { parse } from 'mdj'
import source from 'source.md'
const parsedSource = parse(source) // Note that this is less performant.
The MDJ constructor accepts an optional settings argument:
interface IMDJOptions {
html?: boolean // enables HTML support, false by default
}
The parser returns an array of tokens. Each token contains at least one property, type, which can be heading, paragraph, code, etc.
Other content of tokens may vary. For example, tokens of types text, code, codeblock and html have a value property, which represents the raw content of that token.
const md = '`console.log("test")`'
parse(md)
/*
outputs
[
{
type: 'paragraph',
children: [
{
type: 'code',
value: 'console.log("test")'
}
]
}
]
*/
As you may have noticed, other tokens may have a children property, which contains another array of tokens.
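Because tokens nest through children, consuming the output usually means walking a tree. Here is a minimal sketch that relies only on the properties described above (type, an optional value, an optional children array). The Token interface and the collectValues helper are illustrative names, not part of MDJ's API.

```typescript
// Hypothetical token shape inferred from the description above;
// these are not MDJ's actual typings.
interface Token {
  type: string
  value?: string
  children?: Token[]
}

// Recursively collects the `value` of every token, depth-first.
function collectValues(tokens: Token[]): string[] {
  const values: string[] = []
  for (const token of tokens) {
    if (token.value !== undefined) values.push(token.value)
    if (token.children) values.push(...collectValues(token.children))
  }
  return values
}

// The parser output from the example above, written out by hand.
const tokens: Token[] = [
  { type: 'paragraph', children: [{ type: 'code', value: 'console.log("test")' }] }
]

// Yields a single entry: the raw content of the inline code token.
console.log(collectValues(tokens))
```

The same traversal pattern should work for any renderer or transformer built on top of the token array, as long as nested tokens are only ever stored under children.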
Parsers are divided into two parts:
- Block parsers, e.g.: paragraph, lists, tables, block quotes etc.
- Inline parsers, e.g.: links, checkboxes, images, etc.
To add a new parser rule, use the corresponding instance method:
const blockParser = (source: string) => {/* */}
const inlineParser = (source: string) => {/* */}
const priority = 300
mdj.useBlockParser(blockParser, priority)
mdj.useInlineParser(inlineParser, priority)
Before parsing starts, all rules are sorted by priority. You can check the priorities of the default parsers in the source (./src/core/MDJ.ts).
Each parser receives from one to three parameters: source, which is the not-yet-parsed part of the initial source, and one or two lexers (block-level and inline-level lexers for block-level parsers, and only the inline-level lexer for inline-level parsers). The passed lexers use the same MDJ instance and can be used to parse whatever needs to be parsed inside your parser.
A parser should return null if it did nothing, or an object containing a new token, which will be added to the JSON, and the new source. See examples in ./src/parsers.
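To make the contract concrete, here is a rough sketch of a custom block parser for a made-up `!!! text` note syntax. The field names token and source on the returned object are assumptions, as are the ParserResult interface and the note token type; the real shapes are in ./src/parsers. The sketch also ignores the lexer parameters, since it has no nested content to parse.

```typescript
// Hypothetical shapes: the field names `token` and `source` are assumptions,
// check ./src/parsers for the real ones.
interface Token {
  type: string
  value?: string
  children?: Token[]
}

interface ParserResult {
  token: Token
  source: string
}

// Matches a line of the form `!!! some text` at the start of the remaining source.
const notePattern = /^!!! ([^\n]*)(?:\n|$)/

const noteParser = (source: string): ParserResult | null => {
  const match = notePattern.exec(source)
  if (match === null) return null // nothing recognised, let other parsers try

  return {
    token: { type: 'note', value: match[1] },
    // Hand back everything this parser did not consume.
    source: source.slice(match[0].length)
  }
}
```

Under those assumptions, such a parser would be registered with mdj.useBlockParser(noteParser, priority), with the priority chosen relative to the default parsers.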
- Add support for reference links and images
- Add support for HTML
- Add support for checkbox lists
- Pass all tests
Performance:
- Add public benchmarks
- Move to rollup
- Try prepack.io - tried, no benefit
MDJ is released under the MIT license.