Markdown Syntax Reference: CommonMark, GFM, and the Gotchas
31 March, 2026 Web
Markdown is everywhere - READMEs, wikis, issue trackers, blog engines, documentation sites, chat tools. If you write technical content, you write Markdown daily. But "Markdown" is not one thing: there is the original Gruber spec (2004), the CommonMark standardisation effort, GitHub Flavored Markdown (GFM), and dozens of platform-specific extensions that each break in different ways.
This is a reference for developers who already know the basics and want to understand the spec differences, the edge cases, and what actually happens when your renderer is not the one you tested against. Use the Markdown previewer to test any snippet as you read.
The Three Specs You Need to Know
Original Markdown (Gruber, 2004)
John Gruber's original Markdown.pl script defined Markdown as a syntax for converting text to HTML. The spec was intentionally loose - Gruber wanted readability of source over precise rules. The problem: ambiguous cases had no canonical answer, so every implementation resolved them differently. Indentation rules for nested lists, emphasis with mixed delimiters, handling of blank lines inside block elements - all of these produced inconsistent output.
CommonMark
CommonMark (2014, led by John MacFarlane) is a rigorous specification with a test suite of over 600 examples covering every ambiguous case. Its goals:
- One canonical result for every input
- A precise, unambiguous description of how block and inline elements are parsed
- Reference implementations in C (cmark) and JavaScript (commonmark.js)
CommonMark resolved the major ambiguities: how many spaces constitute a list continuation, how emphasis with underscores works at word boundaries, how link titles are parsed. Most modern Markdown libraries implement CommonMark as their base.
GitHub Flavored Markdown (GFM)
GFM is a strict superset of CommonMark. GitHub published its spec in 2017 after previously having an undocumented dialect. GFM adds:
- Tables
- Task lists
- Strikethrough (~~text~~)
- Autolinks (bare URLs become clickable)
- Disallowed raw HTML tags (GitHub strips <script>, <iframe>, and others for security)
GitLab, Gitea, and many other platforms implement GFM or something very close to it.
Headings
ATX Style (Recommended)
# H1
## H2
### H3
#### H4
##### H5
###### H6
The # must be followed by a space. A closing sequence of # characters is allowed but stripped - ## Heading ## renders the same as ## Heading. ATX headings are preferred because they are unambiguous, scannable in source, and work at all six levels.
Setext Style
Heading Level 1
===============
Heading Level 2
---------------
Setext only supports two levels. The underline must contain at least one = or - character. It is used in older documents and some static site generators, but ATX has won in practice.
Heading IDs and Anchor Links
GitHub, GitLab, and most renderers auto-generate id attributes on headings by:
- Converting the heading text to lowercase
- Replacing spaces with hyphens
- Removing non-alphanumeric characters (except hyphens)
So ## Platform Differences becomes id="platform-differences" and you can link to it with [see below](#platform-differences). This is not in the CommonMark spec - it is a renderer feature, so the exact rules vary. Pandoc generates IDs in a similar way; Notion does not use # fragment links at all.
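The three steps above can be sketched in a few lines of Python. This is a rough approximation, not any renderer's actual algorithm: the function names are made up for illustration, keeping underscores and non-ASCII letters is an assumption, and the numeric suffix for duplicate headings mirrors what GitHub is commonly observed to do.

```python
import re
from collections import Counter


def slugify(heading: str) -> str:
    """Approximate a heading anchor using the three steps listed above."""
    slug = heading.strip().lower()       # 1. lowercase
    slug = slug.replace(" ", "-")        # 2. spaces become hyphens
    # 3. drop everything except word characters and hyphens
    #    (assumption: \w keeps underscores and Unicode letters)
    return re.sub(r"[^\w-]", "", slug)


def unique_slugs(headings):
    """Append -1, -2, ... to repeated slugs (assumed GitHub-like behaviour)."""
    seen = Counter()
    result = []
    for heading in headings:
        base = slugify(heading)
        count = seen[base]
        result.append(base if count == 0 else f"{base}-{count}")
        seen[base] += 1
    return result


if __name__ == "__main__":
    print(slugify("Platform Differences"))
    print(unique_slugs(["Usage", "Usage", "Install"]))
```

If anchors matter to you, check the generated id in the rendered page rather than trusting a local approximation like this one.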
Some renderers (Pandoc, kramdown) let you set custom IDs explicitly:
## My Section {#custom-id}
Emphasis
The Four Forms
*italic* or _italic_
**bold** or __bold__
***bold and italic*** or ___bold and italic___
~~strikethrough~~ (GFM only)
Asterisk vs Underscore - the CommonMark Rule
This is where the original Markdown spec was most ambiguous. CommonMark defines precise rules:
- Left-flanking delimiter run - can open emphasis
- Right-flanking delimiter run - can close emphasis
Underscores have an additional restriction: a run of underscores with alphanumeric characters on both sides can neither open nor close emphasis, so underscores never produce emphasis inside a word. This means:
foo_bar_baz → no emphasis (underscores are word-internal)
foo*bar*baz → "bar" is italic
__strong__ → works (word boundary)
__bold text__ here → works (whitespace outside the delimiters)
foo__bar__baz → no emphasis (word-internal)
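In practice, asterisks are the only delimiter that works inside words, so using asterisks everywhere avoids surprises. Mixing asterisks and underscores creates nesting problems:
*foo _bar* baz_ → CommonMark italicises "foo _bar" and leaves both underscores literal; other renderers differ, avoid this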
Nesting Rules
Nesting works when delimiters are properly matched:
***bold and italic*** → works
**bold with *italic* inside** → works
*italic with **bold** inside* → works
But this breaks:
*foo **bar* baz** → undefined behaviour across renderers
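The CommonMark spec has a detailed algorithm (built around a "delimiter stack") that resolves these cases deterministically, but renderers that predate or deviate from CommonMark disagree. The safest approach is to close every inner span before its outer span and never interleave delimiters across nesting boundaries.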
Lists
Ordered and Unordered
- Item one
- Item two
- Item three
1. First
2. Second
3. Third
For ordered lists, CommonMark uses the first item's number as the start value and ignores the numbers on subsequent items: 1. 1. 1. renders as 1. 2. 3. This means you can number every item 1. and let the renderer handle the sequence.
Unordered lists can use -, *, or + as markers. You cannot mix them in one list (mixing starts a new list).
Tight vs Loose Lists
This is one of the most important and least-known distinctions in CommonMark.
A tight list has no blank lines between items:
- Apple
- Banana
- Cherry
Renders as <li>Apple</li> - content directly in <li>.
A loose list has blank lines between items:
- Apple

- Banana

- Cherry
Renders as <li><p>Apple</p></li> - content wrapped in <p>. This changes spacing significantly in HTML output. If your rendered list items have unexpected paragraph spacing, you have a loose list.
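A list becomes loose if any of its constituent list items are separated by blank lines, or if any item directly contains two block-level elements with a blank line between them. Looseness applies to the whole list: a single blank line between two items wraps every item in <p>.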
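To make the tight/loose distinction concrete, here is a deliberately simplified Python sketch. It only handles a flat bullet list with single-line items; nested lists, code blocks inside items, and ordered lists are ignored, and the function names are illustrative rather than taken from any library.

```python
import html

BULLETS = ("- ", "* ", "+ ")


def is_loose(list_source: str) -> bool:
    """True if any blank line sits between the first and last list line."""
    lines = list_source.strip("\n").split("\n")
    return any(not line.strip() for line in lines)


def render_flat_list(list_source: str) -> str:
    """Render a flat bullet list, wrapping items in <p> when the list is loose."""
    loose = is_loose(list_source)
    out = ["<ul>"]
    for line in list_source.splitlines():
        if line.startswith(BULLETS):
            text = html.escape(line[2:].strip())
            out.append(f"<li><p>{text}</p></li>" if loose else f"<li>{text}</li>")
    out.append("</ul>")
    return "\n".join(out)


if __name__ == "__main__":
    print(render_flat_list("- Apple\n- Banana\n- Cherry"))
    print(render_flat_list("- Apple\n\n- Banana\n- Cherry"))
```

Note that the second call has only one blank line, yet every item gets a <p>, which is the behaviour that usually surprises people.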
Nested Lists
CommonMark requires nested list content to be indented at least as far as the parent item's content: the width of the marker plus the spaces that follow it, plus any indentation of the marker itself.
- Item one
  - Nested under one (2-space indent)
  - Also nested
- Item two
    - Nested under two (4-space indent also works)
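The rule is: continuation lines must be indented at least as far as the column where the item's content starts. For a - marker followed by one space (2 columns), indent by 2. For 1. followed by one space (3 columns), indent by 3. Four spaces is a safe default for any marker up to four columns wide, which covers bullets and numbers up to 99.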
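The indentation rule can be computed rather than memorised. The sketch below is a hypothetical helper, not part of any Markdown library, that returns the column where an item's content starts. It assumes spaces rather than tabs, and it follows the CommonMark detail that five or more spaces after the marker count as one space followed by indented code.

```python
import re

# Up to 3 spaces of indentation, a bullet or a 1-9 digit number with . or ),
# then at least one space before the content.
_MARKER = re.compile(r"^( {0,3})([-+*]|\d{1,9}[.)])( +)")


def content_column(line: str) -> int:
    """Return the minimum indent for continuation lines and nested blocks."""
    match = _MARKER.match(line)
    if match is None:
        raise ValueError("not a list item")
    indent, marker, spaces = match.groups()
    # Five or more spaces after the marker: only one belongs to the marker.
    gap = len(spaces) if len(spaces) <= 4 else 1
    return len(indent) + len(marker) + gap


if __name__ == "__main__":
    for sample in ("- Item", "1. First", "10. Tenth", "  - Nested"):
        print(repr(sample), content_column(sample))
```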
Links and Images
Inline Links
[link text](https://example.com)
[link text](https://example.com "Optional title")
[link text](https://example.com 'Single-quoted title also works')
The title appears as a tooltip on hover. Titles must be quoted (double, single, or parenthesized).
Reference-Style Links
See the [CommonMark spec][cm-spec] for details.
[cm-spec]: https://spec.commonmark.org "CommonMark Specification"
Reference definitions can be placed anywhere in the document. They do not render - they are link definitions only. Reference labels are case-insensitive: [CM-SPEC] matches [cm-spec].
This style is useful for long URLs that would make source hard to read, and for URLs used multiple times.
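Label matching is a little broader than plain case-insensitivity: CommonMark normalises a label by case-folding it, trimming it, and collapsing internal whitespace, and the first matching definition wins. A minimal Python sketch of that lookup, with illustrative function names, might look like this:

```python
import re


def normalize_label(label: str) -> str:
    """Trim, collapse internal whitespace to one space, and case-fold."""
    collapsed = re.sub(r"[ \t\r\n]+", " ", label.strip(" \t\r\n"))
    return collapsed.casefold()


def build_definitions(pairs):
    """Map normalised labels to URLs; the first definition takes precedence."""
    definitions = {}
    for label, url in pairs:
        definitions.setdefault(normalize_label(label), url)
    return definitions


if __name__ == "__main__":
    defs = build_definitions([
        ("cm-spec", "https://spec.commonmark.org"),
        ("CM-Spec", "https://example.com/ignored-duplicate"),
    ])
    print(defs[normalize_label("CM-SPEC")])
```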
Autolinks
<https://example.com>
<user@example.com>
Angle brackets create autolinks - the text becomes a clickable link. GFM also renders bare URLs without angle brackets, but CommonMark does not.
Images
![Alt text](image.png)
![Alt text](image.png "Optional title")


![Alt text][img-ref]
[img-ref]: image.png "Reference image"
The alt text matters for accessibility and SEO, and the title is optional. An empty alt, as in ![](image.png), is valid for decorative images.
Code
Inline Code
Use `backticks` for inline code.
To include a literal backtick inside inline code, use a double-backtick delimiter:
`` Use `backticks` like this ``
A tick at the start: `` `code` ``
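The opening and closing backtick runs must be the same length. If the content both begins and ends with a space (and is not made up entirely of spaces), one space is stripped from each end, which is what lets you pad a leading or trailing backtick.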
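When you generate Markdown programmatically, the delimiter length can be chosen automatically. The following Python sketch (the function name is illustrative) picks a backtick run one longer than the longest run inside the text, and pads with spaces whenever the renderer would otherwise merge or strip something:

```python
import re


def inline_code(text: str) -> str:
    """Wrap text in an inline code span that survives embedded backticks."""
    runs = re.findall(r"`+", text)
    longest = max((len(run) for run in runs), default=0)
    delimiter = "`" * (longest + 1)
    touches_backtick = text.startswith("`") or text.endswith("`")
    # A space on both ends would be stripped by the renderer, so pad those too.
    would_lose_spaces = (
        text.startswith(" ") and text.endswith(" ") and text.strip(" ") != ""
    )
    if touches_backtick or would_lose_spaces:
        text = f" {text} "
    return f"{delimiter}{text}{delimiter}"


if __name__ == "__main__":
    print(inline_code("x = 1"))
    print(inline_code("Use `backticks` like this"))
    print(inline_code("`code`"))
```

Using one more backtick than the longest run is a conservative choice; any run length that does not occur inside the text would also work.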
Fenced Code Blocks
Triple backticks or triple tildes delimit fenced blocks:
```javascript
const x = 1;
```
~~~python
def hello():
    pass
~~~
The language identifier after the opening fence is called the "info string". CommonMark does not mandate any particular treatment of it, but renderers conventionally emit its first word as a class on the code element (<code class="language-javascript">), and syntax highlighters like Prism and highlight.js use that class to select a grammar.
You can nest fenced blocks by using more backticks on the outer fence:
````markdown
```javascript
const x = 1;
```
````
The closing fence must use the same character as the opening fence and have at least as many of them.
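The same idea applies when wrapping arbitrary content in a fence from a script. This Python sketch, with hypothetical helper names, scans for the longest fence-like run at the start of any line and makes the outer fence one character longer, with a minimum of three:

```python
import re


def fence_for(content: str, char: str = "`") -> str:
    """Return a fence long enough to enclose content safely."""
    pattern = rf"^ {{0,3}}({re.escape(char)}+)"
    runs = re.findall(pattern, content, flags=re.MULTILINE)
    longest = max((len(run) for run in runs), default=0)
    return char * max(3, longest + 1)


def wrap(content: str, info: str = "") -> str:
    """Wrap content in a fenced block with an optional info string."""
    fence = fence_for(content)
    body = content.rstrip("\n")
    return f"{fence}{info}\n{body}\n{fence}"


if __name__ == "__main__":
    inner = "`" * 3 + "javascript\nconst x = 1;\n" + "`" * 3
    print(wrap(inner, "markdown"))
```

Switching to tildes for the outer fence is an alternative when the content only contains backtick fences.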
Tables (GFM)
Tables are a GFM extension, not in CommonMark.
| Column A | Column B | Column C |
| :------- | :------: | -------: |
| left | center | right |
| aligned | aligned | aligned |
Alignment is controlled by colons in the separator row:
- :--- left-aligned (default)
- :---: centre-aligned
- ---: right-aligned
The outer pipes are optional but recommended for readability. Cells must be single-line. To include a literal pipe inside a cell, escape it:
| Code | Output |
| ---------- | ----------- |
| `a \| b` | `a | b` |
Tables must have a header row and a separator row. There is no way to span cells across rows or columns in GFM tables - use raw HTML for that.
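If you build tables from data, escaping pipes and keeping cells on one line is easy to get wrong by hand. Here is a small Python sketch of a table writer; the function names and the align keywords ("left", "center", "right") are choices made for this example, not part of GFM.

```python
_SEPARATORS = {"left": ":---", "center": ":---:", "right": "---:", None: "---"}


def escape_cell(text: str) -> str:
    """Escape pipes and flatten newlines so the cell stays on one line."""
    return text.replace("|", "\\|").replace("\n", " ")


def gfm_table(headers, rows, align=None):
    """Build a GFM table string from headers, rows, and optional alignment."""
    align = align or [None] * len(headers)

    def row(cells):
        return "| " + " | ".join(escape_cell(str(cell)) for cell in cells) + " |"

    lines = [row(headers)]
    lines.append("| " + " | ".join(_SEPARATORS[a] for a in align) + " |")
    lines.extend(row(r) for r in rows)
    return "\n".join(lines)


if __name__ == "__main__":
    print(gfm_table(["Code", "Meaning"], [["a | b", "a or b"]], ["left", "right"]))
```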
Task Lists (GFM)
- [x] Write the article
- [x] Review examples
- [ ] Publish
- [ ] Update index
The x is case-insensitive. GitHub renders these as actual checkboxes in issues and PRs (and they are clickable in issue bodies). In README files they render as visual checkboxes but are not interactive. Most GFM-compatible renderers produce <input type="checkbox"> elements.
Task lists must be list items. They do not work in other block types.
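Because task items are ordinary list items with a [ ] or [x] prefix, they are easy to pick out of a document, for example to report progress in CI. The regular expression in this Python sketch is an approximation of the GFM rule and the function name is illustrative; it does not check that the line really sits inside a list or outside a code block.

```python
import re

# A list marker, whitespace, a checkbox, whitespace, then the task text.
_TASK = re.compile(r"^\s*(?:[-+*]|\d{1,9}[.)])\s+\[([ xX])\]\s+(.*)$")


def parse_tasks(markdown: str):
    """Return (done, text) tuples for every task list item found."""
    tasks = []
    for line in markdown.splitlines():
        match = _TASK.match(line)
        if match:
            tasks.append((match.group(1).lower() == "x", match.group(2)))
    return tasks


if __name__ == "__main__":
    source = "- [x] Write the article\n- [X] Review examples\n- [ ] Publish"
    tasks = parse_tasks(source)
    done = sum(1 for finished, _ in tasks if finished)
    print(f"{done}/{len(tasks)} complete")
```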
Escaping
Backslash Escaping
A backslash before any ASCII punctuation character escapes it:
\*not italic\*
\# not a heading
\[not a link\](url)
\`not code\`
The full set of escapable characters in CommonMark:
! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
Backslash before a non-punctuation character is a literal backslash followed by that character.
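The escapable set listed above is the same 32 characters as Python's string.punctuation, which makes a blunt "escape everything" helper short to write. This is a sketch with an illustrative function name; it over-escapes on purpose, which is safe for rendering but makes the source noisier than hand-written Markdown.

```python
import re
import string

# string.punctuation is exactly the ASCII punctuation set listed above.
_PUNCTUATION = re.compile("([" + re.escape(string.punctuation) + "])")


def escape_markdown(text: str) -> str:
    """Backslash-escape every ASCII punctuation character in text."""
    return _PUNCTUATION.sub(r"\\\1", text)


if __name__ == "__main__":
    print(escape_markdown("*not italic*"))
    print(escape_markdown("# not a heading"))
    print(escape_markdown("snake_case_name"))
```

A more selective escaper would only touch characters that are significant in their position, but that requires knowing the surrounding context.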
HTML Entities
HTML entities work in all CommonMark-compatible renderers:
&amp;amp; → &amp;
&amp;lt; → <
&amp;gt; → >
&amp;quot; → "
&amp;copy; → ©
&amp;#169; → © (decimal)
&amp;#xA9; → © (hex)
Entities are decoded exactly once, so &amp;amp;amp; in the source displays as the literal text &amp;amp; rather than as a bare &amp;.
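The single decoding pass is easy to see with Python's standard library. This is only an illustration: html.unescape follows HTML5 rules, which are close to but not identical to CommonMark's (for instance, it accepts some legacy names without a trailing semicolon), and the wrapper function name is made up for this example.

```python
import html


def decode_entities(text: str) -> str:
    """Decode named, decimal, and hex character references once."""
    return html.unescape(text)


if __name__ == "__main__":
    for source in ("&lt;", "&copy;", "&#169;", "&#xA9;", "&amp;amp;"):
        print(f"{source!r} -> {decode_entities(source)!r}")
```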
HTML in Markdown
CommonMark allows raw HTML, and most renderers pass it through unchanged. This is intentional - Gruber's original design allowed falling back to HTML for constructs Markdown cannot express.
<div class="callout">
This is a raw HTML block.
</div>
Here is <span style="color:red">inline HTML</span> inside a paragraph.
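Block-level HTML (a tag such as <div> at the start of a line, indented by at most three spaces) creates an HTML block. For most block-level tags the block ends at the next blank line; <pre>, <script>, <style>, and HTML comments instead run until their closing tag or delimiter. Inline HTML is parsed within paragraph text.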
What CommonMark does not specify is sanitization. GitHub strips or escapes <script>, <iframe>, <object>, <embed>, and event attributes like onload. This is a security decision, not a spec decision. If you render user-supplied Markdown, you must sanitize the output regardless of what the parser passes through.
For a static site where you control the content, raw HTML in Markdown is safe and useful for complex layouts that Markdown cannot express.
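To give a feel for the tag filtering mentioned above, here is a Python sketch of the "Disallowed Raw HTML" idea from the GFM spec: the opening < of a small set of tags is replaced with an entity so the tag renders as text. The tag list follows the GFM spec as I understand it, the regex is an approximation, and this is emphatically not a sanitizer: it does nothing about event attributes such as onload or about javascript: URLs. For user-supplied content, use a maintained HTML sanitizer on the rendered output.

```python
import re

# Tag names listed by the GFM "Disallowed Raw HTML" extension.
_DISALLOWED = (
    "title", "textarea", "style", "xmp", "iframe",
    "noembed", "noframes", "script", "plaintext",
)

# Match "<" only when it begins an opening or closing disallowed tag.
_TAGFILTER = re.compile(
    r"<(?=/?(?:%s)(?:[\s/>]|$))" % "|".join(_DISALLOWED),
    re.IGNORECASE,
)


def tagfilter(rendered_html: str) -> str:
    """Neutralise disallowed tags by escaping their leading '<'."""
    return _TAGFILTER.sub("&lt;", rendered_html)


if __name__ == "__main__":
    print(tagfilter("<div><script>alert(1)</script></div>"))
```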
Platform Differences
The same Markdown source can render differently depending on the platform:
| Feature | GitHub | GitLab | Notion | Obsidian |
|---|---|---|---|---|
| CommonMark base | yes | yes | no | yes |
| GFM tables | yes | yes | yes | yes |
| Task lists | yes | yes | yes | yes |
| Strikethrough (~~) | yes | yes | yes | yes |
| Footnotes | yes | yes | no | yes |
| Math ($...$) | yes | yes | yes | yes |
| Bare URL autolinks | yes | yes | yes | yes |
| Heading anchor IDs | yes | yes | no | yes |
| Mermaid diagrams | yes | yes | no | plugin |
| Custom heading IDs | no | no | no | no |
| HTML blocks | sanitized | sanitized | stripped | allowed |
Key things that break across platforms:
- Notion does not implement CommonMark - it has its own Markdown-like parser with different rules for nested lists and code blocks
- Obsidian uses [[wikilinks]], which are not in any standard spec
- GitLab uses ::: for callout blocks (not GFM)
- GitHub recently added footnotes ([^1]) but many other platforms do not support them
- Math rendering varies: GitHub uses $...$ and $$...$$, but the delimiters conflict with currency symbols in non-math text
If your content needs to work across multiple platforms, stick to the CommonMark core plus GFM tables and task lists. Avoid platform-specific extensions unless you control the rendering environment.
Linting and Tooling
For consistency in team projects, use a Markdown linter:
- markdownlint (Node.js, also available as VS Code extension) - enforces style rules: heading increment, list indentation, line length, blank lines around headings
- remark - pluggable Markdown processor with lint plugins and formatters
- Prettier - formats Markdown consistently (normalises list markers, heading style, blank lines)
CI integration example with markdownlint:
npx markdownlint-cli "**/*.md" --ignore node_modules
Configure rules in .markdownlint.json to match your style guide. Enforcing consistent style prevents the subtle rendering differences that come from mixed list markers, inconsistent indentation, or missing blank lines around headings.