NotionRepoSync provides a CLI tool that imports markdown files from a given repository into Notion pages while preserving the links between the imported documents. So if `a.md` links to `b.md`, clicking that link while browsing the imported `a.md` in Notion sends you to the imported `b.md` Notion page.
Notion is a great tool for a company knowledge database, but it comes with a few tradeoffs that make it a poor fit for technical/reference documentation. In particular, it's very common to keep the reference documentation of a given project versioned next to the code, as this makes it possible to ship documentation changes along with code changes and to review the whole thing in a single PR.
> Why would we want to bring reference documentation into Notion then?
Because it allows users to search through the internal company knowledge database and the reference documentation at the same time. It gives users a single entry point to company knowledge and removes the need to know where something is documented in the first place.
Notion supports importing markdown, and there are tools to batch import markdown files, but they come with a big caveat: they do not handle internal links between the imported pages.
NotionRepoSync aims to address that particular problem and provides a CLI you can drop into your CI to automatically synchronize content in Notion. Because it keeps an index of the existing pages, page IDs stay stable across updates, so no links are broken and folks can safely bookmark pages.
Without going into details, the general flow is the following:
- The CLI is called with the ID of a Notion page that will hold the imported pages in a Notion database.
- It walks the provided root folder containing the markdown files.
- It updates the pages database in Notion, creating missing pages if needed. The original path is stored in the row properties.
- For each markdown file, it converts the Markdown AST into Notion blocks and feeds them to the API.
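The first three steps of that flow can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `find_markdown_files`, `sync_index`, and `fake_create_page` are hypothetical names, the index is a plain in-memory dict, and the Notion API call that creates a database row is replaced by a stand-in that returns a random ID.

```python
import os
import tempfile
import uuid


def find_markdown_files(root):
    """Return the paths of all .md files under root, relative to root, sorted."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".md"):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def sync_index(index, paths, create_page):
    """Make sure every path has a page ID, reusing the IDs already in index.

    Existing entries are never replaced, which is what keeps page IDs (and
    therefore links and bookmarks) stable across runs.
    """
    for path in paths:
        if path not in index:
            index[path] = create_page(path)
    return index


def fake_create_page(path):
    # Hypothetical stand-in for "create a row in the Notion pages database".
    return uuid.uuid4().hex


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "docs"))
        for rel in ("a.md", "docs/b.md", "main.go"):
            with open(os.path.join(root, *rel.split("/")), "w") as fh:
                fh.write("placeholder\n")
        index = sync_index({}, find_markdown_files(root), fake_create_page)
        print(sorted(index))
```

Running `sync_index` a second time with the same index only creates pages for paths it has not seen before, which is the property that makes the sync safe to run repeatedly from CI.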
- Creating or synchronizing the pages database.
- Parsing most of the common markdown blocks.
- Rendering and resolving links to corresponding Notion pages.
  - Links to anything that isn't a markdown file instead resolve to a code view.
- Usable as a library, for when you simply want to maintain specific content instead of integrating it with your CI.
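To make the link handling more concrete, here is a small sketch of how a relative link could be resolved against the index of imported pages. Everything in it is an assumption made for illustration: the `resolve_link` helper, the `code_view_base` parameter, and the `https://www.notion.so/<page id>` URL form are not taken from the tool itself. Fragments are dropped, since anchors are not handled yet.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_link(source_path, href, index, code_view_base):
    """Rewrite one link found in source_path, a repo-relative markdown file.

    index maps repo-relative markdown paths to Notion page IDs.
    """
    parts = urlsplit(href)
    if parts.scheme or parts.netloc or not parts.path:
        # External URL, mailto:, or a pure "#anchor": leave untouched.
        return href

    if parts.path.startswith("/"):
        # Treat a leading slash as "relative to the repository root".
        target = posixpath.normpath(parts.path.lstrip("/"))
    else:
        base_dir = posixpath.dirname(source_path)
        target = posixpath.normpath(posixpath.join(base_dir, parts.path))

    if target in index:
        # Imported markdown file: point at its Notion page.
        return "https://www.notion.so/" + index[target]

    # Anything else (source files, images, ...) goes to the code view.
    return code_view_base.rstrip("/") + "/" + target


index = {"docs/a.md": "aaa111", "docs/b.md": "bbb222"}
base = "https://example.com/repo/blob/main"
print(resolve_link("docs/a.md", "b.md", index, base))
print(resolve_link("docs/a.md", "../cmd/main.go", index, base))
```

The first call resolves to the Notion page for `docs/b.md`, while the second falls back to the code view because `cmd/main.go` is not a markdown file in the index.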
- Handling anchors
  - This requires post-processing, because anchors are unpredictable in Notion: they point to block IDs, which cannot be known before the blocks are created.
- Locking pages automatically
  - We don't want users to think they can edit the content in place, as it'll be overwritten on the next sync.
- Updating blocks instead of appending
  - A naive implementation would be to wipe all the existing blocks and start over.
  - A better implementation would be to check if the next block matches what we expect, and just skip it if that's the case. Otherwise, delete it.
  - Even better might be to avoid deleting blocks as much as possible.
- If the document starts with a single `h1` and there are no other `h1` in the content, then we can safely assume that this is the page title.
  - We can remove that `h1`, use it as the title, and turn all `h2` (and below) into `h1` (or `hN-1`) to compensate for the fact that Notion only has three heading levels, as opposed to markdown, which has six.
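The title rule from the last item could look roughly like the sketch below. It is only an illustration of the idea, working on raw markdown text rather than on an AST: `extract_title` is a hypothetical helper, it only understands ATX headings (`#`, `##`, ...), and it skips fenced code blocks in a naive way.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
FENCE = re.compile(r"^\s*(```|~~~)")


def extract_title(markdown):
    """Return (title, body); title is None when the rule does not apply."""
    lines = markdown.splitlines()
    headings = []  # (line index, level, text), ignoring fenced code
    in_fence = False
    for i, line in enumerate(lines):
        if FENCE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            headings.append((i, len(match.group(1)), match.group(2)))

    first_content = next((i for i, l in enumerate(lines) if l.strip()), None)
    h1s = [h for h in headings if h[1] == 1]
    # The rule applies only if the single h1 is the first non-blank line.
    if len(h1s) != 1 or h1s[0][0] != first_content:
        return None, markdown

    title = h1s[0][2]
    # Every remaining heading is h2 or deeper: promote it by one level.
    for i, level, text in headings[1:]:
        lines[i] = "#" * (level - 1) + " " + text
    body = "\n".join(lines[first_content + 1:]).lstrip("\n")
    return title, body


title, body = extract_title("# Title\n\nIntro\n\n## Setup\n\n### Details\n")
print(title)
print(body)
```

When the document has several `h1` headings, or does not start with one, the function returns the text unchanged so the caller can fall back to another source for the title, such as the file name.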
WIP
- Create a page inside the database of synchronized doc repos.
- Fill the `Repository` page property.
- Click the share button and add the "GitHubDocSync" integration.
Once we're past the page creation step in Notion, we're basically turning the Markdown AST into Notion blocks. It's rather simple conceptually, but we have to keep the differences in mind. Whereas with HTML everything is basically a tag, with Notion it's not that simple:
- We deal in Blocks and lists of RichText.
  - A paragraph is a Block composed of a list of RichText elements.
  - `foo _bar_ baz` is just one block, with three RichText elements.
- Notion doesn't have the concept of a "List Block"; instead we just have "ListItem" Blocks. The markdown parser we use, however, does have a list node.
  - So when we walk the AST, we cannot rely on the list node to know if we're about to create a list or exit from one. Instead, we rely on knowing if the list item is the first or last one.
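The sketch below illustrates both differences on a tiny, made-up AST. The node shapes (`document`, `paragraph`, `list`, `list_item`) and the `to_blocks` helper are hypothetical and much simpler than what a real markdown parser produces; the output dictionaries follow the block shape of the public Notion API as best understood here (`paragraph`, `bulleted_list_item`, `numbered_list_item`, each carrying a `rich_text` list). It uses a recursive flattening rather than the first/last item tracking described above, so it shows the shape of the result, not the walker itself.

```python
def rich_text(inline):
    """Turn one inline AST node into a Notion rich text object."""
    item = {"type": "text", "text": {"content": inline["text"]}}
    if inline.get("emphasis"):
        item["annotations"] = {"italic": True}
    return item


def block(block_type, inlines):
    """Build a Notion block whose content is a list of rich text objects."""
    return {
        "object": "block",
        "type": block_type,
        block_type: {"rich_text": [rich_text(i) for i in inlines]},
    }


def to_blocks(node, ordered=False):
    """Flatten a (hypothetical) markdown AST node into Notion blocks."""
    kind = node["type"]
    if kind == "document":
        return [b for child in node["children"] for b in to_blocks(child)]
    if kind == "paragraph":
        return [block("paragraph", node["children"])]
    if kind == "list":
        # The list node itself produces no block; only its items do.
        return [b for item in node["children"]
                for b in to_blocks(item, node["ordered"])]
    if kind == "list_item":
        item_type = "numbered_list_item" if ordered else "bulleted_list_item"
        return [block(item_type, node["children"])]
    raise ValueError("unsupported node type: " + kind)


# AST for the paragraph "foo _bar_ baz" followed by a two-item bullet list.
DOC = {"type": "document", "children": [
    {"type": "paragraph", "children": [
        {"text": "foo "}, {"text": "bar", "emphasis": True}, {"text": " baz"}]},
    {"type": "list", "ordered": False, "children": [
        {"type": "list_item", "children": [{"text": "first"}]},
        {"type": "list_item", "children": [{"text": "second"}]}]},
]}

for b in to_blocks(DOC):
    print(b["type"], len(b[b["type"]]["rich_text"]))
```

The paragraph becomes a single block holding three rich text elements, and the list disappears as a container, leaving two sibling list item blocks.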