Notion to Docusaurus Migration

This guide documents how to migrate documentation from a Notion export to Docusaurus, including the scripts used to automate folder reorganization, frontmatter injection, and MDX syntax fixes.

Overview

The migration process involves four main steps:

  1. Export from Notion: Export pages as Markdown & CSV.
  2. Reorganize Structure: Map flat Notion export to Docusaurus folder hierarchy.
  3. Add Frontmatter: Inject Docusaurus metadata (ID, title, sidebar position).
  4. Fix MDX Syntax: Resolve compatibility issues (angle brackets, table formatting).

Step 1: Notion Export

Export your Notion workspace or page with the following settings:

  • Export format: Markdown & CSV
  • Include subpages: Yes
  • Create folders for subpages: Yes

Step 2: Reorganize Structure

Use the reorganize_wpcli_structure.py script to move files into a clean hierarchy and create _category_.json files.

Script: reorganize_wpcli_structure.py

This script defines a STRUCTURE list mapping folders to files and moves them accordingly.

#!/usr/bin/env python3
"""
Reorganize WP-CLI docs into hierarchical folder structure with _category_.json files
"""
import os
import json
import shutil

BASE_DIR = "/opt/docker-data/apps/docusaurus/site/docs/wordpress/wp-cli"

# Example Structure (truncated)
STRUCTURE = [
    {
        "folder": "01-introduction",
        "label": "1. Introduction",
        "position": 1,
        "files": ["what-is-wp-cli.md", "installation-set-up.md"]
    },
    # ... other folders
]

def reorganize_docs():
    for module in STRUCTURE:
        folder_path = os.path.join(BASE_DIR, module["folder"])
        os.makedirs(folder_path, exist_ok=True)

        # Create _category_.json
        category_data = {
            "label": module["label"],
            "position": module["position"],
            "link": {"type": "generated-index"}
        }

        with open(os.path.join(folder_path, "_category_.json"), 'w') as f:
            json.dump(category_data, f, indent=4)

        # Move files...
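
The move step is elided in the excerpt above. A minimal sketch of what it might look like follows, assuming the exported files sit directly under the base directory. `move_module_files` is a hypothetical helper name, not part of the original script, and it reports missing files instead of failing so that a partial export can still be processed.

```python
import shutil
from pathlib import Path


def move_module_files(base_dir, module):
    """Move each file listed in module["files"] from base_dir into the module folder.

    Hypothetical helper: assumes the exported files sit directly in base_dir.
    Returns the names of listed files that were not found.
    """
    base = Path(base_dir)
    target_dir = base / module["folder"]
    target_dir.mkdir(parents=True, exist_ok=True)

    missing = []
    for name in module["files"]:
        source = base / name
        if source.is_file():
            shutil.move(str(source), str(target_dir / name))
        else:
            # Record the gap rather than aborting the whole run
            missing.append(name)
    return missing
```

Called once per entry in `STRUCTURE`, the returned list gives a quick check for files that the mapping names but the export does not contain.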

Step 3: Add Frontmatter

Use add_wpcli_frontmatter.py to add necessary Docusaurus headers.

Script: add_wpcli_frontmatter.py

#!/usr/bin/env python3
"""
Add proper frontmatter to all WP-CLI documentation files
"""
import os

BASE_DIR = "/opt/docker-data/apps/docusaurus/site/docs/wordpress/wp-cli"

FRONTMATTER_MAP = {
    "01-introduction": [
        {"file": "what-is-wp-cli.md", "id": "what-is-wp-cli", "title": "What is WP-CLI?", "label": "What is WP-CLI?", "pos": 1},
        # ...
    ]
}

def create_frontmatter(id_val, title, label, position):
    return f"""---
id: {id_val}
title: {title}
sidebar_label: {label}
sidebar_position: {position}
---

"""

Step 4: Fix MDX Syntax Errors

Notion exports often contain characters that break Docusaurus MDX parsing, specifically:

  • Angle brackets <...> interpreted as HTML/JSX tags.
  • Pipe characters | inside table cells breaking table structure.

Use fix_mdx_universal.py to robustly escape these characters.

Script: fix_mdx_universal.py

import os

# Known valid HTML tags to ignore (rendering as HTML is fine)
HTML_TAGS = {
    'br', 'hr', 'img', 'a', 'b', 'code', 'table', 'tr', 'td', 'th', 'div', 'span',
    # ... complete list
}

def fix_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()

    lines = content.split('\n')
    new_lines = []
    in_code_block = False

    for line in lines:
        if line.strip().startswith('```'):
            in_code_block = not in_code_block
            new_lines.append(line)
            continue

        if in_code_block:
            new_lines.append(line)
            continue

        # Logic to escape <tags> and invalid pipes |
        # ... (Refer to full script in /home/rezriz/github/scripts/fix_mdx_universal.py)
        # Key fix: escapes <placeholder> to `<placeholder>`
        # Key fix: escapes | in tables to \|
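
The per-line escaping logic is not shown in the excerpt. The sketch below illustrates the two key fixes on a single line and is not the original script. The function names, the regular expressions, and the shortened tag set are all assumptions, and treating "invalid pipes" as pipes inside inline code within a table row is a guess at the intended rule.

```python
import re

# Illustrative subset only; the original script keeps a longer list
KNOWN_HTML_TAGS = {"br", "hr", "img", "a", "b", "code", "table", "tr", "td", "th", "div", "span"}

TAG_RE = re.compile(r"<(/?)([A-Za-z][\w-]*)([^<>]*)>")
INLINE_CODE_RE = re.compile(r"(`[^`]*`)")


def escape_placeholders(line):
    """Wrap non-HTML <placeholders> in backticks; skip inline code and known tags."""
    def wrap(match):
        if match.group(2).lower() in KNOWN_HTML_TAGS:
            return match.group(0)
        return f"`{match.group(0)}`"

    parts = INLINE_CODE_RE.split(line)
    # Odd indices hold the inline-code spans captured by the split
    return "".join(p if i % 2 else TAG_RE.sub(wrap, p) for i, p in enumerate(parts))


def escape_pipes_in_code(line):
    """In a table row, escape | inside inline code so it does not split the cell."""
    if not line.lstrip().startswith("|"):
        return line
    parts = INLINE_CODE_RE.split(line)
    return "".join(
        re.sub(r"(?<!\\)\|", r"\\|", p) if i % 2 else p
        for i, p in enumerate(parts)
    )
```

Both functions leave already-escaped input unchanged, so running them over a file a second time produces no further edits.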

Execution Log Example

For a real migration execution record (source path, destination path, structure mapping, cleanup rules, and validation checks), see: