A powerful JavaScript library and CLI tool for parsing and manipulating markdown files as tree structures. Built on top of the battle-tested remark/unified ecosystem.
- Tree-based parsing - Treats markdown as manipulable Abstract Syntax Trees (AST)
- Section extraction - Extract specific sections with automatic boundary detection
- Powerful search - CSS-like selectors and custom search functions
- Batch processing - Process multiple sections at once
- CLI & Library - Use as a command-line tool or JavaScript library
- Document analysis - Get statistics and generate table of contents
- TypeScript ready - Full type definitions included
# Using npm
npm install -g @kayvan/markdown-tree-parser
# Using pnpm (may require approval for build scripts)
pnpm install -g @kayvan/markdown-tree-parser
pnpm approve-builds -g # If prompted
# Using yarn
yarn global add @kayvan/markdown-tree-parser
# Local install, for use as a library
npm install @kayvan/markdown-tree-parser
After global installation, use the md-tree command:
md-tree list README.md
md-tree list README.md --format json
# Extract one section
md-tree extract README.md "Installation"
# Extract to a file
md-tree extract README.md "Installation" --output ./sections
# Extract all level-2 sections
md-tree extract-all README.md 2
# Extract to separate files
md-tree extract-all README.md 2 --output ./sections
md-tree tree README.md
# Find all level-2 headings
md-tree search README.md "heading[depth=2]"
# Find all links
md-tree search README.md "link"
md-tree stats README.md
md-tree toc README.md --max-level 3
md-tree help
import { MarkdownTreeParser } from '@kayvan/markdown-tree-parser';
const parser = new MarkdownTreeParser();
// Parse markdown into AST
const markdown = `
# My Document
Some content here.
## Section 1
Content for section 1.
## Section 2
Content for section 2.
`;
const tree = await parser.parse(markdown);
// Extract a specific section
const section = parser.extractSection(tree, 'Section 1');
const sectionMarkdown = await parser.stringify(section);
console.log(sectionMarkdown);
// Output:
// ## Section 1
// Content for section 1.
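To make "automatic boundary detection" concrete, here is a minimal sketch of one plausible rule: a section starts at the matching heading and ends just before the next heading of the same or a shallower depth. This is an illustration over a hand-built mdast-like tree, not the library's implementation; `textOf` and `sliceSection` are hypothetical helpers, and the exact-match comparison on heading text is an assumption.

```javascript
// Minimal mdast-like tree built by hand; a real tree would come from parser.parse().
const tree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'My Document' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'Some content here.' }] },
    { type: 'heading', depth: 2, children: [{ type: 'text', value: 'Section 1' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'Content for section 1.' }] },
    { type: 'heading', depth: 2, children: [{ type: 'text', value: 'Section 2' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'Content for section 2.' }] },
  ],
};

// Hypothetical helper: concatenate the text values found under a node.
function textOf(node) {
  if (typeof node.value === 'string') return node.value;
  return (node.children || []).map(textOf).join('');
}

// Hypothetical helper: slice out the section that starts at the matching heading.
function sliceSection(root, headingText) {
  const nodes = root.children;
  const start = nodes.findIndex(
    (n) => n.type === 'heading' && textOf(n) === headingText
  );
  if (start === -1) return null;

  const depth = nodes[start].depth;
  // The section ends before the next heading of the same or a shallower depth.
  let end = nodes.findIndex(
    (n, i) => i > start && n.type === 'heading' && n.depth <= depth
  );
  if (end === -1) end = nodes.length;

  return { type: 'root', children: nodes.slice(start, end) };
}

const section = sliceSection(tree, 'Section 1');
console.log(section.children.map(textOf));
// → [ 'Section 1', 'Content for section 1.' ]
```

Under this rule, extracting a level-1 heading would also pull in every nested level-2 section beneath it, which is usually what one wants when splitting a document.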
import { MarkdownTreeParser, createParser, extractSection } from '@kayvan/markdown-tree-parser';
// Create parser with custom options
const parser = createParser({
  bullet: '-', // Use '-' for lists
  emphasis: '_', // Use '_' for emphasis
  strong: '__' // Use '__' for strong
});
// Extract all sections at level 2
const tree = await parser.parse(markdown);
const sections = parser.extractAllSections(tree, 2);
// for...of awaits each section in order; forEach does not wait for async callbacks
for (const [index, section] of sections.entries()) {
  const heading = parser.getHeadingText(section.heading);
  const content = await parser.stringify(section.tree);
  console.log(`Section ${index + 1}: ${heading}`);
  console.log(content);
}
// Use convenience functions
const sectionMarkdown = await extractSection(markdown, 'Installation');
// CSS-like selectors
const headings = parser.selectAll(tree, 'heading[depth=2]');
const links = parser.selectAll(tree, 'link');
const codeBlocks = parser.selectAll(tree, 'code');
// Custom search
const customNode = parser.findNode(tree, (node) => {
  return node.type === 'heading' &&
    parser.getHeadingText(node).includes('API');
});
// Transform content
parser.transform(tree, (node) => {
  if (node.type === 'heading' && node.depth === 1) {
    node.depth = 2; // Convert h1 to h2
  }
});
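Converting only h1 to h2 leaves the former h1 at the same level as its old subsections. A common variation is to demote every heading by one level so the hierarchy is preserved. The sketch below shows the idea on a hand-built tree with a hypothetical `walk` helper. It assumes a visitor that sees every node and may mutate it in place, as the transform example above does, and it does not reproduce the library's own traversal.

```javascript
// Hypothetical helper: call visitor on every node, depth-first, parents before children.
function walk(node, visitor) {
  visitor(node);
  for (const child of node.children || []) {
    walk(child, visitor);
  }
}

// Hand-built mdast-like tree used only for this illustration.
const doc = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Title' }] },
    { type: 'heading', depth: 2, children: [{ type: 'text', value: 'Details' }] },
  ],
};

// Demote every heading by one level, capping at depth 6 (the deepest markdown heading).
walk(doc, (node) => {
  if (node.type === 'heading') {
    node.depth = Math.min(node.depth + 1, 6);
  }
});

console.log(doc.children.map((n) => n.depth)); // → [ 2, 3 ]
```

The same visitor body could be passed to `parser.transform(tree, ...)` if the library's visitor behaves as assumed here.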
// Get document statistics
const stats = parser.getStats(tree);
console.log(`Document has ${stats.wordCount} words and ${stats.headings.total} headings`);
// Generate table of contents
const toc = parser.generateTableOfContents(tree, 3);
console.log(toc);
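As a rough picture of what table-of-contents generation with a maximum level involves, the sketch below filters heading entries by depth and indents each one by its level. The entry shape (`depth`, `text`), the `slugify` and `buildToc` helpers, and the anchor format are all assumptions made for illustration; the output of `generateTableOfContents` may be formatted differently.

```javascript
// Heading entries as plain data; the field names (depth, text) are assumptions.
const headings = [
  { depth: 1, text: 'My Document' },
  { depth: 2, text: 'Section 1' },
  { depth: 3, text: 'Details' },
  { depth: 4, text: 'Fine Print' },
  { depth: 2, text: 'Section 2' },
];

// Hypothetical helper: lowercase, drop punctuation, turn whitespace runs into hyphens.
function slugify(text) {
  return text
    .toLowerCase()
    .trim()
    .replace(/[^\w\s-]/g, '')
    .replace(/\s+/g, '-');
}

// Hypothetical helper: one indented list item per heading at or above maxLevel.
function buildToc(entries, maxLevel) {
  return entries
    .filter((h) => h.depth <= maxLevel)
    .map((h) => `${'  '.repeat(h.depth - 1)}- [${h.text}](#${slugify(h.text)})`)
    .join('\n');
}

console.log(buildToc(headings, 3));
// - [My Document](#my-document)
//   - [Section 1](#section-1)
//     - [Details](#details)
//   - [Section 2](#section-2)
```

With a maximum level of 3, the level-4 "Fine Print" heading is dropped while the rest keep their nesting.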
import fs from 'fs/promises';
// Read and process a file
const content = await fs.readFile('README.md', 'utf-8');
const tree = await parser.parse(content);
// Extract all sections and save to files
const sections = parser.extractAllSections(tree, 2);
for (let i = 0; i < sections.length; i++) {
  const section = sections[i];
  const filename = `section-${i + 1}.md`;
  const markdown = await parser.stringify(section.tree);
  await fs.writeFile(filename, markdown);
}
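Numbered names such as `section-1.md` say nothing about their content. One option is to derive the filename from the heading text, which `parser.getHeadingText(section.heading)` returns in the earlier example. The sketch below is a standalone illustration; `sectionFilename` is a hypothetical helper, not part of the library, and the naming scheme is just one reasonable choice.

```javascript
// Hypothetical helper: build a sortable, filesystem-friendly name from a heading.
function sectionFilename(headingText, index) {
  const slug = headingText
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse each non-alphanumeric run into one hyphen
    .replace(/^-+|-+$/g, ''); // trim leading and trailing hyphens
  const prefix = String(index + 1).padStart(2, '0'); // keeps files in document order
  // Fall back to a generic name when the heading has no usable characters.
  return slug ? `${prefix}-${slug}.md` : `${prefix}-section.md`;
}

console.log(sectionFilename('Installation', 0)); // → 01-installation.md
console.log(sectionFilename('CLI & Library', 1)); // → 02-cli-library.md
console.log(sectionFilename('???', 2)); // → 03-section.md
```

The numeric prefix preserves the original section order in directory listings and avoids collisions when two headings share the same text.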
- Documentation Management - Split large docs into manageable sections
- Static Site Generation - Process markdown for blogs and websites
- Content Organization - Restructure and reorganize markdown content
- Content Analysis - Analyze document structure and extract insights
- Documentation Tools - Build custom documentation processing tools
- Content Migration - Extract and transform content between formats
new MarkdownTreeParser(options = {})
- parse(markdown) - Parse markdown into AST
- stringify(tree) - Convert AST back to markdown
- extractSection(tree, headingText, level?) - Extract specific section
- extractAllSections(tree, level) - Extract all sections at level
- select(tree, selector) - Find first node matching CSS selector
- selectAll(tree, selector) - Find all nodes matching CSS selector
- findNode(tree, condition) - Find node with custom condition
- getHeadingText(headingNode) - Get text content of heading
- getHeadingsList(tree) - Get all headings with metadata
- getStats(tree) - Get document statistics
- generateTableOfContents(tree, maxLevel) - Generate TOC
- transform(tree, visitor) - Transform tree with visitor function

Convenience functions:
- createParser(options) - Create new parser instance
- extractSection(markdown, sectionName, options) - Quick section extraction
- getHeadings(markdown, options) - Quick heading extraction
- generateTOC(markdown, maxLevel, options) - Quick TOC generation
The library supports powerful CSS-like selectors for searching:
// Element selectors
parser.selectAll(tree, 'heading') // All headings
parser.selectAll(tree, 'paragraph') // All paragraphs
parser.selectAll(tree, 'link') // All links
// Attribute selectors
parser.selectAll(tree, 'heading[depth=1]') // H1 headings
parser.selectAll(tree, 'heading[depth=2]') // H2 headings
parser.selectAll(tree, 'link[url*="github"]') // Links containing "github"
// Pseudo selectors
parser.selectAll(tree, ':first-child') // First child elements
parser.selectAll(tree, ':last-child') // Last child elements
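To show what the element and attribute selectors above mean in terms of node fields, here is a small standalone matcher. It handles only three forms (`type`, `type[attr=value]`, `type[attr*="value"]`) and is not how the library resolves selectors. `compileSelector` and `selectAllNodes` are hypothetical helpers, and the sample tree is hand-built.

```javascript
// Hypothetical helper: turn a simple selector into a predicate over nodes.
function compileSelector(selector) {
  const match = /^(\w+)(?:\[(\w+)(\*?=)"?([^"\]]*)"?\])?$/.exec(selector);
  if (!match) throw new Error(`Unsupported selector: ${selector}`);
  const [, type, attr, op, value] = match;
  return (node) => {
    if (node.type !== type) return false; // element selector: compare node.type
    if (!attr) return true;
    if (!(attr in node)) return false;
    const actual = String(node[attr]); // attribute selector: compare a node field
    return op === '*=' ? actual.includes(value) : actual === value;
  };
}

// Hypothetical helper: collect every matching node, depth-first.
function selectAllNodes(root, selector) {
  const matches = compileSelector(selector);
  const found = [];
  (function visit(node) {
    if (matches(node)) found.push(node);
    (node.children || []).forEach((child) => visit(child));
  })(root);
  return found;
}

const sample = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [] },
    { type: 'heading', depth: 2, children: [] },
    {
      type: 'paragraph',
      children: [
        { type: 'link', url: 'https://github.com/ksylvan/markdown-tree-parser', children: [] },
        { type: 'link', url: 'https://example.com', children: [] },
      ],
    },
  ],
};

console.log(selectAllNodes(sample, 'heading').length); // → 2
console.log(selectAllNodes(sample, 'heading[depth=2]').length); // → 1
console.log(selectAllNodes(sample, 'link[url*="github"]').length); // → 1
```

In other words, `heading[depth=2]` reads as "nodes whose `type` is `heading` and whose `depth` field equals 2", and `*=` is a substring test on the field's value.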
# Run tests
npm test
# Test CLI
npm run test:cli
# Run examples
npm run example
- Node.js 18+
- npm
# Clone the repository
git clone https://github.com/ksylvan/markdown-tree-parser.git
cd markdown-tree-parser
# Install dependencies
npm install
# Run tests
npm test
# Run linting
npm run lint
# Format code
npm run format
# Test CLI functionality
npm run test:cli
This project uses GitHub Actions for continuous integration. The workflow automatically:
- Tests against Node.js versions 18.x, 20.x, and 22.x
- Runs linting with ESLint
- Executes the full test suite
- Tests CLI functionality
- Verifies the package can be published
The CI badge in the README shows the current build status and links to the Actions page.
Contributions are welcome! Please read our Contributing Guide for details.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Built on top of the excellent unified ecosystem:
- remark - Markdown processing
- mdast - Markdown AST specification
- unist - Universal syntax tree utilities
- Documentation
- Issue Tracker
- Discussions
Made with ❤️ by Kayvan Sylvan