A tool to generate AI-optimized llms.txt files for websites to help AI models better understand content structure and context.
1. Enter website URL to begin
2. Crawling in progress
3. Review crawled URLs
4. AI generation in progress
5. llms.txt ready to download
Simple 5-step process to generate AI-friendly documentation for your website. Just provide a URL and let the tool do the work!
- AI-Powered Descriptions: Automatically create concise, informative descriptions for each page using GPT-4o mini, chosen for its cost-efficiency and effectiveness for this specific task
- Advanced Web Crawling: Built on Puppeteer and JSDOM for reliable content extraction and structured data processing
- Content Review: Review and edit crawled links before generating descriptions
- Robots.txt Compliant: Respects website crawling preferences
- Test Mode: Option to limit crawl to 5 URLs for faster testing
- URL Validation: Ensures only valid and accessible URLs are processed
- Smart Categorization: Pages are categorized based on domain structure
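To illustrate the robots.txt compliance mentioned above, here is a minimal sketch of an allow-check in TypeScript. It is illustrative only, covers just the common `User-agent`/`Disallow` directives rather than the full spec, and is not the tool's actual implementation:

```typescript
// Hypothetical robots.txt check; handles only User-agent and Disallow
// directives. The tool's real crawler may implement more of the spec.
function isPathAllowed(robotsTxt: string, path: string, agent = "*"): boolean {
  let applies = false;          // are we inside a matching User-agent group?
  const disallowed: string[] = [];
  for (const raw of robotsTxt.split("\n")) {
    const line = raw.split("#")[0].trim();   // strip comments
    const [key, ...rest] = line.split(":");
    const value = rest.join(":").trim();
    if (/^user-agent$/i.test(key.trim())) {
      applies = value === "*" || value.toLowerCase() === agent.toLowerCase();
    } else if (applies && /^disallow$/i.test(key.trim()) && value) {
      disallowed.push(value);
    }
  }
  // A path is allowed if no applicable Disallow rule prefixes it.
  return !disallowed.some((rule) => path.startsWith(rule));
}
```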
The llms.txt file is an open standard that helps AI models better understand and interact with your website's content. Similar to how robots.txt guides search engines, llms.txt provides a curated index of your most important pages. Visit llmstxt.org for the official standard documentation.
It solves common AI crawling challenges by:
- Creating a clear content hierarchy
- Ensuring consistent discovery across subdomains
- Providing structured signals for valuable content
- Defining AI interaction preferences
- Predictable AI Discovery - Find and prioritize key content
- Structured Content Signals - Clear training indicators
- Enhanced Content Understanding - Better documentation comprehension
- Consistent AI Interactions - Reliable interpretation
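For concreteness, a generated llms.txt is a markdown file along these lines, following the format described at llmstxt.org (the site name, URLs, and descriptions here are placeholders):

```markdown
# Example Site

> A short, AI-oriented summary of what the site offers.

## Docs

- [Getting Started](https://example.com/docs/start): How to install and configure the product
- [API Reference](https://example.com/docs/api): Endpoints, parameters, and response formats

## Optional

- [Changelog](https://example.com/changelog): Release history
```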
- Input your website URL
- The tool crawls your content using Puppeteer and JSDOM
- Content structure is analyzed
- AI generates page descriptions
- AI creates an overall site description
- Download the generated llms.txt file
- Add to your website's root directory (like robots.txt)
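The final assembly step above can be sketched as a pure function that groups crawled pages by category and emits the markdown-style llms.txt. The interface and function names are illustrative assumptions, not the tool's actual API:

```typescript
// Illustrative assembly of an llms.txt file; field names are assumptions.
interface CrawledPage {
  title: string;
  url: string;
  description: string; // produced by the AI description step
  category: string;    // derived from the URL/domain structure
}

function buildLlmsTxt(siteName: string, summary: string, pages: CrawledPage[]): string {
  // Group pages into sections by category, preserving first-seen order.
  const byCategory = new Map<string, CrawledPage[]>();
  for (const page of pages) {
    const group = byCategory.get(page.category) ?? [];
    group.push(page);
    byCategory.set(page.category, group);
  }
  const sections = [...byCategory.entries()].map(
    ([category, group]) =>
      `## ${category}\n\n` +
      group.map((p) => `- [${p.title}](${p.url}): ${p.description}`).join("\n")
  );
  return [`# ${siteName}`, `> ${summary}`, ...sections].join("\n\n") + "\n";
}
```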
Before you start: Add your OpenAI API key to the `.env.local` file to enable AI-powered content analysis. See Setup Instructions.
- Next.js 14 - React framework for frontend and API routes
- React 18 - Component-based UI library
- TypeScript - For type safety and better developer experience
- Tailwind CSS - Utility-first styling
- OpenAI API - Using GPT-4o mini for AI-powered descriptions
- Puppeteer - Headless browser for reliable web crawling
- JSDOM - DOM parsing outside the browser
- React Hook Form & Zod - Form handling and validation
- Radix UI - Accessible UI component primitives
- Server-Sent Events - For real-time crawl progress updates
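Server-Sent Events deliver progress as plain-text frames over a single HTTP response, where each message is one or more `field: value` lines terminated by a blank line. A minimal framing helper might look like this (the event name and payload shape are assumptions, not the tool's actual protocol):

```typescript
// Illustrative SSE framing for crawl-progress updates; payload fields
// are assumptions for the sketch.
interface CrawlProgress {
  crawled: number;
  total: number;
  currentUrl: string;
}

// An SSE message is "field: value" lines followed by a blank line.
function toSseMessage(event: string, data: CrawlProgress): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

On the client, an `EventSource` subscribed to the crawl endpoint would receive these frames and update the progress UI.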
- Node.js 18+
- An OpenAI API key
- Clone the repository

```bash
git clone https://github.com/rdyplayerB/ai-llmstxt-generator.git
cd ai-llmstxt-generator
```

- Install dependencies

```bash
npm install
```

- Create a `.env.local` file in the root directory with your OpenAI API key:

```bash
OPENAI_API_KEY=your_api_key_here
```

- Start the development server

```bash
npm run dev
```

- Open http://localhost:3000 in your browser
This project can be deployed on any platform that supports Next.js applications:
- Push your code to GitHub
- Deploy to your preferred hosting service (Vercel, Netlify, or your own server)
- Add your environment variables (OPENAI_API_KEY) to your hosting configuration
- Deploy!
MIT License © 2024 rdyplayerB
Contributions are welcome! Please feel free to submit a Pull Request.
- This tool implements the official llms.txt standard
- llmstxthub.com is a great resource directory for AI-ready documentation and tools implementing the llms.txt standard
- Powered by Puppeteer for headless browser crawling and JSDOM for DOM parsing
- Thanks to the AI community for supporting and adopting this standard
- Built with positive vibes by @rdyplayerB 🤙