The UNICEF Technical Documentation RAG (Retrieval-Augmented Generation) MCP Server provides intelligent access to technical documentation through semantic search capabilities. This Model Context Protocol (MCP) server specializes in processing and retrieving information from the Children's Climate Risk Index (CCRI) Technical Documentation and related climate risk assessment materials.
It serves as the technical documentation backend for the UNICEF Geosphere project.
- Document Processing: Automatic parsing and indexing of technical documentation
- Vector Search: Semantic similarity-based document retrieval
- Context Extraction: Relevant passages for answering specific questions
- Climate Risk Methodologies: CCRI calculation approaches and algorithms
- Dataset Specifications: Detailed descriptions of hazard and exposure datasets
- Indicator Definitions: Technical definitions of risk indicators
- Data Sources: Documentation of the underlying data sources
- FastMCP: Model Context Protocol server framework
- Vector Database: Document embeddings and similarity search
- LlamaIndex: Document processing and RAG pipelines
- Sentence Transformers: Text embedding generation
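To illustrate what similarity-based retrieval means in practice, here is a minimal sketch that ranks indexed passages by cosine similarity to a query vector. It is not the server's implementation: the three-dimensional vectors, the `INDEX` contents, and the `top_k` helper are all hypothetical stand-ins for real embeddings and a real vector database.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; a real embedding model produces hundreds of dimensions.
INDEX = [
    ("Heatwave exposure dataset", [0.9, 0.1, 0.0]),
    ("Flood hazard methodology", [0.1, 0.9, 0.0]),
    ("Indicator normalization", [0.0, 0.2, 0.9]),
]

results = top_k([1.0, 0.0, 0.0], INDEX, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A production setup would delegate both the embedding step and the nearest-neighbour search to the libraries listed above rather than scanning a Python list.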
rag/
├── server.py # MCP server and tool definitions
├── handlers.py # RAG implementation and document processing
├── config.py # Configuration and settings management
├── schemas.py # Pydantic models and validation
├── constants.py # Application constants
├── config.yaml # Server configuration
├── logging_config.py # Logging setup
└── data/vector_index/ # Document storage and vector indices
process_ccri_doc.py # Document processing script
CCRI_2025_Technical_Documentation.md # CCRI Technical Documentation
- Source Documents: CCRI Technical Documentation (Markdown format)
- Vector Storage: Persistent vector database for document embeddings
- Processing Power: Sufficient resources for document embedding generation
The MCP server exposes specialized tools for technical documentation access:
Performs semantic search against the CCRI technical documentation to find relevant information.
Parameters:
- `question` (required): Natural language question about climate risk methodologies, datasets, or technical specifications

Returns: Dictionary containing:
- `data`: List of relevant document sections
- `input_arguments`: Input arguments for the tool
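The shape of the returned dictionary can be sketched as follows. The `data` and `input_arguments` keys come from the description above; everything else is an assumption made for illustration, including the `SearchResponse` and `build_response` names and the choice to represent each section as a plain string.

```python
from typing import Any, TypedDict


class SearchResponse(TypedDict):
    """Hypothetical type for the tool's return value."""

    data: list[str]                   # relevant document sections (element type assumed)
    input_arguments: dict[str, Any]   # the arguments the tool was called with


def build_response(question: str, sections: list[str]) -> SearchResponse:
    """Assemble a response, rejecting an empty question since it is required."""
    if not question.strip():
        raise ValueError("question is required")
    return {"data": sections, "input_arguments": {"question": question}}


response = build_response(
    "How is heatwave exposure calculated?",
    ["Section text returned by the search would appear here."],
)
print(response["input_arguments"])
```

Echoing the input arguments back alongside the results lets a caller match each response to the question that produced it.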
# Install dependencies using uv
uv sync
Before running the server, you must process the CCRI technical documentation:
# Process and index the CCRI documentation
uv run python process_ccri_doc.py
This step:
- Parses the CCRI Technical Documentation Markdown
- Splits content into searchable chunks
- Generates vector embeddings for each chunk
- Creates a persistent vector index
- Stores metadata for each document section
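The parsing and chunking steps above can be approximated with a short sketch that splits Markdown on heading lines and keeps the heading as metadata. This is a simplified stand-in, not the logic of `process_ccri_doc.py`: the `split_markdown` function, the `title`/`text` keys, and the sample text are hypothetical, and the embedding and index-persistence steps are omitted.

```python
import re

# Matches ATX-style Markdown headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown(text):
    """Split Markdown into chunks, one per heading, keeping the heading as metadata."""
    chunks = []
    title, lines = "Preamble", []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the previous chunk, if it has any content.
            body = "\n".join(lines).strip()
            if body:
                chunks.append({"title": title, "text": body})
            title, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    # Flush whatever follows the final heading.
    body = "\n".join(lines).strip()
    if body:
        chunks.append({"title": title, "text": body})
    return chunks


sample = "# Hazards\nHeatwaves and floods.\n\n## Exposure\nChild population counts.\n"
chunks = split_markdown(sample)
for chunk in chunks:
    print(chunk["title"], "->", chunk["text"])
```

The sketch does not handle `#` lines inside fenced code blocks or chunks that exceed an embedding model's input limit; a dedicated document-processing library covers such cases.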
`rag/config.yaml`:
server:
  host: "0.0.0.0" # Server bind address
  port: 8002 # Server port
  transport: "sse" # MCP transport protocol
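One way such settings might be validated after loading is sketched below. It assumes the YAML has already been parsed into a dictionary and does not reflect the actual contents of `config.py`; the `ServerConfig` class and `load_server_config` function are hypothetical, and only the three keys and values shown above are taken from the configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical container for the `server` section of the configuration."""

    host: str = "0.0.0.0"
    port: int = 8002
    transport: str = "sse"

    def __post_init__(self):
        # Reject ports outside the valid TCP range early, before binding.
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")


def load_server_config(raw: dict) -> ServerConfig:
    """Build a ServerConfig from an already-parsed configuration dictionary."""
    return ServerConfig(**raw.get("server", {}))


# Stands in for the result of parsing rag/config.yaml.
parsed = {"server": {"host": "0.0.0.0", "port": 8002, "transport": "sse"}}
config = load_server_config(parsed)
print(config)
```

Because the dataclass takes only named fields, an unrecognized key in the `server` section raises a `TypeError` instead of being silently ignored.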
# Development mode
mcp dev rag/server.py
# Production mode
uv run rag/server.py
# Run all tests
uv run pytest
# Run specific tests
uv run pytest tests/test_handlers.py -v
- Clone repository
- Install dependencies: `uv sync`
- Process documentation: `uv run python process_ccri_doc.py`
- Run tests: `uv run pytest`
- Start server: `mcp dev rag/server.py`
- Code Style: Follow PEP 8 and use type hints
- Testing: Add tests for new RAG functionality
- Documentation: Update tool descriptions and examples
- Document Preparation: Ensure documents are in Markdown format
- Processing Script: Update `process_ccri_doc.py` for new documents
- Metadata Schema: Extend metadata structure if needed
- Testing: Verify search functionality with new content
- Index Update: Regenerate vector index with new documents
This project is licensed under the MIT License. See the LICENSE file for details.
- Issues: Submit issues on the GitHub repository
- RAG Documentation: LlamaIndex RAG Guide
- Technical Support: Repository maintainers