This repository demonstrates how to build a complete data pipeline and caching system. Following the architecture described in the linked article, it serves as a practical example of a modern, scalable, real-time data solution built with leading open-source technologies.
- Architectural Blueprint: Implements a multi-component architecture including database clusters, message brokers, data flow management, and caching layers.
- Hands-on Docker Deployment: Provides a Docker Compose setup for easy deployment and experimentation with distributed systems.
- Data Streaming with Kafka: Illustrates the use of Kafka for building robust and fault-tolerant data pipelines.
- Visual Data Processing with NiFi: Demonstrates Apache NiFi's capabilities in designing and managing complex data flows.
- Real-time Data Updates: Implements real-time data updates using WebSockets and Pusher to reflect database changes in a web application.
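To make the Docker Compose deployment above concrete, a minimal sketch of a compose file wiring up Kafka, Zookeeper, and Redis might look like the following. The image names, environment variables, and ports are illustrative assumptions; the actual `docker-compose.yml` in this repository is the authoritative definition.

```yaml
version: "3.8"
services:
  zookeeper:
    image: bitnami/zookeeper:latest        # assumed image; check the repo's compose file
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
  kafka:
    image: bitnami/kafka:latest            # assumed image
    depends_on: [zookeeper]
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes       # fine for local experimentation only
    ports:
      - "9092:9092"
  redis:
    image: redis:7
    ports:
      - "6379:6379"
```

Running `docker compose up -d` with a file like this brings up the broker and cache locally so the rest of the pipeline can be experimented with.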
The architecture is built on the following stack:
- Containerization: Docker, Docker Compose
- Message Broker: Kafka, Zookeeper
- Data Flow Management: Apache NiFi
- Databases: MySQL, MongoDB, Redis
- Backend Framework: Laravel (PHP)
- Real-time Communication: Pusher, WebSockets
- Scripting: Python (for Kafka producer)
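Since Python is used for the Kafka producer, a minimal sketch of such a producer might look like the code below. The topic name `db-changes`, the event schema, and the `localhost:9092` bootstrap address are assumptions for illustration; the repository's own producer script defines the real ones.

```python
import json
import time


def make_event(user_id, action):
    """Build a JSON-encoded change event for the pipeline (schema is illustrative)."""
    return json.dumps({
        "user_id": user_id,
        "action": action,
        "ts": time.time(),
    }).encode("utf-8")


if __name__ == "__main__":
    # Requires the kafka-python package and a running broker
    # (assumed to be exposed on localhost:9092 by the compose setup).
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send("db-changes", make_event(42, "update"))
    producer.flush()  # block until the message is actually delivered
```

NiFi (or any other consumer) can then subscribe to the `db-changes` topic and route the events onward to MySQL, MongoDB, or the Redis cache.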
Find complete technical specifications and workflow diagrams here:
Cache System Documentation