A scalable distributed logging system that provides centralized log management, tracing, and monitoring capabilities for microservices.
- Python 3.8+
- Apache Kafka & Zookeeper
- Two VMs with network connectivity
- Ubuntu 22.04 LTS (Jammy Jellyfish)
- VM1: Runs microservices and Fluentd
- VM2: Runs Kafka broker, Elasticsearch, and monitoring services
# Install Python dependencies
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Install the distributed logger package
pip install -e .
# Install libssl1.1 (prerequisite)
echo "deb http://security.ubuntu.com/ubuntu focal-security main" | sudo tee /etc/apt/sources.list.d/focal-security.list
sudo apt-get update
sudo apt-get install libssl1.1
sudo rm /etc/apt/sources.list.d/focal-security.list
# Install td-agent (Fluentd)
curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-jammy-td-agent4.sh | sudo sh
# Install Kafka plugin for Fluentd
sudo td-agent-gem install fluent-plugin-kafka
Update Fluentd configuration:
sudo nano /etc/td-agent/td-agent.conf
# Copy the td-agent.conf content from config/fluentd/
# Replace VM2_IP with your actual VM2 IP address
sudo systemctl restart td-agent
# Install Elasticsearch
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt-get update
sudo apt-get install elasticsearch
# Configure Elasticsearch
sudo nano /etc/elasticsearch/elasticsearch.yml
# Copy the elasticsearch.yml content from config/elasticsearch/
# Download and install OpenTelemetry Collector
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.96.0/otelcol_0.96.0_linux_amd64.deb
sudo dpkg -i otelcol_0.96.0_linux_amd64.deb
# Configure OpenTelemetry
sudo nano /etc/otelcol/config.yaml
# Copy the otel-collector-config.yaml content from config/otel/
sudo systemctl restart otelcol
# Configure Kafka
sudo nano /usr/local/kafka/config/server.properties
# Set advertised.listeners=PLAINTEXT://VM2_IP:9092 (replace VM2_IP with your actual VM2 IP address)
sudo systemctl restart kafka
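If you would rather script the advertised.listeners change than edit the file by hand, the following is a minimal sketch. The helper name `set_property` and the sample address are illustrative assumptions, not part of this project; the function only transforms text, so reading and writing server.properties is left to the caller.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value`.

    An active `key=...` line is replaced in place (extra duplicates are
    dropped). Commented-out lines are left untouched, and the property is
    appended when no active line exists.
    """
    new_line = f"{key}={value}"
    out = []
    replaced = False
    for line in text.splitlines():
        if line.strip().startswith(key + "="):
            if not replaced:
                out.append(new_line)
                replaced = True
            # Any later duplicate of the same key is skipped.
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical file content and address, for illustration only.
    sample = "broker.id=0\nadvertised.listeners=PLAINTEXT://old-host:9092\n"
    print(set_property(sample, "advertised.listeners", "PLAINTEXT://192.168.1.20:9092"), end="")
```

Restart Kafka after writing the updated file so the new listener address takes effect.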
# Create topics
kafka-topics.sh --create --topic service_logs --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
kafka-topics.sh --create --topic trace_logs --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
kafka-topics.sh --create --topic heartbeat_logs --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
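The three topic commands differ only in the topic name, so they can be generated in a loop. This is a sketch that assumes `kafka-topics.sh` is on the PATH; the function names are illustrative, and the flags are the same ones used in the commands above.

```python
import subprocess

# Topic names taken from the commands above.
TOPICS = ["service_logs", "trace_logs", "heartbeat_logs"]


def build_create_command(topic, bootstrap="localhost:9092", partitions=3, replication=1):
    """Build the argv list for creating one topic."""
    return [
        "kafka-topics.sh", "--create",
        "--topic", topic,
        "--bootstrap-server", bootstrap,
        "--partitions", str(partitions),
        "--replication-factor", str(replication),
    ]


def create_topics(topics=TOPICS):
    """Run the create command for each topic; raises if a command fails."""
    for topic in topics:
        subprocess.run(build_create_command(topic), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, as a dry run.
    for name in TOPICS:
        print(" ".join(build_create_command(name)))
```

Call `create_topics()` on VM2 once the broker is up. A replication factor of 1 matches the single-broker setup described here.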
# Start Elasticsearch
sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch
# Start OpenTelemetry collector
sudo systemctl start otelcol
sudo systemctl enable otelcol
# Start Zookeeper (Kafka depends on it, so start it first)
sudo systemctl start zookeeper
sudo systemctl status zookeeper
# Start Kafka
sudo systemctl start kafka
sudo systemctl status kafka
# Start Fluentd
sudo systemctl start td-agent
sudo systemctl enable td-agent
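To confirm that everything is up before running the services, a small check along these lines may help. It is a sketch that assumes the systemd unit names used in the commands above; the list and function names are illustrative. Zookeeper is listed before Kafka because Kafka depends on it.

```python
import subprocess

# Unit names as used in the systemctl commands above.
VM2_UNITS = ["zookeeper", "kafka", "elasticsearch", "otelcol"]
VM1_UNITS = ["td-agent"]


def is_active(unit):
    """Return True if systemd reports the unit as active."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0


def inactive_units(units, check=is_active):
    """Return the units that are not active, preserving input order."""
    return [unit for unit in units if not check(unit)]
```

On VM2, `inactive_units(VM2_UNITS)` returns an empty list when all four services are running; on VM1, use `VM1_UNITS`. The `check` parameter exists only so the logic can be exercised without systemd.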
from distributed_logger import DistributedLogger
# Initialize logger
logger = DistributedLogger("MyService", dependencies=["ServiceA", "ServiceB"])
# Log different levels
logger.info("Application started")
logger.warn("High latency detected", response_time_ms=1500, threshold_limit_ms=1000)
logger.error("Operation failed", error_code="ERR_001", error_message="Database timeout")
# Log service calls
logger.log_service_call("ServiceA", success=True)
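One way to keep service-call logging consistent is to wrap each outgoing call in a helper. The sketch below uses only the `log_service_call` and `error` methods shown above. `StubLogger` is a hypothetical stand-in so the example runs without Kafka or Fluentd, the `ERR_CALL` code is made up, and passing `success=False` is assumed to be supported.

```python
class StubLogger:
    """Hypothetical stand-in that records calls instead of shipping logs."""

    def __init__(self):
        self.calls = []

    def error(self, message, **fields):
        self.calls.append(("error", message, fields))

    def log_service_call(self, service, success):
        self.calls.append(("service_call", service, success))


def call_and_log(logger, service_name, func, *args, **kwargs):
    """Call func and record whether the call to service_name succeeded."""
    try:
        result = func(*args, **kwargs)
    except Exception as exc:
        # Assumption: success=False is the intended way to report a failure.
        logger.log_service_call(service_name, success=False)
        logger.error("Operation failed", error_code="ERR_CALL", error_message=str(exc))
        raise
    logger.log_service_call(service_name, success=True)
    return result
```

In a real service you would pass the `DistributedLogger` instance instead of `StubLogger`, for example `call_and_log(logger, "ServiceA", some_function)` where `some_function` is whatever performs the request.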
# On VM2
# Make sure Kafka, the OpenTelemetry Collector, Zookeeper, and Elasticsearch are running
# Terminal 1
python elasticsearch_consumer.py
# Terminal 2
python alert_system.py
# On VM1
# Make sure td-agent is running
# In separate terminals
python service_a.py # Start Service A
python service_b.py # Start Service B
python service_c.py # Start Service C
...
python test_client.py
- Check Elasticsearch:
curl -X GET "localhost:9200/_cluster/health?pretty"
- Check Fluentd logs:
sudo tail -f /var/log/td-agent/td-agent.log
- Monitor Kafka topics:
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic service_logs --from-beginning
- Elasticsearch data: http://VM2_IP:9200/_cat/indices
- Fluentd status:
sudo systemctl status td-agent
- Kafka topics:
kafka-topics.sh --list --bootstrap-server localhost:9092
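The Elasticsearch and Kafka checks above can also be scripted. The following sketch assumes Elasticsearch accepts unauthenticated HTTP on port 9200, as the curl command above implies, and that you pass in the text printed by `kafka-topics.sh --list`. All function names are illustrative.

```python
import json
import urllib.request

# Topic names taken from the topic creation commands.
EXPECTED_TOPICS = {"service_logs", "trace_logs", "heartbeat_logs"}


def fetch_cluster_health(base_url="http://localhost:9200"):
    """Fetch the cluster health document from Elasticsearch."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def cluster_is_usable(health):
    """Treat green and yellow as usable; red or a missing status is not."""
    return health.get("status") in ("green", "yellow")


def missing_topics(list_output, expected=EXPECTED_TOPICS):
    """Return expected topics absent from `kafka-topics.sh --list` output."""
    present = {line.strip() for line in list_output.splitlines() if line.strip()}
    return sorted(set(expected) - present)
```

A single-node cluster often reports yellow because replica shards cannot be allocated, which is why yellow is accepted here. An empty list from `missing_topics` means all three topics exist.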
- Set up more robust trace propagation across machines to track complex layouts and dependencies between microservices
- Add Kibana dashboards to visualise the data stored in Elasticsearch