A C++17 cache and memory hierarchy simulator with stream-buffer, stride, and adaptive prefetching, multi-processor support with MESI coherence, and performance analysis tools.
Documentation | Quick Start | Features | Benchmarks | Contributing
- NRU Replacement Policy: Efficient Not Recently Used implementation with reference bit tracking
- Victim Cache: Reduces conflict misses by up to 25% with configurable fully-associative cache
- Advanced Write Policies: No-write-allocate and write combining buffer support
- Parallel Processing: Multi-threaded simulation with up to 4x speedup on 8-core systems
- Multi-Processor Support: Complete MESI coherence protocol with directory-based tracking
- Statistical Visualization: Built-in ASCII charts including line graphs, pie charts, and heatmaps
- Enhanced Tools: Cache analyzer and performance comparison utilities
- Flexible Configuration: Customizable L1/L2/L3 cache hierarchies
- Multiple Replacement Policies: LRU, FIFO, Random, Pseudo-LRU, and NRU
- Advanced Write Policies: Write-back, write-through, and no-write-allocate
- Victim Cache: Configurable 4-16 entry fully-associative cache
- Block Sizes: 32B to 256B configurable
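To make the NRU policy listed above concrete, here is a minimal sketch of reference-bit victim selection for a single cache set. The class name `NruSet` and its methods are hypothetical and chosen for illustration; the simulator's own interface in `replacement_policy.h` may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the simulator's actual class): one cache set
// managed with Not Recently Used (NRU) replacement.
class NruSet {
public:
    explicit NruSet(std::size_t ways) : referenced_(ways, false) {
        assert(ways > 0);
    }

    // Set the reference bit of a way; call this on every hit or fill.
    void touch(std::size_t way) {
        assert(way < referenced_.size());
        referenced_[way] = true;
    }

    // Choose a victim: the first way whose reference bit is clear.
    // If every bit is set, clear them all and fall back to way 0.
    std::size_t selectVictim() {
        for (std::size_t way = 0; way < referenced_.size(); ++way) {
            if (!referenced_[way]) return way;
        }
        referenced_.assign(referenced_.size(), false);
        return 0;
    }

private:
    std::vector<bool> referenced_;  // one reference bit per way
};
```

NRU needs only one bit per way, which is why it is cheaper to maintain than true LRU ordering at the cost of a coarser notion of recency.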
- Stream Buffer Prefetching: Sequential access optimization
- Stride Predictor: Pattern-based prefetching with confidence tracking
- Adaptive Prefetching: Dynamic strategy selection based on workload
- Configurable Aggressiveness: Tunable prefetch distance and accuracy
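The stride predictor with confidence tracking can be sketched as follows. This is an illustrative single-stream version under stated assumptions: the class name `StridePredictor`, the saturating ceiling of 3, and the default threshold of 2 are all hypothetical and not taken from the simulator's source.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical single-stream stride predictor with a saturating
// confidence counter (names and constants are illustrative only).
class StridePredictor {
public:
    explicit StridePredictor(int threshold = 2) : threshold_(threshold) {
        assert(threshold > 0 && threshold <= kMaxConfidence);
    }

    // Observe one address. Returns the predicted next address once the
    // same non-zero stride has repeated `threshold` times, else nullopt.
    std::optional<std::uint64_t> observe(std::uint64_t addr) {
        if (hasLast_) {
            const auto stride = static_cast<std::int64_t>(addr - last_);
            if (stride == stride_ && stride != 0) {
                if (confidence_ < kMaxConfidence) ++confidence_;
            } else {
                stride_ = stride;   // new candidate stride
                confidence_ = 0;    // confidence must be rebuilt
            }
        }
        last_ = addr;
        hasLast_ = true;
        if (confidence_ >= threshold_) {
            return addr + static_cast<std::uint64_t>(stride_);
        }
        return std::nullopt;
    }

private:
    static constexpr int kMaxConfidence = 3;  // saturating ceiling
    int threshold_;
    int confidence_ = 0;
    std::int64_t stride_ = 0;
    std::uint64_t last_ = 0;
    bool hasLast_ = false;
};
```

With the default threshold, addresses 100, 164, 228 produce no prediction, and the fourth access at 292 yields a prefetch candidate at 356. A break in the pattern resets the confidence so that irregular accesses do not trigger prefetches.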
- MESI Protocol: Full Modified-Exclusive-Shared-Invalid implementation
- Directory-Based Coherence: Scalable coherence tracking
- Interconnect Models: Bus, crossbar, and mesh topologies
- Atomic Operations: Support for synchronization primitives
- False Sharing Detection: Identifies and reports cache line conflicts
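As a rough illustration of the MESI protocol named above, the following sketch gives the textbook state transitions for one cache line in one core's cache. It is a simplification: it ignores write-back traffic and directory messages, and the names `Mesi`, `Event`, and `nextState` are hypothetical rather than the simulator's API.

```cpp
#include <cassert>

enum class Mesi { Modified, Exclusive, Shared, Invalid };
enum class Event { LocalRead, LocalWrite, RemoteRead, RemoteWrite };

// Next state of a line after an event (textbook MESI, simplified).
// `othersHaveCopy` only matters for a local read miss from Invalid.
inline Mesi nextState(Mesi s, Event e, bool othersHaveCopy = false) {
    assert(s == Mesi::Modified || s == Mesi::Exclusive ||
           s == Mesi::Shared || s == Mesi::Invalid);
    switch (e) {
        case Event::LocalRead:
            // A read miss loads the line Shared or Exclusive; hits keep state.
            if (s == Mesi::Invalid) {
                return othersHaveCopy ? Mesi::Shared : Mesi::Exclusive;
            }
            return s;
        case Event::LocalWrite:
            // Writing always ends in Modified (other copies are invalidated).
            return Mesi::Modified;
        case Event::RemoteRead:
            // Another core reads: a valid copy is downgraded to Shared.
            return s == Mesi::Invalid ? Mesi::Invalid : Mesi::Shared;
        case Event::RemoteWrite:
            // Another core writes: the local copy becomes Invalid.
            return Mesi::Invalid;
    }
    return s;
}
```

For example, a line that is Modified in one core drops to Shared when another core reads it, and to Invalid when another core writes it.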
- Detailed Statistics: Hit/miss rates, access patterns, coherence traffic
- Real-time Visualization: ASCII-based charts and graphs
- Memory Profiling: Working set analysis and reuse distance
- Parallel Benchmarking: Compare multiple configurations simultaneously
- Trace Analysis Tools: Pattern detection and optimization recommendations
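Reuse distance, mentioned under memory profiling, is the number of distinct blocks touched between two consecutive accesses to the same block. The sketch below computes it with a simple LRU stack; the function name `reuseDistances` is hypothetical, and this quadratic-time version is meant only to show the definition, not how the simulator's tools compute it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// For each access, return the number of distinct blocks touched since
// the previous access to the same block, or -1 for a first access.
std::vector<long> reuseDistances(const std::vector<std::uint64_t>& blocks) {
    std::vector<std::uint64_t> stack;  // LRU stack, most recent at the back
    std::vector<long> out;
    out.reserve(blocks.size());
    for (std::uint64_t b : blocks) {
        auto it = std::find(stack.begin(), stack.end(), b);
        if (it == stack.end()) {
            out.push_back(-1);  // cold (first-time) access
        } else {
            // Blocks after `it` in the stack were touched more recently.
            out.push_back(static_cast<long>(stack.end() - it) - 1);
            stack.erase(it);
        }
        stack.push_back(b);  // b is now the most recently used block
    }
    assert(out.size() == blocks.size());
    return out;
}
```

An access with reuse distance d hits in a fully-associative LRU cache only if the cache holds more than d blocks, which is what makes the distribution useful for cache size recommendations.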
- C++17 compatible compiler (GCC 7+, Clang 5+, MSVC 19.14+)
- CMake 3.14+ or GNU Make
- Optional: Python 3.6+ for visualization scripts
# Clone the repository
git clone https://github.com/muditbhargava66/CacheSimulator.git
cd CacheSimulator
# Build with CMake (recommended)
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . -j$(nproc)
# Or build with Make
make -j$(nproc)
# Run with default configuration (paths are relative to the repository root)
./build/bin/cachesim traces/simple.txt
# Run with custom parameters
./build/bin/cachesim 64 32768 4 262144 8 1 4 traces/workload.txt
# Positional arguments, in order: BS L1 A1 L2 A2 P D
# BS=Block Size, L1=L1 Size, A1=L1 Assoc, L2=L2 Size, A2=L2 Assoc, P=Prefetch, D=Distance
# Run with visualization
./build/bin/cachesim --visualize --charts traces/workload.txt
# Enable victim cache
./build/bin/cachesim --victim-cache traces/workload.txt
# Parallel processing
./build/bin/cachesim -p 8 traces/large_workload.txt
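The size, associativity, and block size parameters used in the commands above determine the number of sets in each cache: sets = size / (block size × associativity). A minimal sketch of that arithmetic follows, assuming all three values are powers of two; the names `Geometry` and `geometry` are hypothetical and not part of the simulator.

```cpp
#include <cassert>
#include <cstdint>

// Derived layout of a set-associative cache (illustrative names).
struct Geometry {
    std::uint64_t sets;   // number of sets
    unsigned offsetBits;  // address bits selecting a byte within a block
    unsigned indexBits;   // address bits selecting a set
};

constexpr bool isPowerOfTwo(std::uint64_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// log2 of a power of two, computed by counting shifts.
inline unsigned log2Exact(std::uint64_t v) {
    unsigned bits = 0;
    while (v > 1) {
        v >>= 1;
        ++bits;
    }
    return bits;
}

inline Geometry geometry(std::uint64_t sizeBytes, std::uint64_t blockBytes,
                         std::uint64_t ways) {
    // Assumption: power-of-two parameters, as in the examples above.
    assert(isPowerOfTwo(sizeBytes) && isPowerOfTwo(blockBytes) &&
           isPowerOfTwo(ways));
    assert(sizeBytes >= blockBytes * ways);
    const std::uint64_t sets = sizeBytes / (blockBytes * ways);
    return {sets, log2Exact(blockBytes), log2Exact(sets)};
}
```

For the custom-parameter example, `geometry(32768, 64, 4)` gives 128 sets (6 offset bits, 7 index bits) for L1, and `geometry(262144, 64, 8)` gives 512 sets (9 index bits) for L2.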
Create a JSON configuration file:
{
  "l1": {
    "size": 32768,
    "associativity": 4,
    "blockSize": 64,
    "replacementPolicy": "NRU",
    "writePolicy": "WriteBack",
    "prefetch": {
      "enabled": true,
      "distance": 4,
      "adaptive": true
    }
  },
  "l2": {
    "size": 262144,
    "associativity": 8,
    "blockSize": 64,
    "replacementPolicy": "LRU"
  },
  "victimCache": {
    "enabled": true,
    "size": 8
  },
  "multiprocessor": {
    "numCores": 4,
    "coherence": "MESI",
    "interconnect": "Bus"
  }
}
Run with configuration:
./build/bin/cachesim -c config.json traces/workload.txt
| Feature | Improvement | Benchmark |
|---|---|---|
| Parallel Processing | 3.8x speedup | 8-core Intel i7-9700K |
| Victim Cache | 25% fewer conflict misses | SPEC CPU2017 |
| NRU Policy | 15% faster than LRU | Large working sets |
| Write Combining | 40% reduction in memory traffic | Write-heavy workloads |
Configuration          L1 Hit%   L2 Hit%   Overall%   Avg Time   Speedup
---------------------------------------------------------------------------
Basic L1 (32KB)           85.2       0.0       85.2       12.5      1.0x
L1+L2 (32KB+256KB)        85.2      78.3       96.7        4.8      2.6x
With Prefetching          89.1      82.5       98.1        3.2      3.9x
NRU + Victim Cache        87.8      79.1       97.5        3.5      3.6x
High-Performance          91.3      85.2       98.8        2.9      4.3x
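The columns in the results above appear to be related in a simple way, sketched below under two assumptions that the output does not state explicitly: the L2 hit rate is local (the fraction of L1 misses that hit in L2), and the speedup is the baseline average time divided by the configuration's average time. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Overall hit rate (percent) of a two-level hierarchy, assuming the L2
// figure is a local hit rate measured over L1 misses only.
inline double overallHitPercent(double l1HitPercent, double l2HitPercent) {
    assert(l1HitPercent >= 0.0 && l1HitPercent <= 100.0);
    assert(l2HitPercent >= 0.0 && l2HitPercent <= 100.0);
    const double l1 = l1HitPercent / 100.0;
    const double l2 = l2HitPercent / 100.0;
    return 100.0 * (l1 + (1.0 - l1) * l2);
}

// Speedup relative to a baseline, assuming lower average time is better.
inline double speedup(double baselineTime, double time) {
    assert(baselineTime > 0.0 && time > 0.0);
    return baselineTime / time;
}
```

For the L1+L2 row, `overallHitPercent(85.2, 78.3)` evaluates to about 96.79 and `speedup(12.5, 4.8)` to about 2.60, which agrees with the reported 96.7 and 2.6x to within rounding.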
Comprehensive trace analysis tool:
./build/bin/tools/cache_analyzer -v -g traces/workload.txt
# Output includes:
# - Working set analysis
# - Reuse distance distribution
# - Access pattern classification
# - Cache size recommendations
Compare multiple configurations:
./build/bin/tools/performance_comparison -g -r traces/workload.txt
# Features:
# - Parallel simulation of configurations
# - Visual comparison charts
# - Automatic recommendations
# - CSV export for further analysis
Create custom workloads:
./build/bin/tools/trace_generator -p matrix -n 10000 -o matrix.txt
./build/bin/tools/trace_generator -p mixed --locality 0.8 -o mixed.txt
CacheSimulator/
├── src/                            # Source code
│   ├── core/                       # Core simulation components
│   │   ├── multiprocessor/         # Multi-processor simulation
│   │   ├── cache.cpp/.h            # Cache implementation
│   │   ├── memory_hierarchy.cpp/.h
│   │   ├── victim_cache.h          # Victim cache implementation
│   │   ├── replacement_policy.h    # Pluggable policies
│   │   ├── write_policy.cpp/.h     # Write policies
│   │   └── adaptive_prefetcher.cpp/.h
│   ├── utils/                      # Utility classes
│   │   ├── parallel_executor.h     # Parallel processing
│   │   ├── visualization.h         # Statistical charts
│   │   ├── trace_parser.cpp/.h
│   │   └── config_utils.cpp/.h
│   └── main.cpp                    # Main application entry point
├── tests/                          # Organized test suite
│   ├── unit/                       # Unit tests by component
│   │   ├── core/                   # Core component tests
│   │   ├── policies/               # Policy tests
│   │   └── utils/                  # Utility tests
│   ├── integration/                # End-to-end tests
│   └── performance/                # Performance benchmarks
├── docs/                           # Comprehensive documentation
│   ├── user/                       # User guides and tutorials
│   ├── developer/                  # Development documentation
│   └── features/                   # Feature-specific docs
├── tools/                          # Analysis and generation tools
├── configs/                        # Configuration examples
└── traces/                         # Example trace files
- Getting Started - Installation and basic usage
- User Guide - Complete user manual
- Configuration Guide - Configuration options and examples
- CLI Reference - Command-line options
- Architecture - System design and implementation
- Contributing - Development guidelines
- v1.2.0 Features - New features and capabilities
- Examples - Usage examples and case studies
See docs/README.md for complete documentation index.
Run the test suite:
# Run all tests
cd build
ctest
# Run specific test category
ctest -R unit
ctest -R validation
# Run specific feature tests
ctest -R nru_policy_test
ctest -R victim_cache_test
ctest -R parallel_processing_test
ctest -R visualization_test
# Run performance tests
ctest -R performance
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Follow the existing C++17 style
- Use meaningful variable names
- Add comments for complex logic
- Include unit tests for new features
If you use this simulator in your research, please cite:
@software{CacheSimulator2025,
  author = {Mudit Bhargava},
  title = {Cache Simulator: A C++17 Cache and Memory Hierarchy Simulator},
  version = {1.2.0},
  year = {2025},
  url = {https://github.com/muditbhargava66/CacheSimulator}
}
- For Large Traces: Use parallel processing with -p flag
- For Conflict Misses: Enable victim cache with --victim-cache
- For Write-Heavy Workloads: Use write combining buffer
- For Multi-Core: Choose appropriate interconnect topology
- For Best Performance: Use release build with -O3 optimization
This simulator is ideal for:
- Computer Architecture courses
- Cache behavior studies
- Performance analysis research
- Learning about memory hierarchies
- Understanding cache coherence protocols
Star this repo if you find it useful!
Contact: @muditbhargava66 | Report Issues: Issue Tracker
© 2025 Mudit Bhargava. MIT License