Optimize and refine code to C++17 #2

Open
wants to merge 8 commits into
base: master

@ad3002 ad3002 commented May 29, 2024

Overview

This pull request focuses on optimizing and modernizing the compute_mphf_seq part of the codebase. Initially, the goal was to replace std::string with std::string_view to improve performance and reduce unnecessary allocations. However, several other improvements were made to enhance the overall efficiency and maintainability of the code.

Changes Made

  1. Replaced std::string with std::string_view:

    • Improved performance by avoiding unnecessary string copies.
    • Reduced memory allocations.
  2. Added noexcept Specifier:

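    • Marked functions that cannot throw exceptions as noexcept.
    • May improve performance by giving the compiler and the standard library more room to optimize; for example, std::vector moves elements on reallocation instead of copying them when their move constructor is noexcept.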
  3. Utilized constexpr:

    • Enabled compile-time evaluation for constants and functions.
    • Improved code clarity and potential performance.
  4. Optimized Memory Access Patterns:

    • Ensured contiguous memory access to improve cache performance.
    • Used std::vector::reserve to avoid multiple reallocations.
  5. Improved Logic and Readability:

    • Simplified and clarified the logic in various parts of the code.
    • Reduced unnecessary computations and redundant operations.
  6. Modernized C++ Practices:

    • Leveraged std::move for efficient resource management.
    • Used type aliases and consistent naming conventions for better readability.
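
The items above can be illustrated with a minimal C++17 sketch. It is not taken from this pull request's diff: `hash_t`, `base_hash`, `hash_all`, and `HashTable` are hypothetical names, and FNV-1a is used only as a stand-in hash function, not as the hash that compute_mphf_seq actually uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// All names below are hypothetical and chosen for illustration only.
using hash_t = std::uint64_t;  // type alias for readability

// 64-bit FNV-1a parameters, fixed at compile time.
constexpr hash_t kFnvOffset = 14695981039346656037ULL;
constexpr hash_t kFnvPrime = 1099511628211ULL;

// Takes std::string_view, so a std::string, a literal, or a substring can be
// passed without allocating or copying. The body neither allocates nor throws,
// so it can be both constexpr and noexcept.
constexpr hash_t base_hash(std::string_view key) noexcept {
    hash_t h = kFnvOffset;
    for (char c : key) {
        h ^= static_cast<unsigned char>(c);
        h *= kFnvPrime;
    }
    return h;
}

// Compile-time evaluation: hashing an empty key leaves the offset unchanged.
static_assert(base_hash("") == kFnvOffset, "empty key hashes to the offset");

// Hashes every key into one contiguous vector.
std::vector<hash_t> hash_all(const std::vector<std::string>& keys) {
    std::vector<hash_t> hashes;
    hashes.reserve(keys.size());  // single allocation instead of repeated growth
    for (const std::string& k : keys) {
        hashes.push_back(base_hash(k));  // std::string -> std::string_view, no copy
    }
    return hashes;
}

// Takes ownership of the hashes with std::move instead of copying them.
class HashTable {
public:
    explicit HashTable(std::vector<hash_t> hashes) noexcept
        : hashes_(std::move(hashes)) {}

    std::size_t size() const noexcept { return hashes_.size(); }

private:
    std::vector<hash_t> hashes_;
};
```

A caller would write something like `HashTable table(hash_all(keys));`, in which case the vector returned by `hash_all` is moved into the table rather than copied.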

Test Results

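The changes were tested on synthetic data using the gen_synthetic_data tool, invoked as ./gen_synthetic_data test.dat 100000000. Average base hash computation time dropped from 0.0121976 to 0.00789278 usecs (about 35% lower), and average lookup time dropped from 0.304692 to 0.245929 usecs (about 19% lower). bits_per_key is unchanged at 2.61375, while stddev_lookup_time_percentage rose from 7.33947 to 10.0789.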

Before Optimization:

2024-05-29 17:15:37: Avg. 0.0121976 usecs per base hash computation
2024-05-29 17:15:37: Performing lookups
2024-05-29 17:20:42: Avg. 0.304692 usecs per lookup
avg_lookup_time 0.304692
stddev_lookup_time_percentage   7.33947
bits_per_key    2.61375

After Optimization:

2024-05-29 17:48:29: Avg. 0.00789278 usecs per base hash computation
2024-05-29 17:48:29: Performing lookups
2024-05-29 17:52:34: Avg. 0.245929 usecs per lookup
avg_lookup_time 0.245929
stddev_lookup_time_percentage   10.0789
bits_per_key    2.61375
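
For reference, the relative change between the two runs can be derived from the logged averages. The helper below is a small sketch with a hypothetical name (`percent_reduction`); the four constants are copied from the log output above.

```cpp
// Percentage by which `after` is lower than `before` (positive means faster).
constexpr double percent_reduction(double before, double after) noexcept {
    return (before - after) / before * 100.0;
}

// Averages in usecs, copied from the before/after logs.
constexpr double kHashBefore = 0.0121976;
constexpr double kHashAfter = 0.00789278;
constexpr double kLookupBefore = 0.304692;
constexpr double kLookupAfter = 0.245929;

// Roughly 35% for base hash computation and roughly 19% for lookups.
constexpr double kHashGain = percent_reduction(kHashBefore, kHashAfter);
constexpr double kLookupGain = percent_reduction(kLookupBefore, kLookupAfter);
```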
