Skip to content

Latest commit

 

History

History

roaring-bitmap

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 

RoaringBitmap

Overview

RoaringBitmap is a high-performance, compressed data structure that can store and manipulate sets of integer data. Designed for fast operations on large data sets, RoaringBitmap represents bitmaps in a format that is both size and speed-efficient.

RoaringBitmap operates on the principle of dividing the overall range of values into chunks of 2^16 values each, and using the most appropriate of three container types to store the values in each chunk:

  1. Array Containers store small sets of integers as sorted arrays.

  2. Bitmap Containers represent medium-sized ranges as 2^16 bit arrays (bitmaps).

  3. Run Containers employ run-length encoding to handle large ranges or sets of consecutive values compactly.

Depending on the data distribution, RoaringBitmap dynamically switches between these container types to offer the best balance between memory usage and performance.

Key Features

  • Performance: RoaringBitmap is designed to operate efficiently on large sets of data. Its operations, including OR, AND, XOR, and NOT, as well as more advanced functions such as rank and select, are highly optimized.

  • Compression: RoaringBitmap uses three types of containers (Array, Bitmap, and Run) and intelligently chooses the most space-efficient container for each chunk of data.

  • Portability: RoaringBitmap has been implemented in multiple programming languages, such as Java, C, Go, and Rust. This allows it to be used in a wide variety of applications and environments.

  • Wide Adoption: Due to its balance of speed and memory efficiency, RoaringBitmap is used by several prominent open-source projects, such as Apache Lucene, Apache Spark, and Druid.

RoaringBitmap is an excellent choice for applications dealing with large sets of integer data where speed, memory efficiency, and data compression are of paramount importance.