new feature: partitions of a vector #637

Open
wants to merge 1 commit into
base: master
1 change: 1 addition & 0 deletions src/lib.rs
@@ -210,6 +210,7 @@ mod minmax;
#[cfg(feature = "use_alloc")]
mod multipeek_impl;
mod pad_tail;
mod partitions;
#[cfg(feature = "use_alloc")]
mod peek_nth;
mod peeking_take_while;
74 changes: 74 additions & 0 deletions src/partitions.rs
@@ -0,0 +1,74 @@
/// Based on https://stackoverflow.com/a/30898130/4322240.

/// Representation of a state in which the iterator can be.
/// elements: contains the elements to be partitioned. All elements are treated
/// as distinct.
/// index_map: in terms of the linked post, index_map[i] = p_i (but 0-indexed)
/// is the partition class to which elements[i] belongs in the current
/// partition
/// initialized: marks if the Partition element was just initialized and hence
/// was not generated by next()
pub struct Partition<'a, T> where T: Copy {
    elements: &'a Vec<T>,
    index_map: Vec<usize>,
    initialized: bool,
Review comment (Member):

Please rename initialized to first.

}

type PartitionRepresentation<T> = Vec<Vec<T>>;


impl<'a, T> Partition<'a, T> where T: Copy {
    // extracts the current partition of the iterator
    fn create_partition(&self) -> PartitionRepresentation<T> {
Review comment (Member):

Can you inline create_partition as a closure? (This keeps the code more local.)

        // max_index is the highest number used for a class in the partition.
        // Since the first class is numbered 0, there are max_index + 1 different classes.
        if let Some(&max_index) = self.index_map.iter().max() {
            // initialize Vec's for the classes
            let mut partition_classes = vec![Vec::new(); max_index + 1];
            for i in 0..self.index_map.len() {
                // elements[i] belongs to the partition class index_map[i]
                partition_classes[self.index_map[i]].push(self.elements[i]);
            }
            return partition_classes;
        } else {
            // The index_map might have length 0, which means that there are no elements.
            // There is precisely one partition of the empty set, namely the partition with no classes.
            return Vec::new();
Comment on lines +25 to +36
Review comment (Member):

Can we avoid the if/else distinction by saying let number_of_partitions = self.index_map.iter().max().map(|max_index| max_index + 1).unwrap_or(0)? Sure, it may waste some cycles in the zero case, but it reduces the number of code paths.
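The suggestion above could look roughly like the following. This is only a sketch, not the PR's code: it rewrites create_partition as a free function over slices so that it stands alone, and it calls the count number_of_classes because the value is the number of classes in one partition. Both the signature and that name are assumptions made for illustration.

```rust
/// Hypothetical free-function variant of `create_partition` (illustration only).
/// `index_map[i]` is the class that `elements[i]` belongs to.
fn create_partition<T: Copy>(elements: &[T], index_map: &[usize]) -> Vec<Vec<T>> {
    // Highest class index plus one, or zero when there are no elements.
    // This single expression replaces the if/else on `max()`.
    let number_of_classes = index_map.iter().max().map_or(0, |&max_index| max_index + 1);

    // One empty Vec per class, then distribute the elements.
    let mut classes = vec![Vec::new(); number_of_classes];
    for (&element, &class) in elements.iter().zip(index_map) {
        classes[class].push(element);
    }
    classes
}
```

With an empty index_map the loop body never runs and the result is an empty Vec, so the empty case needs no separate branch.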

        }
    }
}
impl<'a, T> Iterator for Partition<'a, T> where T: Copy {
    type Item = PartitionRepresentation<T>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.initialized {
            self.initialized = false;
            return Some(self.create_partition());
        }
        // search for the highest index at which the index_map is incrementable (see the linked post)
        for index in (1..self.index_map.len()).rev() {
            if (0..index).any(|x| self.index_map[x] == self.index_map[index]) {
                // increment the incrementable index
                self.index_map[index] += 1;
                // set all following entries to the lexicographically smallest suffix that makes the
                // index_map viable (see linked post), i.e. to zero.
                for x in index + 1..self.index_map.len() {
                    self.index_map[x] = 0;
                }
                return Some(self.create_partition());
            }
        }
        // if there is no incrementable index, we have enumerated the last possible partition.
Comment on lines +48 to +61
Review comment (Member):

I suggest changing the documentation to describe the algorithm at a high level and removing the very fine-grained comments (such as "increment the incrementable index") completely.

High-level algorithm:

  • A partition is represented as an index sequence: index_map[i]==p means that the element with index i lands in partition class p. By convention, index_map[0]==0 always.
  • For an index sequence to describe a partition, each prefix index_map[0..i] must contain every integer from 0 to max(index_map[0..i]) (a skipped integer would mean that we could give one of the partition's classes a lower number without changing the partition itself).
  • Adding a new element to an existing partition described by index_map[0..i] amounts to appending an index to index_map[0..i]. To construct new partitions, the new element can be added to each existing partition class (which amounts to iterating index_map[i] from 0 to max(index_map[0..i])), or it can form a new partition class by itself (which amounts to setting index_map[i] = max(index_map[0..i])+1).
  • Thus, in each step, we look for the rightmost index that can be incremented according to these rules, increment it, and set the indices after it to 0.
  • (For more info, see [stackoverflow link]).

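The stepping rule in the list above can be sketched in isolation from the iterator. The helper name advance is hypothetical and does not appear in the PR. It also tests "value does not exceed the prefix maximum" where the PR tests "value already occurs in the prefix"; for valid index sequences the two conditions are equivalent.

```rust
/// Advances `index_map` to the next valid index sequence in lexicographic order.
/// Returns `false` (leaving the sequence unchanged) if it was already the last one.
/// `advance` is a hypothetical helper name used for illustration only.
fn advance(index_map: &mut [usize]) -> bool {
    // Position 0 is fixed at 0 by convention, so only positions 1.. are candidates.
    for i in (1..index_map.len()).rev() {
        // Largest class used by the elements before position i.
        let prefix_max = index_map[..i].iter().copied().max().unwrap_or(0);
        // Allowed values at position i are 0..=prefix_max + 1,
        // so the position can grow as long as it is at most prefix_max.
        if index_map[i] <= prefix_max {
            index_map[i] += 1;
            // Reset everything to the right to the smallest valid suffix.
            index_map[i + 1..].fill(0);
            return true;
        }
    }
    false
}
```

Starting from [0, 0, 0], repeated calls yield [0, 0, 1], [0, 1, 0], [0, 1, 1], [0, 1, 2] and then return false, which gives the five partitions of a three-element set.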
        return None;
    }
}
/// Returns an Iterator over all partitions of the given Vec.
/// Example usage:
/// ```
/// for partition in partitions(&vec![7,8,9]){
/// println!("{:?}", partition);
Review comment (Member, @phimuemue, Sep 24, 2024):

Can you make the documentation contain two or three fully expanded examples (e.g. assert_eq!(iter.partitions(), vec![vec![1,2,3], vec![4], ...]))? (println requires users to actually execute the example, whereas the assert_eq solution shows the result right away.) Also, please document what happens when the iterator has duplicate elements (as far as I understand, they are treated as different elements, because we only operate on indices).

Maybe even add a note that the order of partitions is not guaranteed to stay the same across itertools versions.

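A sketch of what such fully expanded documentation might look like follows. To stay self-contained it documents a hypothetical eager function all_partitions that takes a slice and returns a Vec; this is a stand-in and not the PR's partitions iterator. The expected values are what this sketch produces with the same stepping rule as the PR.

```rust
/// Returns all partitions of `elements`. Elements are distinguished by position,
/// so equal values are treated as distinct. The order of the partitions should be
/// treated as unspecified. (`all_partitions` is a hypothetical name for this sketch.)
///
/// Examples:
///
///     assert_eq!(all_partitions(&[7, 8, 9]), vec![
///         vec![vec![7, 8, 9]],
///         vec![vec![7, 8], vec![9]],
///         vec![vec![7, 9], vec![8]],
///         vec![vec![7], vec![8, 9]],
///         vec![vec![7], vec![8], vec![9]],
///     ]);
///     // Duplicates are kept apart because only indices are compared.
///     assert_eq!(all_partitions(&[1, 1]), vec![
///         vec![vec![1, 1]],
///         vec![vec![1], vec![1]],
///     ]);
fn all_partitions<T: Copy>(elements: &[T]) -> Vec<Vec<Vec<T>>> {
    let mut index_map = vec![0usize; elements.len()];
    let mut result = Vec::new();
    loop {
        // Materialize the partition described by the current index sequence.
        let class_count = index_map.iter().max().map_or(0, |&max| max + 1);
        let mut classes = vec![Vec::new(); class_count];
        for (&element, &class) in elements.iter().zip(&index_map) {
            classes[class].push(element);
        }
        result.push(classes);

        // Rightmost position whose class already occurs earlier can be incremented.
        let incrementable = (1..index_map.len())
            .rev()
            .find(|&i| index_map[..i].contains(&index_map[i]));
        match incrementable {
            Some(i) => {
                index_map[i] += 1;
                index_map[i + 1..].fill(0);
            }
            None => return result,
        }
    }
}
```

For an empty input this sketch returns a single partition with no classes, matching the comment in create_partition about the empty set.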
/// }
/// ```
pub fn partitions<'a, T>(v: &'a Vec<T>) -> Partition<'a, T> where T: Copy {
    Partition::<T> { elements: v, index_map: vec![0; v.len()], initialized: true }
}