Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new feature: partitions of a vector #637

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

bruderjakob17
Copy link

This is a first draft resulting from the discussion in #611.

Closes #611.

@skylerberg
Copy link

@bruderjakob17 and @phimuemue, this is a cool feature and I'm interested in using this feature! What's needed to get this ready to merge? I'd be happy to contribute anything needed.

Copy link
Member

@phimuemue phimuemue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. I think this goes in the right direction.

  • Could you alter the documentation to more describe the high-level.
  • I think we should test this extensively. Maybe we should look up the number of partitions on Wikipedia and assert that our algorithm yields the correct number of partitions. Maybe we should quickcheck that each resulting partition is unique and is actually a partition of the input.
  • We should explicitly think and test for partitions of an empty iterator: I think it should yield Some([]) once and then None. Do you agree? This should be included in the tests, too.
  • My main worry is how we could test that this algorithm actually generates all the partitions, and each of them exactly once. Maybe starting out with a vector and a shuffled copy, and assert that both of them yield - maybe in differing order - the same partitions?

pub struct Partition<'a, T> where T: Copy {
elements: &'a Vec<T>,
index_map: Vec<usize>,
initialized: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename initialized to first.

Comment on lines +25 to +36
if let Some(&max_index) = self.index_map.iter().max() {
// initialize Vec's for the classes
let mut partition_classes = vec![Vec::new(); max_index + 1];
for i in 0..self.index_map.len() {
// elements[i] belongs to the partition class index_map[i]
partition_classes[self.index_map[i]].push(self.elements[i]);
}
return partition_classes;
} else {
// The index_map might have length 0, which means that there are no elements.
// There is precisely one partition of the empty set, namely the partition with no classes.
return Vec::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid the if/else distinction by saying let number_of_partitions = self.index_map().max().map(|max_index| max_index+1).unwrap_or(0)? Sure, it may waste some cycles in the zero-case, but it reduces the number of code paths.


impl<'a, T> Partition<'a, T> where T: Copy {
// extracts the current partition of the iterator
fn create_partition(&self) -> PartitionRepresentation<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you inline create_partitions as a closure? (Makes code more local).

/// Example usage:
/// ```
/// for partition in partitions(&vec![7,8,9]){
/// println!("{:?}", partition);
Copy link
Member

@phimuemue phimuemue Sep 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make the documentation to contain two or three examples that are fully expanded (e.g. by assert_eq(iter.partitions(), vec![vec![1,2,3], vec![4], ...])? (println requires users to actually execute it, whereas the assert_eq solution shows the result right away.) Also, please document what happens when the iterator has duplicate elements (as far as I understand, they are treated as different elements, because we only operate on indices).

Maybe even add a note that the order of partitions is not guaranteed to stay the same over different itertools versions.

Comment on lines +48 to +61
// search for the highest index at which the index_map is incrementable (see the linked post)
for index in (1..self.index_map.len()).rev() {
if (0..index).any(|x| self.index_map[x] == self.index_map[index]) {
// increment the incrementable index
self.index_map[index] += 1;
// set all following entries to the lexicographically smallest suffix that makes the
// index_map viable (see linked post), i.e. to zero.
for x in index + 1..self.index_map.len() {
self.index_map[x] = 0;
}
return Some(self.create_partition());
}
}
// if there is no incrementable index, we have enumerated the last possible partition.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to change documentation to describe the algorithm high-level and completely remove the very-fine grained comments (such as "increment the incrementable index").

High-level algo:

  • Partition represented as an index sequence: index_map[i]==p means: element with index i lands in the permutation class p. index_map[0]==0 always by convention.
  • For an index sequence to describe a partition, for each prefix index_map[0..i], we must have each integer from 0 to max(index_map[0..i]) present (because a skipped integer would mean that we could assign a partition's class a lower number without changing the permutation itself).
  • Adding a new element to an existing partition described by index_map[0..i] amounts to appending an index to index_map[0..i]: To construct new partitions, the new element can be added to each existing partition class (amounts to iterating index_map[i] from 0 to max(index_map[0..i])) or it can form a new partition class by itself (amounts to setting index_map[i] = max(index_map[0..i])+1).
  • Thus, in each step, we look for the rightmost index that can be incremented according to the aforementioned rules, increment it, and set the indices after it to 0.
  • (For more info, see [stackoverflow link]).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: partitions
3 participants