-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new feature: partitions of a vector #637
base: master
Are you sure you want to change the base?
Conversation
@bruderjakob17 and @phimuemue, this is a cool feature and I'm interested in using this feature! What's needed to get this ready to merge? I'd be happy to contribute anything needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. I think this goes in the right direction.
- Could you alter the documentation to more describe the high-level.
- I think we should test this extensively. Maybe we should look up the number of partitions on Wikipedia and assert that our algorithm yields the correct number of partitions. Maybe we should quickcheck that each resulting partition is unique and is actually a partition of the input.
- We should explicitly think and test for partitions of an empty iterator: I think it should yield
Some([])
once and thenNone
. Do you agree? This should be included in the tests, too. - My main worry is how we could test that this algorithm actually generates all the partitions, and each of them exactly once. Maybe starting out with a vector and a shuffled copy, and assert that both of them yield - maybe in differing order - the same partitions?
pub struct Partition<'a, T> where T: Copy { | ||
elements: &'a Vec<T>, | ||
index_map: Vec<usize>, | ||
initialized: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename initialized
to first
.
if let Some(&max_index) = self.index_map.iter().max() { | ||
// initialize Vec's for the classes | ||
let mut partition_classes = vec![Vec::new(); max_index + 1]; | ||
for i in 0..self.index_map.len() { | ||
// elements[i] belongs to the partition class index_map[i] | ||
partition_classes[self.index_map[i]].push(self.elements[i]); | ||
} | ||
return partition_classes; | ||
} else { | ||
// The index_map might have length 0, which means that there are no elements. | ||
// There is precisely one partition of the empty set, namely the partition with no classes. | ||
return Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid the if/else distinction by saying let number_of_partitions = self.index_map().max().map(|max_index| max_index+1).unwrap_or(0)
? Sure, it may waste some cycles in the zero-case, but it reduces the number of code paths.
|
||
impl<'a, T> Partition<'a, T> where T: Copy { | ||
// extracts the current partition of the iterator | ||
fn create_partition(&self) -> PartitionRepresentation<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you inline create_partitions
as a closure? (Makes code more local).
/// Example usage: | ||
/// ``` | ||
/// for partition in partitions(&vec![7,8,9]){ | ||
/// println!("{:?}", partition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make the documentation to contain two or three examples that are fully expanded (e.g. by assert_eq(iter.partitions(), vec![vec![1,2,3], vec![4], ...])
? (println requires users to actually execute it, whereas the assert_eq solution shows the result right away.) Also, please document what happens when the iterator has duplicate elements (as far as I understand, they are treated as different elements, because we only operate on indices).
Maybe even add a note that the order of partitions is not guaranteed to stay the same over different itertools versions.
// search for the highest index at which the index_map is incrementable (see the linked post) | ||
for index in (1..self.index_map.len()).rev() { | ||
if (0..index).any(|x| self.index_map[x] == self.index_map[index]) { | ||
// increment the incrementable index | ||
self.index_map[index] += 1; | ||
// set all following entries to the lexicographically smallest suffix that makes the | ||
// index_map viable (see linked post), i.e. to zero. | ||
for x in index + 1..self.index_map.len() { | ||
self.index_map[x] = 0; | ||
} | ||
return Some(self.create_partition()); | ||
} | ||
} | ||
// if there is no incrementable index, we have enumerated the last possible partition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to change documentation to describe the algorithm high-level and completely remove the very-fine grained comments (such as "increment the incrementable index").
High-level algo:
- Partition represented as an index sequence: index_map[i]==p means: element with index i lands in the permutation class p. index_map[0]==0 always by convention.
- For an index sequence to describe a partition, for each prefix index_map[0..i], we must have each integer from 0 to max(index_map[0..i]) present (because a skipped integer would mean that we could assign a partition's class a lower number without changing the permutation itself).
- Adding a new element to an existing partition described by index_map[0..i] amounts to appending an index to index_map[0..i]: To construct new partitions, the new element can be added to each existing partition class (amounts to iterating index_map[i] from 0 to max(index_map[0..i])) or it can form a new partition class by itself (amounts to setting index_map[i] = max(index_map[0..i])+1).
- Thus, in each step, we look for the rightmost index that can be incremented according to the aforementioned rules, increment it, and set the indices after it to 0.
- (For more info, see [stackoverflow link]).
…tions of a vector
e7ada66
to
83deca8
Compare
This is a first draft resulting from the discussion in #611.
Closes #611.