Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array combinations #546

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

conradludgate
Copy link

@conradludgate conradludgate commented May 23, 2021

An alternative to TupleCombinations that works in a similar fashion to the standard Combinations iterator. It's implemented using const generics and manages to work for >12 elements too. Another benefit is that it doesn't require the iterator to implement Clone, only I::Item

I doubt I'll be able to get merged as is. Since this uses const generics, it would need to either be feature gated or we raise the MSRV from 1.32 all the way to 1.51

Benchmark results:

tuple comb for1         time:   [23.681 us 23.714 us 23.757 us]
tuple comb for2         time:   [27.554 us 27.584 us 27.617 us]
tuple comb for3         time:   [67.963 us 68.138 us 68.343 us]
tuple comb for4         time:   [60.098 us 60.182 us 60.280 us]

tuple comb c1           time:   [24.335 us 24.357 us 24.383 us]
tuple comb c2           time:   [43.833 us 43.876 us 43.924 us]
tuple comb c3           time:   [282.67 us 282.98 us 283.31 us]
tuple comb c4           time:   [1.2475 ms 1.2496 ms 1.2525 ms]

array comb c1           time:   [64.303 us 64.325 us 64.349 us]
array comb c2           time:   [201.38 us 201.49 us 201.61 us]
array comb c3           time:   [120.58 us 120.96 us 121.41 us]
array comb c4           time:   [176.37 us 176.58 us 176.79 us]

Interesting notes: c1 and c2 for this version are much higher than I would have expected but c3 and c4 are much lower than the tuple alternatives

src/lib.rs Outdated Show resolved Hide resolved
Copy link
Member

@jswrenn jswrenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tremendous thanks for submitting this PR. I apologize in advance for how long it's going to take me to review this; I'm in the home stretch of my dissertation and that's eating up my time.

I think I'm amenable to increasing the MSRV for const-generics in a major version bump, but it'd be great if we could identify any other parts of the library that'd also benefit from const generics first.

src/adaptors/mod.rs Outdated Show resolved Hide resolved
@conradludgate
Copy link
Author

I think I'm amenable to increasing the MSRV for const-generics in a major version bump, but it'd be great if we could identify any other parts of the library that'd also benefit from const generics first.

Certainly. Perhaps it's worth opening a tracking issue and a feature branch for 0.11?

@conradludgate
Copy link
Author

For now, I'm not in a rush to get this merged in. For my own use cases, noting the performance details above, I ended up making a macro to implement the for loop strategy simply

for_combinations!([a, b, c] in slice {
	println!("{:?} {:?} {:?}", a, b, c);
}):

But I'm happy to look into the const generics major bump

@conradludgate
Copy link
Author

conradludgate commented May 25, 2021

I've updated the implementation quite a bit. I've used array_init for now to handle the safety of array initialisation. I've also implemented the lazy approach as discussed, I've also changed the combinations algorithm to support infinite iterators from #330

Updated benchmark times:

tuple comb for1         time:   [24.681 us 24.730 us 24.781 us]
tuple comb for2         time:   [27.828 us 27.894 us 28.000 us]
tuple comb for3         time:   [66.224 us 66.343 us 66.501 us]
tuple comb for4         time:   [78.554 us 78.648 us 78.766 us]

tuple comb c1           time:   [24.809 us 24.961 us 25.130 us]
tuple comb c2           time:   [43.905 us 43.985 us 44.081 us]
tuple comb c3           time:   [287.64 us 287.92 us 288.22 us]
tuple comb c4           time:   [1.2784 ms 1.2798 ms 1.2812 ms]

array comb c1           time:   [388.77 us 396.34 us 403.18 us]
array comb c2           time:   [306.41 us 315.74 us 327.54 us]
array comb c3           time:   [458.90 us 460.14 us 461.50 us]
array comb c4           time:   [484.57 us 485.31 us 486.09 us]

#[derive(Debug, Clone)]
pub struct ArrayCombinations<I: Iterator, const R: usize> {
iter: I,
buf: Vec<I::Item>,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to use an ArrayVec here, and so avoid heap allocations entirely?

Copy link
Member

@phimuemue phimuemue Sep 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, buf caches the elements from iter, so I think it is impossible to impose an upper bound on the length here.

However, and if my suspicion is correct, I am thinking if we should actually use LazyBuffer here - unsure about that.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, quite right.

Actually, my current use-case is to generate pairs of items from an iterator over a large array; so buffering all items is quite costly... in this case, my iterator implements Clone so I can instead hold R clones of the iterator and entirely avoiding dynamic/heap allocations:

struct ArrayCombinations<I: Iterator, const R: usize> {
    /// Each element of this array is an iterator that can be inspected for
    /// the corresponding element of the resulting combo.
    ///
    /// Peeking each iterator yields the last emitted value, such that
    /// invoking `next()` first advances the state before emitting the
    /// result.  For example, given `[1,2,3,4,5].array_combinations::<3>()`,
    /// the initial state of `iters` is such that peeking each iterator element
    /// will yield `[1, 2, 2]` (becoming `[1, 2, 3]` after `next()` is called).
    iters: [Peekable<I>; R],
}

impl<I, const R: usize> ArrayCombinations<I, R>
where
    I: Iterator + Clone,
    I::Item: Clone,
{
    pub fn new(iter: I) -> Self {
        let mut iters = ArrayVec::new();
        iters.push(iter.peekable());
        for idx in 1..R {
            let last = iters.last_mut().unwrap();
            if idx > 1 {
                last.next();
            }
            let last = last.clone();
            iters.push(last);
        }
        Self { iters: iters.into_inner().ok().unwrap() }
    }
}

impl<I, const R: usize> Iterator for ArrayCombinations<I, R>
where
    I: Iterator + Clone,
    I::Item: Clone,
{
    type Item = [I::Item; R];

    fn next(&mut self) -> Option<Self::Item> {
        'search: for idx in (0..R).rev() {
            self.iters[idx].next();
            if self.iters[idx].peek().is_some() {
                let mut clone = self.iters[idx].clone();
                for reset in self.iters[idx+1..].iter_mut() {
                    clone.next();
                    match clone.peek() {
                        Some(_) => *reset = clone.clone(),
                        None => continue 'search,
                    }
                }
                break;
            }
        }

        self.iters
            .iter_mut()
            .map(Peekable::peek)
            .map(Option::<&_>::cloned)
            .collect::<Option<ArrayVec<_, R>>>()
            .map(ArrayVec::into_inner)
            .map(Result::ok)
            .flatten()
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
const-generics Require Rust 1.51 or newer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants