-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make fromListN functions good consumers #424
Conversation
b47cc9c
to
390d468
Compare
Looks like |
I'll drop support for 8.0 to deal with In PR 425, I've adopted the benchmark that you linked to, but my copy of it just measures the performance of the implementation that is currently in
After your change:
And then after your change but with GHC 9.8.2:
Maybe the improved performance only shows up when lists are much larger? My example set in the benchmark suite only has 1024 elements. |
If that's true, is it worth it to branch on the Int, providing the old implementation if it's sufficiently small? |
...in terms of list fusion.
390d468
to
39f7bf1
Compare
Thanks, rebased.
No, it's just that I benchmarked I've also updated the gist, and we are now faster for Array.
|
I'm wondering why the new |
As far as I can tell, this is due to GC. GHC must be traversing the large If I reduce n to 1000, the difference reduces.
Bumping n to 10^6 and collecting RTS stats for one iteration shows
So I think the PR is fine, nothing strange going on with the code. |
Now I'm seeing better performance:
Weird, GC should not be happening at all while the array of boxed values is being initialized. Nothing should be getting allocated (other than the array itself, but the PrimArray is the same size), so there is no yield point for the GC. We are just copying pointers from a source into a destination. Regardless, this is certainly an improvement. I'll poke around a little more to see if anything seems odd, and then I'll merge. Thanks! |
Oh, I was mistaken. There are allocations. Here's a bit of GHC Core from the benchmark suite:
We have to repeatedly allocate a closure at the first argument to the recursively defined Merging. |
Yes, that's the source of the allocations. Thanks for merging! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was a bit too late with my comments. I'll submit my review anyways, maybe it will be useful, but feel free to ignore it.
-- We want arrayFromListN to be a "good consumer" in list fusion, so we define | ||
-- the function using foldr and inline it to help fire fusion rules. | ||
-- If fusion occurs with a "good producer", it may reduce to a fold on some | ||
-- structure. In certain cases (such as for Data.Set) GHC is not be able to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is best to move this sentence ("In certain cases (such as for Data.Set) GHC is not be able to ...
") to a comment into the body of the function, since that is not relevant information for the user.
-- structure. In certain cases (such as for Data.Set) GHC is not be able to | |
-- structure. In certain cases (such as for Data.Set) GHC is not able to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which "user" do you mean? This comment block is not part of the Haddocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, you are right. I did not notice an empty new line on line number 589 between the haddock and the comment.
then return () | ||
else die "fromListN" "list length less than specified size" | ||
go !ix (x : xs) = if ix < n | ||
f x k = GHC.Exts.oneShot $ \ix# -> if I# ix# < n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewthad You didn't need to drop support for ghc-8.0 in #426, since oneShot
is available form ghc-prim
for a very long time, so it could have been imported from GHC.Magic
for all GHC versions, instead of relying on CPP. Especially since the test suite already depends on ghc-prim
. Seems like a very minor thing to drop support for a whole ghc version, but I personally not gonna cry over it 🥲
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I would've needed to add the dependency on ghc-prim back to primitive, and it was so nice to finally have it gone. Dropping GHC 8.0 has the additional advantage of letting me delete several shims in #427. I'll probably end up dropping support for all GHCs earlier than 8.6 soon.
Here's something which GHC optimizes well and it works with zero heap-allocations (other than the array). arrayFromFoldableN :: Foldable f => Int -> f a -> Array a
arrayFromFoldableN n xs = createArray n undefined $ \arr -> do
j <- ($ 0) $ runS $ flip foldMap xs $ \x -> S $ \i ->
if i < n
then do
writeArray arr i x
pure $! i+1
else die "fromListN" "list length greater than specified size"
if j == n
then pure ()
else die "fromListN" "list length less than specified size"
{-# INLINE arrayFromFoldableN #-}
-- Might want to use oneShot, but works well enough with Set
newtype S s = S { runS :: Int -> ST s Int }
instance Semigroup (S s) where
(<>) = coerce ((>=>) @(ST _) @Int @Int @Int)
instance Monoid (S s) where
mempty = S pure
----------
arrayFromSetFoldable :: S.Set a -> Array a
arrayFromSetFoldable s = arrayFromFoldableN (S.size s) s
The catch is that we're using This function is also fusion-friendly, because |
This is pretty clever. I believe that it should be possible to accomplish the same thing with
Setting I would take another PR with a |
A correction: it allocates
Yeah this does not work as well. It is just what we have after this PR, and it requires closures being allocated as you saw above.
Which is the Being able to use go Tip = mempty
go (Bin _ x l r) = go l <> f x <> go r becomes
And no closures necessary! (unlike with
In that case I can look into it. Would you keep both fromFoldMapN :: (forall m. Monoid m => (a -> m) -> f -> m) -> Int -> f -> Array a Which is perhaps a little intimidating but clearer in intent. It can also be used with monomorphic collections.
That's true, but there are unboxed tree-like structures ( |
...in terms of list fusion.
Closes #418.
Note: Compared to the proposed change in #418, this PR also uses
GHC.Exts.oneShot
. GHC doesn't need this help if the input is just a list, but I have found that it can improve performance after fusion reduces it to a foldr on another structure, such as a Set. Benchmarks showing this effect: gist