diff --git a/Data/ByteString.hs b/Data/ByteString.hs
index fc687f0e8..3951092c4 100644
--- a/Data/ByteString.hs
+++ b/Data/ByteString.hs
@@ -47,6 +47,22 @@ module Data.ByteString (
         ByteString,
         StrictByteString,
 
+        -- ** Heap fragmentation
+        -- | With GHC, the 'ByteString' representation uses /pinned memory/,
+        -- meaning it cannot be moved by GC. While this is ideal for use with
+        -- the foreign function interface and is usually efficient, this
+        -- representation may lead to issues with heap fragmentation and wasted
+        -- space if the program selectively retains a fraction of many small
+        -- 'ByteString's, keeping them live in memory over long durations.
+        --
+        -- While 'ByteString' is indispensable when working with large blobs of
+        -- data and especially when interfacing with native C libraries, be sure
+        -- to also check the 'Data.ByteString.Short.ShortByteString' type.
+        -- As a type backed by /unpinned/ memory, @ShortByteString@ behaves
+        -- similarly to @Text@ (from the @text@ package) on the heap, completely
+        -- avoids fragmentation issues, and in many use-cases may better suit
+        -- your bytestring-storage needs.
+
         -- * Introducing and eliminating 'ByteString's
         empty,
         singleton,
diff --git a/bytestring.cabal b/bytestring.cabal
index 027ce2e9b..5371ac6d5 100644
--- a/bytestring.cabal
+++ b/bytestring.cabal
@@ -30,7 +30,8 @@ Description:
     .
     There is also a 'ShortByteString' type which has a lower memory overhead
     and can be converted to or from a 'ByteString'. It is suitable for keeping
-    many short strings in memory.
+    many short strings in memory, especially long-term, without incurring any
+    possible heap fragmentation costs.
     .
     'ByteString's are not designed for Unicode. For Unicode strings you should
     use the 'Text' type from the @text@ package.
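
A minimal sketch of the usage pattern the new Haddock note recommends, not part of the patch itself. It assumes a program that reads a larger strict 'ByteString', keeps only a few small pieces of it for a long time, and stores those pieces as 'ShortByteString' via `toShort`. The helper name `retainKeys` and the `key:` prefix are hypothetical and chosen purely for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.ByteString.Short (ShortByteString)
import qualified Data.ByteString.Short as SBS

-- | Hypothetical helper: select the lines that start with "key:" and
-- copy each one into a 'ShortByteString'. 'SBS.toShort' copies the bytes,
-- so the retained values do not reference the input buffer.
retainKeys :: B.ByteString -> [ShortByteString]
retainKeys input =
  [ SBS.toShort line
  | line <- BC.lines input
  , "key:" `B.isPrefixOf` line
  ]

main :: IO ()
main = do
  let input = BC.unlines ["key:alpha", "noise", "key:beta"]
      kept  = retainKeys input
  -- The long-lived values are the ShortByteStrings in 'kept'.
  print (length kept)
  -- Convert back to a strict ByteString only where an API requires one.
  print (map SBS.fromShort kept)
```

Both `toShort` and `fromShort` copy the data, so the conversion is best done once, at the point where a value becomes long-lived, rather than repeatedly in a hot path.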