Commit
Very simple loop optimization with a significant performance impact.

Microbenchmark results, modern x86-64:

buffer size | speed up
------------+---------
       1500 | 1.7x
         64 | 1.5x
          8 | 1.15x

Microbenchmark results, POWER7:

buffer size | speed up
------------+---------
       1500 | 5x
         64 | 3.3x
          8 | 1.13x

There is a lot of room for further improvement at the expense of code
complexity - aligned multibyte reads, LE/BE considerations,
architecture-specific optimizations, etc. This patch still keeps things
simple and readable.

Signed-off-by: Ladi Prosek <[email protected]>
Reviewed-by: Dmitry Fleytman <[email protected]>
Signed-off-by: Jason Wang <[email protected]>