net: optimize checksum computation

Very simple loop optimization with a significant performance impact. Microbenchmark results, modern x86-64: buffer size | speed up ------------+--------- 1500 | 1.7x 64 | 1.5x 8 | 1.15x Microbenchmark results, POWER7: buffer size | speed up ------------+--------- 1500 | 5x 64 | 3.3x 8 | 1.13x There is a lot of room for further improvement at the expense of code complexity - aligned multibyte reads, LE/BE considerations, architecture-specific optimizations, etc. This patch still keeps things simple and readable. Signed-off-by: Ladi Prosek <[email protected]> Reviewed-by: Dmitry Fleytman <[email protected]> Signed-off-by: Jason Wang <[email protected]>
n64decomp · Jan 20, 2017 · d5aa3e6 · d5aa3e6
1 parent d4aa431
commit d5aa3e6
Showing 1 changed file with 13 additions and 8 deletions.
diff --git a/net/checksum.c b/net/checksum.c
@@ -22,17 +22,22 @@
 
 uint32_t net_checksum_add_cont(int len, uint8_t *buf, int seq)
 {
-    uint32_t sum = 0;
+    uint32_t sum1 = 0, sum2 = 0;
     int i;
 
-    for (i = seq; i < seq + len; i++) {
-        if (i & 1) {
-            sum += (uint32_t)buf[i - seq];
-        } else {
-            sum += (uint32_t)buf[i - seq] << 8;
-        }
+    for (i = 0; i < len - 1; i += 2) {
+        sum1 += (uint32_t)buf[i];
+        sum2 += (uint32_t)buf[i + 1];
+    }
+    if (i < len) {
+        sum1 += (uint32_t)buf[i];
+    }
+
+    if (seq & 1) {
+        return sum1 + (sum2 << 8);
+    } else {
+        return sum2 + (sum1 << 8);
     }
-    return sum;
 }
 
 uint16_t net_checksum_finish(uint32_t sum)