Fix stride and vector width calculations.
As Selur and Julek identified in #1, my stride calculations were off when processing non-mod-32 video (such as 720x480, where the width is not a multiple of 32).
This should now be fixed in all filters. I also found a bug in my vector (SIMD) width calculations, which should be fixed as well.
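
For illustration only, here is a minimal Python sketch of the kind of arithmetic involved. It is not the actual filter code: the 32-byte row alignment, the 32-lane vector width, and the helper names `aligned_stride` and `split_row` are all assumptions made for the example.

```python
from typing import Tuple


def aligned_stride(width_px: int, bytes_per_sample: int, alignment: int = 32) -> int:
    """Round a row's byte length up to the next multiple of `alignment`.

    Hypothetical helper; the 32-byte default is an assumption.
    """
    row_bytes = width_px * bytes_per_sample
    return (row_bytes + alignment - 1) // alignment * alignment


def split_row(width_px: int, lanes: int) -> Tuple[int, int]:
    """Split a row into samples covered by full vectors and a leftover tail.

    Hypothetical helper; `lanes` is the number of samples per vector.
    """
    vector_part = width_px - width_px % lanes
    return vector_part, width_px - vector_part


if __name__ == "__main__":
    # 720 is not a multiple of 32, so the row is padded and a tail remains.
    print(aligned_stride(720, 1))  # 736
    print(split_row(720, 32))      # (704, 16)
```

With a 720-sample row, a loop that assumes the width divides evenly by the vector width would either skip the last 16 samples or run past the end of the row, which is why the tail needs separate handling.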