Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potentially buffer overflow in make_block_q4_0x4 #1094

Open
ilhamsyahids opened this issue Feb 2, 2025 · 5 comments · May be fixed by #1095
Open

Potentially buffer overflow in make_block_q4_0x4 #1094

ilhamsyahids opened this issue Feb 2, 2025 · 5 comments · May be fixed by #1095

Comments

@ilhamsyahids
Copy link

I'm building ggml on debian and got warning and note:

~/ggml/build$ cmake --build . --config Release -j 8
[  1%] Building C object src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
[  2%] Building CXX object examples/CMakeFiles/common.dir/common.cpp.o
[  3%] Building CXX object src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
[  5%] Building C object src/CMakeFiles/ggml-base.dir/ggml.c.o
[  6%] Building CXX object src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
[  6%] Building CXX object src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
[  7%] Building C object src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
[  8%] Building CXX object src/CMakeFiles/ggml-base.dir/gguf.cpp.o
[  9%] Linking CXX shared library libggml-base.so
[  9%] Built target ggml-base
[ 10%] Building CXX object src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.cpp.o
[ 11%] Building C object src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.c.o
[ 13%] Building CXX object src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu-aarch64.cpp.o
[ 15%] Building CXX object src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu-hbm.cpp.o
[ 15%] Building C object src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu-quants.c.o
[ 15%] Building CXX object src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu-traits.cpp.o
[ 16%] Building CXX object src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/amx.cpp.o
[ 17%] Building CXX object src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/mmq.cpp.o
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at /home/admin/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:3685:39:
/home/admin/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:3614:19: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]
 3614 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/admin/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
/home/admin/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:3685:20: note: at offset 72 into destination object ‘<anonymous>’ of size 72
 3685 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at /home/admin/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:3685:39:
/home/admin/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:3614:19: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]
 3614 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/admin/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
/home/admin/ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp:3685:20: note: at offset 104 into destination object ‘<anonymous>’ of size 72
 3685 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 18%] Linking CXX static library libcommon.a
[ 18%] Built target common
[ 19%] Linking CXX shared library libggml-cpu.so
[ 19%] Built target ggml-cpu
[ 20%] Building CXX object src/CMakeFiles/ggml.dir/ggml-backend-reg.cpp.o
[ 21%] Linking CXX shared library libggml.so
[ 21%] Built target ggml
...

it referencing this method:

static block_q4_0x4 make_block_q4_0x4(block_q4_0 * in, unsigned int blck_size_interleave) {
block_q4_0x4 out;
for (int i = 0; i < 4; i++) {
out.d[i] = in[i].d;
}
const int end = QK4_0 * 2 / blck_size_interleave;
if (blck_size_interleave == 8) {
const uint64_t xor_mask = 0x8888888888888888ULL;
for (int i = 0; i < end; ++i) {
int src_id = i % 4;
int src_offset = (i / 4) * blck_size_interleave;
int dst_offset = i * blck_size_interleave;
uint64_t elems;
// Using memcpy to avoid unaligned memory accesses
memcpy(&elems, &in[src_id].qs[src_offset], sizeof(uint64_t));
elems ^= xor_mask;
memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
}
} else if (blck_size_interleave == 4) {
const uint32_t xor_mask = 0x88888888;
for (int i = 0; i < end; ++i) {
int src_id = i % 4;
int src_offset = (i / 4) * blck_size_interleave;
int dst_offset = i * blck_size_interleave;
uint32_t elems;
memcpy(&elems, &in[src_id].qs[src_offset], sizeof(uint32_t));
elems ^= xor_mask;
memcpy(&out.qs[dst_offset], &elems, sizeof(uint32_t));
}
} else {
GGML_ASSERT(false);
}
return out;
}


this might be issue in building app that requires ggml in debian such as ollama

@slaren
Copy link
Collaborator

slaren commented Feb 2, 2025

What compiler or flags are you using? I do not see it with gcc 13.3.

@ilhamsyahids
Copy link
Author

ilhamsyahids commented Feb 2, 2025

I am using the default one when installing the compiler:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 12.2.0-14' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-12 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-offload-defaulted --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (Debian 12.2.0-14)

i tried to reinstall but it looks like the max version gcc on debian is 12.2.0:

$ sudo apt install gcc
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
gcc is already the newest version (4:12.2.0-3).

@slaren
Copy link
Collaborator

slaren commented Feb 3, 2025

The oldest gcc version that I can easily install in my system is 12.3, and it does not generate this warning. I suspect that this is a spurious warning, possibly fixed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106904.
This is performance sensitive code, so if removing this warning is absolutely necessary for you, it would be better to do so by disabling the warning in the compiler than by adding unnecessary and potentially expensive checks.

@ilhamsyahids
Copy link
Author

what OS are you using with? @slaren
on debian the latest stable is 12.2.0. the latest version is 12.4 but its on testing, i think thats why i cant install it:

$ sudo apt list gcc
Listing... Done
gcc/stable,now 4:12.2.0-3 amd64 [installed]

$ sudo apt install gcc=12.4.0
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Package gcc is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

E: Version '12.4.0' for 'gcc' was not found

@ilhamsyahids
Copy link
Author

As context, I try to install ollama that using ggml on debian, its stop on this step:

~/ollama$ make -j12
GOARCH=amd64 go build -buildmode=pie "-ldflags=-w -s \"-X=github.com/ollama/ollama/version.Version=0.5.7-0-ga420a45\"  " -trimpath -tags "avx" -o llama/build/linux-amd64/runners/cpu_avx/ollama_llama_server ./cmd/runner
GOARCH=amd64 go build -buildmode=pie "-ldflags=-w -s \"-X=github.com/ollama/ollama/version.Version=0.5.7-0-ga420a45\"  " -trimpath -tags "avx,avx2" -o llama/build/linux-amd64/runners/cpu_avx2/ollama_llama_server ./cmd/runner
# github.com/ollama/ollama/llama
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 72 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 104 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# github.com/ollama/ollama/llama
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 72 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 88 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 104 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 120 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
GOARCH=amd64 go build -buildmode=pie "-ldflags=-w -s \"-X=github.com/ollama/ollama/version.Version=0.5.7-0-ga420a45\"  " -trimpath  -o ollama .
# github.com/ollama/ollama/llama
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 72 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 88 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 104 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
    inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
 3640 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 120 into destination object ‘<anonymous>’ of size 72
 3711 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants