Potentially buffer overflow in make_block_q4_0x4
#1094
What compiler or flags are you using? I do not see it with gcc 13.3.
I am using the default compiler that the distribution installs:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 12.2.0-14' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-12 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-offload-defaulted --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (Debian 12.2.0-14)
I tried to reinstall it, but 12.2.0 appears to be the newest gcc version available on Debian:
$ sudo apt install gcc
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
gcc is already the newest version (4:12.2.0-3).
The oldest gcc version that I can easily install on my system is 12.3, and it does not generate this warning. I suspect that this is a spurious warning, possibly fixed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106904.
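If the warning is indeed spurious, one way to experiment with silencing it locally, rather than for the whole file, is a GCC diagnostic pragma around the flagged copy. The sketch below is not the ggml source: `BlockSketch`, `BlockX4Sketch`, and `interleave8` are hypothetical stand-ins with assumed layouts, and the loop only mirrors the `memcpy` pattern visible in the log. Whether the pragma takes effect for a warning issued after inlining may depend on the GCC version, so it is worth confirming on the affected compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for the ggml block types; the layouts are assumptions.
struct BlockSketch   { uint16_t d;    uint8_t qs[16]; };
struct BlockX4Sketch { uint16_t d[4]; uint8_t qs[64]; };

// Interleaves the quant bytes of four source blocks in 8-byte chunks.
static BlockX4Sketch interleave8(const BlockSketch * in) {
    BlockX4Sketch out;
    for (int i = 0; i < 4; ++i) {
        out.d[i] = in[i].d;
    }

    for (int i = 0; i < 8; ++i) {
        const int src_id     = i % 4;        // which source block to read
        const int src_offset = (i / 4) * 8;  // first or second half of its qs
        const int dst_offset = i * 8;        // at most 56

        // Runtime statement of the bound the compiler fails to prove.
        assert(dst_offset + sizeof(uint64_t) <= sizeof(out.qs));

        uint64_t elems;
        std::memcpy(&elems, &in[src_id].qs[src_offset], sizeof(elems));

        // Suppress only -Wstringop-overflow, and only for this one copy.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-overflow"
#endif
        std::memcpy(&out.qs[dst_offset], &elems, sizeof(elems));
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic pop
#endif
    }
    return out;
}
```

Calling `interleave8` on an array of four `BlockSketch` values returns one `BlockX4Sketch`. The `push`/`pop` pair keeps the warning active for the rest of the translation unit, so a genuine overflow elsewhere would still be reported.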
@slaren, what OS are you using?
$ sudo apt list gcc
Listing... Done
gcc/stable,now 4:12.2.0-3 amd64 [installed]
$ sudo apt install gcc=12.4.0
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Package gcc is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
E: Version '12.4.0' for 'gcc' was not found
For context, I am trying to install ollama, which uses ggml, on Debian. The build stops at this step:
~/ollama$ make -j12
GOARCH=amd64 go build -buildmode=pie "-ldflags=-w -s \"-X=github.com/ollama/ollama/version.Version=0.5.7-0-ga420a45\" " -trimpath -tags "avx" -o llama/build/linux-amd64/runners/cpu_avx/ollama_llama_server ./cmd/runner
GOARCH=amd64 go build -buildmode=pie "-ldflags=-w -s \"-X=github.com/ollama/ollama/version.Version=0.5.7-0-ga420a45\" " -trimpath -tags "avx,avx2" -o llama/build/linux-amd64/runners/cpu_avx2/ollama_llama_server ./cmd/runner
# github.com/ollama/ollama/llama
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 72 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 104 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# github.com/ollama/ollama/llama
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 72 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 88 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 104 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 120 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
GOARCH=amd64 go build -buildmode=pie "-ldflags=-w -s \"-X=github.com/ollama/ollama/version.Version=0.5.7-0-ga420a45\" " -trimpath -o ollama .
# github.com/ollama/ollama/llama
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 72 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 88 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 104 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)’,
inlined from ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’ at ggml-cpu-aarch64.cpp:3711:39:
ggml-cpu-aarch64.cpp:3640:19: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
3640 | memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function ‘int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)’:
ggml-cpu-aarch64.cpp:3711:20: note: at offset 120 into destination object ‘<anonymous>’ of size 72
3711 | *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
| ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
I'm building ggml on Debian and got a warning and a note:
They refer to this function:
ggml/src/ggml-cpu/ggml-cpu-aarch64.cpp
Lines 3594 to 3634 in 475e012
This might be an issue when building applications that depend on ggml on Debian, such as ollama.
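The offsets in the note can be checked by hand. The sketch below assumes a 72-byte layout (four 2-byte scales followed by 64 quant bytes), inferred only from the "size 72" reported by the compiler; `block_q4_0x4_sketch`, `kQuantBytes`, and `max_write_end` are illustrative names, not ggml identifiers. Under that assumption, copies of `interleave` bytes at `dst_offset = i * interleave` end exactly at byte 72, so a write starting at offset 72 or later would not come from the scalar loop as written.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed layout, modeled on the 72-byte destination object in the warning.
constexpr std::size_t kQuantBytes = 64;

struct block_q4_0x4_sketch {
    uint16_t d[4];             // four 2-byte scales (8 bytes)
    uint8_t  qs[kQuantBytes];  // packed 4-bit quants of four blocks
};
static_assert(sizeof(block_q4_0x4_sketch) == 72,
              "matches the 'size 72' reported in the note");

// One-past-the-end byte offset, relative to the start of the struct, reached by
// kQuantBytes / interleave copies of `interleave` bytes each into qs[].
constexpr std::size_t max_write_end(std::size_t interleave) {
    return offsetof(block_q4_0x4_sketch, qs)            // scales come first
         + (kQuantBytes / interleave - 1) * interleave  // last dst_offset
         + interleave;                                  // size of the last copy
}

// The 8-byte copies seen in the log (sizeof(uint64_t)) stay inside the object.
static_assert(max_write_end(8) == sizeof(block_q4_0x4_sketch),
              "last 8-byte copy ends at the end of the struct");
```

Since the last copy ends at the object boundary rather than past it, the reported offsets of 72, 88, 104, and 120 are consistent with the compiler reasoning about a widened store that the source loop never performs, which would fit the suspicion that the warning is spurious.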