Deepseek 671B unable to run locally (Flatpak) #510
Sorry, I just noticed there was a debugger function. Please refer to the log below.

Couldn't find '/home/user/.ollama/id_ed25519'. Generating new private key.
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIObIEWaCEq49QSa3EgMEFudE9WqAhyBh9rfrPK6Zt/XX
2025/02/01 15:35:36 routes.go:1187: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11435 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/user/.var/app/com.jeffser.Alpaca/data/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
time=2025-02-01T15:35:36.462+08:00 level=INFO source=routes.go:1238 msg="Listening on 127.0.0.1:11435 (version 0.5.7)"
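For anyone reproducing this: the log shows the integrated instance listening on 127.0.0.1:11435, so it can be probed directly. A minimal check, assuming the standard Ollama HTTP API (the endpoint below comes from Ollama itself, not from Alpaca):

# Confirm the integrated Ollama instance is reachable and report its version
curl http://127.0.0.1:11435/api/version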
Can you tell us about the system configuration you have?
Sure. System specifications: see the issue description below. I also tested with DeepSeek 70B, which is about 40GB, and I can run it successfully: 20GB to 22GB of VRAM is used and the remainder overflows properly to my system RAM (~22GB).
Have you tried with Ollama directly?
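If it helps, a minimal way to test outside the Flatpak, assuming a system-wide ollama binary and reusing the model store path from the log above (the deepseek-r1:671b tag is a guess at the exact model used):

# Point a standalone Ollama at Alpaca's model directory
OLLAMA_MODELS=/home/user/.var/app/com.jeffser.Alpaca/data/.ollama/models ollama serve
# ...then, in a second terminal, try the model directly
ollama run deepseek-r1:671b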
This would interest me as well. A feature to disable CPU fallback (GPU only) or to force CPU-only usage (globally or per model) would be handy. Occasionally the app partially loads models into VRAM, then fails (until an app restart clears the VRAM) and switches to CPU, requiring repeated manual termination. (I have not tested the latest releases in that regard.)
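As a possible stopgap until such a setting exists: Ollama's API already accepts a num_gpu option per request, which should force CPU-only inference when set to 0. A sketch against the integrated instance's port from the log above (the model tag is an assumption):

# Request a generation with zero layers offloaded to the GPU (CPU-only)
curl http://127.0.0.1:11435/api/generate -d '{
  "model": "deepseek-r1:671b",
  "prompt": "hello",
  "options": { "num_gpu": 0 }
}'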
Hi,
I encountered the following error when trying to run DeepSeek 671B on my system.
user@fedora:~$ flatpak run com.jeffser.Alpaca
INFO [main.py | main] Alpaca version: 4.0.0
INFO [connection_handler.py | start] Starting Alpaca's Ollama instance...
INFO [connection_handler.py | start] Started Alpaca's Ollama instance
INFO [connection_handler.py | start] client version is 0.5.7
ERROR [window.py | run_message] ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Exception in thread Thread-5 (run_message):
Traceback (most recent call last):
File "/app/lib/python3.12/site-packages/urllib3/connectionpool.py", line 793, in urlopen
ERROR [window.py | generate_chat_title] ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/urllib3/connectionpool.py", line 537, in _make_request
response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/urllib3/connection.py", line 466, in getresponse
httplib_response = super().getresponse()
^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/http/client.py", line 1428, in getresponse
response.begin()
File "/usr/lib/python3.12/http/client.py", line 331, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/http/client.py", line 300, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/lib/python3.12/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/urllib3/connectionpool.py", line 847, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/urllib3/util/retry.py", line 470, in increment
raise reraise(type(error), error, _stacktrace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/urllib3/util/util.py", line 38, in reraise
raise value.with_traceback(tb)
File "/app/lib/python3.12/site-packages/urllib3/connectionpool.py", line 793, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/urllib3/connectionpool.py", line 537, in _make_request
response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/urllib3/connection.py", line 466, in getresponse
httplib_response = super().getresponse()
^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/http/client.py", line 1428, in getresponse
response.begin()
File "/usr/lib/python3.12/http/client.py", line 331, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/http/client.py", line 300, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/share/Alpaca/alpaca/window.py", line 670, in run_message
response = self.ollama_instance.request("POST", "api/chat", json.dumps(data), lambda data, message_element=message_element: message_element.update_message(data))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/share/Alpaca/alpaca/connection_handler.py", line 82, in request
response = requests.post(connection_url, headers=self.get_headers(True), data=data, stream=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/requests/api.py", line 115, in post
return request("post", url, data=data, json=json, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lib/python3.12/site-packages/requests/adapters.py", line 501, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
self.run()
File "/usr/lib/python3.12/threading.py", line 1012, in run
self._target(*self._args, **self._kwargs)
File "/app/share/Alpaca/alpaca/window.py", line 675, in run_message
raise Exception(e)
Exception: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
I am using the integrated Ollama instance, which was shown as Running. No changes or modifications were made in the Ollama Instance section.
System Specifications:
GPU: 4090
RAM: 768GB
OS: Fedora 41 Gnome
I tested with a smaller model (Qwen2 72B) and had no issue generating a response. This may be because it fits into my 4090 (99% utilization) without spilling over to system RAM, whereas DeepSeek 671B cannot fit.
Is there a way to disable loading models into VRAM and load them into system RAM only, to test this?
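One workaround I could try, assuming the Ollama FAQ's advice applies to the integrated instance (an invalid CUDA_VISIBLE_DEVICES value makes Ollama ignore the GPU) and that Flatpak's --env flag reaches it:

# Hide the GPU from the integrated Ollama so models load into system RAM only
flatpak run --env=CUDA_VISIBLE_DEVICES=-1 com.jeffser.Alpaca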