I'm running anon-link-entity-service with 6 hospitals, each contributing 1,000,000 patients. During the run, there is a long stretch that uses about 60% of the available CPU, followed by a long period (hours) when only about 10% of the available CPU is in use, followed by the error shown in the attached logs.
Below is the section of the log where the error occurs. The attached files contain more of the logs. I have the full logs, but they are very large (about 1 GB).
full-log-error-section.txt
run-log.txt
run-log-error-focus.txt
backend_1 | [debug ] Connecting to redis [entityservice.cache.connection] pid=8efc27085d66234af616468b4251613028f05fa792c02df9 port=26379 request=64fe99e6 rid=eee1624a7a5a42e277f9e16dee29f30ca1f90fe440d494b1 server=redis
backend_1 | [info ] LOG_FILE: Connecting to redis [entityservice.cache.connection] pid=8efc27085d66234af616468b4251613028f05fa792c02df9 request=64fe99e6 rid=eee1624a7a5a42e277f9e16dee29f30ca1f90fe440d494b1
backend_1 | [debug ] total comparisons: 10000000000000 [entityservice.views.run.status] pid=8efc27085d66234af616468b4251613028f05fa792c02df9 request=64fe99e6 rid=eee1624a7a5a42e277f9e16dee29f30ca1f90fe440d494b1
nginx_1 | [200] - 172.18.0.1 - "GET /api/v1/projects/8efc27085d66234af616468b4251613028f05fa792c02df9/runs/eee1624a7a5a42e277f9e16dee29f30ca1f90fe440d494b1/status HTTP/1.1" 335 920 374 0.014 "-" "python-requests/2.26.0" "-"
worker_a13_1 | [2021-12-30 20:44:28,832: DEBUG/ForkPoolWorker-2] [debug ] setting up tracing on task [entityservice.tasks] task_name=aggregate_comparisons
worker_a13_1 | [2021-12-30 20:44:28,853: DEBUG/ForkPoolWorker-2] [debug ] Aggregating result chunks from 33060 files, total size: 958067760 [entityservice.tasks] pid=8efc27085d66234af616468b4251613028f05fa792c02df9 run_id=eee1624a7a5a42e277f9e16dee29f30ca1f90fe440d494b1 task_name=aggregate_comparisons
worker_a13_1 | [2021-12-30 20:44:28,949: WARNING/ForkPoolWorker-2] [warning ] Task 33ed3bfa-0953-429b-a530-c2818051fc31 is retrying after a 'S3Error' exception [entityservice.tasks] pid=8efc27085d66234af616468b4251613028f05fa792c02df9 run_id=eee1624a7a5a42e277f9e16dee29f30ca1f90fe440d494b1 task_name=aggregate_comparisons
worker_a13_1 | [2021-12-30 20:44:28,952: WARNING/ForkPoolWorker-2] /usr/lib/python3.9/signal.py:60: RuntimeWarning: invalid signal number 32, please use valid_signals()
worker_a13_1 | sigs_set = _signal.pthread_sigmask(how, mask)
worker_a13_1 |
worker_a13_1 | [2021-12-30 20:44:28,952: WARNING/ForkPoolWorker-2] /usr/lib/python3.9/signal.py:60: RuntimeWarning: invalid signal number 33, please use valid_signals()
worker_a13_1 | sigs_set = _signal.pthread_sigmask(how, mask)
worker_a13_1 |
worker_a13_1 | [2021-12-30 20:44:28,952: WARNING/ForkPoolWorker-2] /usr/lib/python3.9/signal.py:60: RuntimeWarning: invalid signal number 34, please use valid_signals()
worker_a13_1 | sigs_set = _signal.pthread_sigmask(how, mask)
worker_a13_1 |
worker_a13_1 | [2021-12-30 20:44:28,995: INFO/MainProcess] [info ] An error occurred while processing task [entityservice.tasks] run_id=eee1624a7a5a42e277f9e16dee29f30ca1f90fe440d494b1 task_id=<Context: {'lang': 'py', 'task': 'entityservice.tasks.comparing.aggregate_comparisons', 'id': '33ed3bfa-0953-429b-a530-c2818051fc31', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': 'a95304d4-eedc-4712-b479-315c1a3b3714', 'parent_id': '2434d2de-235b-45a8-8bc0-488c92b5438a', 'argsrepr': "([[18129, 435100, 'similarity-scores/771dc665dafe5bf0cbd98d2e.bin'], [871, 20908, 'similarity-scores/31cae8ef35597b136be6e185.bin'], [811, 19468, 'similarity-scores/fa76d3d05a26b3817834267f.bin'], [860, 20644, 'similarity-scores/f1f37c32c300f7bd365055c9.bin'], [879, 21100, 'similarity-scores/81719e644f12b1a400797d82.bin'], [860, 20644, 'similarity-scores/69fb24c851caebe8ed81ae03.bin'], [896, 21508, 'similarity-scores/26e9cbdcf0abf33b8a20c0fb.bin'], [929, 22300, 'similarity-scores/f74670768d2f987b4fb1b01b.bin'], [908, 21796, 'similarity-scores/3fca1b757b3f154bf1424045.bin'], [883, 21196, 'similarity-scores/06e7f5024859f3403eba7796.bin'], [919, 22060, 'similarity-scores/2ffa682d3afe6b54fff78835.bin'], [920, 22084, 'similarity-scores/498ef5687d1dcab93127d8eb.bin'], [837, 20092, 'similarity-scores/7befc3448aaf0822a0224496.bin'], [887, 21292, 'similarity-scores/615b4d08d37990b461ac70e9.bin'], [836, 20068, 'similarity-scores/e91720668cf63cc0769491b9.bin'], [924, 22180, 'similarity-scores/3ff5c3d9704a5a97163e2cd5.bin...', ...],)", 'kwargsrepr': "{'project_id': '8efc27085d66234af616468b4251613028f05fa792c02df9', 'run_id': 'eee1624a7a5a42e277f9e16dee29f30ca1f90fe440d494b1', 'parent_span': {'uber-trace-id': 'e38e6235a8b07c59:5a1d8351c03ad888:1e07d0b84a4e637d:1'}}", 'origin': 'gen8@6dc4aae3ab70', 'ignore_result': True, 'redelivered': True, 'reply_to': 'fa078a18-bf93-3164-8de4-0665067672f7', 'correlation_id': 
'33ed3bfa-0953-429b-a530-c2818051fc31', 'hostname': 'celery@bec4c46dba22', 'delivery_info': {'exchange': '', 'routing_key': 'highmemory', 'priority': 0, 'redelivered': None}, 'args': [[[18129, 435100, 'similarity-scores/771dc665dafe5bf0cbd98d2e.bin'], [871, 20908, 'similarity-scores/31cae8ef35597b136be6e185.bin'], [811, 19468, 'similarity-scores/fa76d3d05a26b3817834267f.bin'], [860, 20644, 'similarity-scores/f1f37c32c300f7bd365055c9.bin'], [879, 21100, 'similarity-scores/81719e644f12b1a400797d82.bin'], [860, 20644, 'similarity-scores/69fb24c851caebe8ed81ae03.bin'], [896, 21508, 'similarity-scores/26e9cbdcf0abf33b8a20c0fb.bin'], [929, 22300, 'similarity-scores/f74670768d2f987b4fb1b01b.bin'], [908, 21796, 'similarity-scores/3fca1b757b3f154bf1424045.bin'], [883, 21196, 'similarity-scores/06e7f5024859f3403eba7796.bin'], [919, 22060, 'similarity-scores/2ffa682d3afe6b54fff78835.bin'], [920, 22084, 'similarity-scores/498ef5687d1dcab93127d8eb.bin'], [837, 20092, 'similarity-scores/7befc3448aaf0822a0224496.bin'], [887, 21292, 'similarity-scores/615b4d08d37990b461ac70e9.bin'], [836, 20068, 'similarity-scores/e91720668cf63cc0769491b9.bin'], [924, 22180, 'similarity-scores/3ff5c3d9704a5a97163e2cd5.bin'], [878, 21076, 'similarity-scores/4c7ead830a841a3366663e76.bin'], [858, 20596, 'similarity-scores/2175593b9f3fbac45767a875.bin'], [853, 20476, 'similarity-scores/2546fe61601f8ec3fa087fdb.bin'], [889, 21340, 'similarity-scores/5a86e1059f0ee9ca9033ad56.bin'], [934, 22420, 'similarity-scores/eb2d93507c788eaa15a39f8e.bin'], [920, 22084, 'similarity-scores/6ebe9e0973cbad97c975d78f.bin'], [893, 21436, 'similarity-scores/e0ce6c8af04fead1a414bfc2.bin'], [882, 21172, 'similarity-scores/b21a1505cfcb6952a209163c.bin'], [914, 21940, 'similarity-scores/75c013d872e16288db3ff2d3.bin'], [910, 21844, 'similarity-scores/6e15c5aa835fb218d6bf588c.bin'], [928, 22276, 'similarity-scores/4855fd93634871ad004dad20.bin'], [867, 20812, 'similarity-scores/5fbc3a270cfe4b173d886320.bin'], [841, 20188, 
'similarity-scores/d6bd858b6b5519a9392e0e17.bin'], [887, 21292, 'similarity-scores/d76e34f22dd656b373041df6.bin'], [948, 22756, 'similarity-scores/c3ac231c58efeaaf70057310.bin'], [908, 21796, 'similarity-scores/4227445c6b7c
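Since the full log is too large to attach, here is a minimal sketch of how a focused excerpt around the error could be cut from it without loading the whole file into memory. The function names, the default context sizes, and the search pattern are illustrative assumptions; this is not part of anon-link-entity-service.

```python
import re
from collections import deque
from typing import Iterable, Iterator


def error_context(lines: Iterable[str], pattern: str,
                  before: int = 5, after: int = 5) -> Iterator[str]:
    """Yield each line matching `pattern` with a few lines of context.

    Lines are consumed one at a time, so a multi-gigabyte log can be
    filtered with constant memory.
    """
    regex = re.compile(pattern)
    recent = deque(maxlen=before)  # last `before` lines not yet emitted
    remaining_after = 0            # trailing context lines still owed
    for line in lines:
        if regex.search(line):
            yield from recent      # leading context
            recent.clear()
            yield line
            remaining_after = after
        elif remaining_after > 0:
            yield line
            remaining_after -= 1
        else:
            recent.append(line)


def write_excerpt(src_path: str, dst_path: str, pattern: str) -> None:
    """Stream `src_path` and write the matching excerpt to `dst_path`."""
    # errors="replace" keeps the filter going if the log has stray bytes.
    with open(src_path, errors="replace") as src, open(dst_path, "w") as dst:
        dst.writelines(error_context(src, pattern))
```

For example, `write_excerpt("full.log", "excerpt.txt", r"S3Error")` would keep only the lines near the `S3Error` retry warning (both file names here are placeholders).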