multiprocTCPBase/multiprocUDPBase not working on a Mac #92

Open
tomerd opened this issue Apr 22, 2024 · 10 comments

Comments

tomerd commented Apr 22, 2024

multiprocTCPBase and multiprocUDPBase fail in the same way on a Mac

tracked it down to multiprocessCommon::_startAdmin where the response after starting the admin comes back null

...
self.transport.connectEndpoint(endpointPrep)

response = self.transport.run(None, MAX_ADMIN_STARTUP_DELAY)
^^

exact same setup works fine on linux, but this makes developing on a Mac difficult. triple checked nothing else is listening on the port, so not sure what is going on
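One way to double-check the "nothing else is listening" claim from Python itself is to try binding the port directly. This is a minimal sketch using only the standard library; the helper name `can_bind` is hypothetical, and port 7070 is taken from the startup log below. A failed bind only shows that the port is unavailable to this process, not why.

```python
import socket


def can_bind(port, host="", kind=socket.SOCK_STREAM):
    """Return True if a socket of the given kind can be bound to (host, port)."""
    with socket.socket(socket.AF_INET, kind) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Address already in use, permission denied, or similar.
            return False
        return True


if __name__ == "__main__":
    # Check both transports used by multiprocTCPBase and multiprocUDPBase.
    for label, kind in (("TCP", socket.SOCK_STREAM), ("UDP", socket.SOCK_DGRAM)):
        state = "free" if can_bind(7070, kind=kind) else "in use or blocked"
        print(label, 7070, state)
```

If both checks report the port as free and the admin still fails to start, the cause is probably something other than a port conflict.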

startup log:

2024-04-21 22:32:58,518 ++++ Actor System gen (3, 10) started, admin @ ActorAddr-(UDP|:7070)
2024-04-21 22:32:58,518 Thespian source: <redacted>/lib/python3.12/site-packages/thespian/__init__.py
None
2024-04-21 22:33:03,520 startup failed: ActorAddr-(UDP|:7070) is not a valid ActorSystem admin
2024-04-21 22:41:23,604 ++++ Actor System gen (3, 10) started, admin @ ActorAddr-(T|:7070)
2024-04-21 22:41:23,605 Thespian source: <redacted>/lib/python3.12/site-packages/thespian/__init__.py
None
2024-04-21 22:41:28,606 startup failed: ActorAddr-(T|:7070) is not a valid ActorSystem admin
Owner

kquick commented Apr 22, 2024

Thespian has worked previously on Mac systems, but I haven't had a Mac for several years, so it's possible there have been changes in the OS releases in the interim.

Can you provide the Mac OS release info, and also run $ python3 thespian/diagnose.py and provide the results?
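The OS and interpreter details can also be collected from Python and pasted alongside the diagnose.py output. This is a small sketch using only the standard library; the function name `environment_report` and the choice of fields are assumptions, not something Thespian requires.

```python
import platform


def environment_report():
    """Collect basic OS and interpreter details for a bug report."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        # mac_ver() returns an empty release string on non-macOS systems.
        "macos_release": platform.mac_ver()[0] or None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```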

Author

tomerd commented Apr 23, 2024

❯ python ./thespian/diagnose.py
Initiating diagnostics
   [......Python] : namespace(name='cpython', cache_tag='cpython-312', version=sys.version_info(major=3, minor=12, micro=3, releaselevel='final', serial=0), hexversion=51119088, _multiarch='darwin')
   [.........(t)] : sys.thread_info(name='pthread', lock='mutex+cond', version=None)
   [.........(p)] : darwin
   [........(mp)] : [<apple_certifi._vendor.wrapt.importer.ImportHookFinder object at 0x102cb8e90>, <_distutils_hack.DistutilsMetaFinder object at 0x102cb97c0>, <class '_frozen_importlib.BuiltinImporter'>, <class '_frozen_importlib.FrozenImporter'>, <class '_frozen_importlib_external.PathFinder'>]
# checking imports -> verified ok
# checking thespian internal system imports -> verified ok
# checking existing running actors was skipped - please install psutils python package to support this
# checking hostname -> verified ok
# checking fqdn -> verified ok
# checking addr info proto=UDP desc=default usage=0 -> verified ok
# checking addr info proto=UDP desc=default usage=passive -> verified ok
# checking addr info addr=tombp-5.local proto=UDP desc=hostname usage=0 -> verified ok
# checking addr info addr=tombp-5.local proto=UDP desc=hostname usage=passive -> verified ok
# checking addr info addr=1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa proto=UDP desc=fqdn usage=0 -> verified ok
# checking addr info addr=1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa proto=UDP desc=fqdn usage=passive -> verified ok
# checking addr info proto=TCP desc=default usage=0 -> verified ok
# checking addr info proto=TCP desc=default usage=passive -> verified ok
# checking addr info addr=tombp-5.local proto=TCP desc=hostname usage=0 -> verified ok
# checking addr info addr=tombp-5.local proto=TCP desc=hostname usage=passive -> verified ok
# checking addr info addr=1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa proto=TCP desc=fqdn usage=0 -> verified ok
# checking addr info addr=1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa proto=TCP desc=fqdn usage=passive -> verified ok
# checking IP addresses ... Got 6 IP addresses

    None
    127.0.0.1
    10.0.0.120
    localhost
    0.0.0.0
# checking IP addresses -> verified ok

Owner

kquick commented Apr 23, 2024

That indicates that diagnose.py did not detect any issues. Do any of the following work?

$ python3 examples/hellogoodbye.py
$ python3 examples/hellogoodbye.py multiprocQueueBase
$ python3 examples/hellogoodbye.py multiprocUDPBase
$ python3 examples/hellogoodbye.py multiprocTCPBase

These should be run from the top-level checkout of the thespian repository, where the thespian and examples directories exist. Either this directory must be in PYTHONPATH, or thespian must be installed so that import thespian works from python3.

Author

tomerd commented Apr 24, 2024

❯ python3 examples/hellogoodbye.py multiprocTCPBase

None
Traceback (most recent call last):
  File "/Users/tomerd/code/other/Thespian/examples/hellogoodbye.py", line 45, in <module>
    run_example(sys.argv[1] if len(sys.argv) > 1 else None)
  File "/Users/tomerd/code/other/Thespian/examples/hellogoodbye.py", line 33, in run_example
    asys = ActorSystem(systembase, logDefs=logcfg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tomerd/miniconda/envs/portola/lib/python3.12/site-packages/thespian/actors.py", line 637, in __init__
    systemBase = self._startupActorSys(
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tomerd/miniconda/envs/portola/lib/python3.12/site-packages/thespian/actors.py", line 678, in _startupActorSys
    systemBase = sbc(self, logDefs=logDefs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tomerd/miniconda/envs/portola/lib/python3.12/site-packages/thespian/system/multiprocTCPBase.py", line 28, in __init__
    super(ActorSystemBase, self).__init__(system, logDefs)
  File "/Users/tomerd/miniconda/envs/portola/lib/python3.12/site-packages/thespian/system/multiprocCommon.py", line 83, in __init__
    super(multiprocessCommon, self).__init__(system, logDefs)
  File "/Users/tomerd/miniconda/envs/portola/lib/python3.12/site-packages/thespian/system/systemBase.py", line 326, in __init__
    self._startAdmin(self.adminAddr,
  File "/Users/tomerd/miniconda/envs/portola/lib/python3.12/site-packages/thespian/system/multiprocCommon.py", line 116, in _startAdmin
    raise InvalidActorAddress(adminAddr,
thespian.actors.InvalidActorAddress: ActorAddr-(T|:1900) is not a valid ActorSystem admin

Author

tomerd commented Apr 24, 2024

the rest work okay

Owner

kquick commented Apr 28, 2024

This looks like everything is in reasonable shape for the Mac and that it's primarily just a port issue for the multiprocTCPBase. There are three things I can think of that might be occurring here:

  1. There is already something else running on port 1900. You can select a different port by modifying the "Admin Port" when starting the actor system (see https://thespianpy.com/doc/using#hH-a17e6c70-5592-4d06-b818-bd25350c4c53 and https://thespianpy.com/doc/using.html#hH-9d33a877-b4f0-4012-9510-442d81b0837c for more information).
  2. The multiprocTCPBase itself is persistent: it will normally stay running (i.e. a "daemon" or "system service") after the application that started it exits (see Base Persistence at https://thespianpy.com/doc/in_depth.html#hH-b2414e9c-4cec-46e7-8d53-80008f2c9498). A subsequent application issuing a startup can simply connect to that long-running multiprocTCPBase. However, any code loaded in the original base is still running in that service, including any errors, so if there was a failure, the process might still be running but be unable to support additional connections; you would use ps and kill or similar techniques to find and stop this process.
  3. Port 1900 is a relatively low-numbered port. Your system might disallow binding to a low-numbered port, or you may have some sort of firewall or virus protection (perhaps built into newer Mac OS versions) that is preventing this. There may be additional information in the thespian_diagnostics.log that was generated when you ran diagnose.py above. Try a higher-numbered port via the "Admin Port" discussed in the first item above.
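
As a rough illustration of items 1 and 2, the sketch below starts a multiprocTCPBase on a non-default admin port and then shuts it down explicitly so that no persistent admin is left behind. The port number 19000 and the helper names are hypothetical; "Admin Port" is the capability named above, passed through the `capabilities` argument of `ActorSystem`. Thespian is imported inside the function so that the helper that builds the capabilities can be used without it.

```python
def admin_capabilities(port):
    """Build the capabilities dict that selects a custom admin port."""
    return {"Admin Port": port}


def start_and_stop(port=19000, base="multiprocTCPBase"):
    """Start an actor system on the given admin port, then shut it down.

    Requires thespian to be installed. 19000 is an arbitrary high port
    chosen for illustration only.
    """
    from thespian.actors import ActorSystem

    asys = ActorSystem(base, capabilities=admin_capabilities(port))
    try:
        print("actor system started with admin port", port)
    finally:
        # Stop the persistent admin so a later run starts from a clean state.
        asys.shutdown()


if __name__ == "__main__":
    start_and_stop()
```

If startup still fails with InvalidActorAddress on a high port that is known to be free, that would point away from items 1 and 3.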

Let me know if one of these seems to fix the issue or if you are still having problems.

Author

tomerd commented Apr 29, 2024

thanks @kquick. I actually checked all three several times before reporting the issue. it consistently does not work on macOS, even with nothing else on the port. I suspect some other macOS behavior is getting in the way. I am working with docker for now to work around this issue

skunath commented Aug 9, 2024

I can confirm that I have seen this as well at least for python 3.11 on MacOS Sonoma. I ended up playing around with the gatekeeper settings as I assumed there might be some odd security issue related to it ( this seemed interesting https://medium.com/@ansonliao.xiao/how-to-enable-openanywhere-security-option-in-mac-09e1570aa9ac ). I also ended up trying with python 3.9 and then installing python 3.12. In both 3.9 and 3.12 it seems like I can get thespian to launch correctly with multiprocTCPBase.

Not sure if adjusting gatekeeper was the solution, but it seems to be working.

Owner

kquick commented Aug 9, 2024 via email

Author

tomerd commented Aug 12, 2024

interesting. tried python 3.12 and disabling gatekeeper without success. @skunath do you recall the exact steps you took?
