Problem with chrome driver on Raspberry Pi #358

Closed
mxfilerelatedcache opened this issue Apr 9, 2023 · 12 comments

Comments

@mxfilerelatedcache

I'm trying to set up flathunter on my Raspberry Pi 4 running Debian GNU/Linux 11 (bullseye), but I get an error when running flathunt.py. It seems to be related to the Chromium driver and #192. I have the newest Chromium driver (109.0.5414.112-rpt2). The error looks like this:

simon@simonspi:~/Documents/flathunter $ pipenv run python flathunt.py
[2023/04/09 13:42:28|config.py |INFO ]: Using config path /home/simon/Documents/flathunter/config.yaml
[2023/04/09 13:42:28|chrome_wrapper.py |INFO ]: Initializing Chrome WebDriver for crawler...
[2023/04/09 13:42:30|patcher.py |INFO ]: patching driver executable /home/simon/.local/share/undetected_chromedriver/undetected_chromedriver
Traceback (most recent call last):
  File "/home/simon/Documents/flathunter/flathunt.py", line 118, in <module>
    main()
  File "/home/simon/Documents/flathunter/flathunt.py", line 114, in main
    launch_flat_hunt(config, heartbeat)
  File "/home/simon/Documents/flathunter/flathunt.py", line 36, in launch_flat_hunt
    hunter.hunt_flats()
  File "/home/simon/Documents/flathunter/flathunter/hunter.py", line 56, in hunt_flats
    for expose in processor_chain.process(self.crawl_for_exposes(max_pages)):
  File "/home/simon/Documents/flathunter/flathunter/hunter.py", line 35, in crawl_for_exposes
    return chain(*[try_crawl(searcher, url, max_pages)
  File "/home/simon/Documents/flathunter/flathunter/hunter.py", line 35, in <listcomp>
    return chain(*[try_crawl(searcher, url, max_pages)
  File "/home/simon/Documents/flathunter/flathunter/hunter.py", line 27, in try_crawl
    return searcher.crawl(url, max_pages)
  File "/home/simon/Documents/flathunter/flathunter/abstract_crawler.py", line 150, in crawl
    return self.get_results(url, max_pages)
  File "/home/simon/Documents/flathunter/flathunter/crawler/immobilienscout.py", line 90, in get_results
    soup = self.get_page(search_url, self.get_driver(), page_no)
  File "/home/simon/Documents/flathunter/flathunter/crawler/immobilienscout.py", line 65, in get_driver
    self.driver = get_chrome_driver(driver_arguments)
  File "/home/simon/Documents/flathunter/flathunter/chrome_wrapper.py", line 47, in get_chrome_driver
    driver = uc.Chrome(version_main=chrome_version, options=chrome_options) # pylint: disable=no-member
  File "/home/simon/.local/share/virtualenvs/flathunter-QaHh8Mme/lib/python3.9/site-packages/undetected_chromedriver/__init__.py", line 441, in __init__
    super(Chrome, self).__init__(
  File "/home/simon/.local/share/virtualenvs/flathunter-QaHh8Mme/lib/python3.9/site-packages/selenium/webdriver/chrome/webdriver.py", line 80, in __init__
    super().__init__(
  File "/home/simon/.local/share/virtualenvs/flathunter-QaHh8Mme/lib/python3.9/site-packages/selenium/webdriver/chromium/webdriver.py", line 101, in __init__
    self.service.start()
  File "/home/simon/.local/share/virtualenvs/flathunter-QaHh8Mme/lib/python3.9/site-packages/selenium/webdriver/common/service.py", line 90, in start
    self._start_process(self.path)
  File "/home/simon/.local/share/virtualenvs/flathunter-QaHh8Mme/lib/python3.9/site-packages/selenium/webdriver/common/service.py", line 203, in _start_process
    self.process = subprocess.Popen(
  File "/usr/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.9/subprocess.py", line 1823, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 8] Exec format error: '/home/simon/.local/share/undetected_chromedriver/undetected_chromedriver'
[2023/04/09 13:42:30|__init__.py |INFO ]: ensuring close

I tried the steps described by @Ralfons-06 in #192, but it seems the code changed so I'm unsure how to proceed. Any ideas? Thanks!

@codders

codders commented Apr 9, 2023 via email

@mxfilerelatedcache
Author

Hey, thanks for the quick reply. uname -a says Linux simonspi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux. I'm not sure how to get the output of file for the binary, but I did some research and the problem seems to be that there is no chromedriver build for ARM64. The workaround explained here seems promising, but it will take me a while to wrap my head around it.
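
For reference, the architecture a binary was built for is recorded in its ELF header, which is roughly the information the file command reports. The following is a minimal sketch of reading it by hand; elf_machine and ELF_MACHINES are illustrative names, not part of flathunter or undetected-chromedriver, and only a few common architectures are listed.

```python
import struct
import sys

# A few e_machine values from the ELF specification (not an exhaustive list).
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Name the CPU architecture recorded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not-elf"
    # Byte 5 (EI_DATA) is 1 for little-endian and 2 for big-endian files.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 of the header.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown({machine})")


if __name__ == "__main__":
    # Pass one or more binaries to inspect, e.g. the driver path from the traceback.
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, elf_machine(handle.read(20)))
```

If the patched driver reports a different architecture than the one uname shows, that would be consistent with the Exec format error in the traceback.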

@navels

navels commented Apr 9, 2023

The latest Pi OS has a 64-bit kernel and a 32-bit OS, which throws off any software installer using uname to determine host architecture. See https://forums.raspberrypi.com/viewtopic.php?t=344246 for some discussion (among other random problems with the 6.1 kernel).

Anyway, I did this to downgrade my kernel to the previous, 32-bit version:

sudo rpi-update bdb151a

While I cannot guarantee that will fix your issue, it is probably a good place to start to get your system in a more predictable state. After a reboot, uname -a should show

Linux pi 5.15.84-v7l+ #1613 SMP Thu Jan 5 12:01:26 GMT 2023 armv7l GNU/Linux

This did fix similar problems I was having getting chromedriver working after I had previously upgraded the kernel to 6.1.
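
The kernel/userland mismatch described above can be checked from Python. This is a minimal sketch, assuming the Python interpreter was installed from the OS packages and so reflects the userland word size; arch_summary is a hypothetical helper name and the list of 64-bit machine names is not exhaustive.

```python
import platform
import struct


def arch_summary() -> dict:
    """Compare the kernel's machine type with this interpreter's word size."""
    # Same value that `uname -m` prints; it describes the running kernel.
    kernel_machine = platform.machine()
    # Pointer size of the running Python build, i.e. of the userland it belongs to.
    userland_bits = struct.calcsize("P") * 8
    kernel_is_64bit = kernel_machine in ("aarch64", "arm64", "x86_64", "AMD64")
    return {
        "kernel_machine": kernel_machine,
        "userland_bits": userland_bits,
        # True when a 64-bit kernel is running 32-bit programs.
        "mismatch": kernel_is_64bit and userland_bits == 32,
    }


if __name__ == "__main__":
    print(arch_summary())
```

A mismatch here suggests that an installer relying on uname alone may pick a binary for the wrong architecture.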

@mxfilerelatedcache
Author

Better late than never: I finally fixed the issue. The cause was that undetected-chromedriver is not built for ARM64 (the architecture my Pi runs on). I fixed it by manually changing the code in the [chrome_wrapper.py](https://github.com/flathunters/flathunter/blob/main/flathunter/chrome_wrapper.py) file.

I followed the instructions described here to download and patch the unofficial ARM64 undetected-chromedriver from electron, and then manually set this patched driver in the chrome_wrapper.py file, similar to (but not exactly as) described in #192. This finally worked, thanks @navels!

If someone comes across the same issue, let me know and I can share the changed code in the chrome_wrapper.py file.

@osharaki

Hey @mxfilerelatedcache 👋 Would be great if you could share the changes you made in chrome_wrapper.py 🙂

@heckhoff

Sharing your changes would help me out as well, thanks in advance @mxfilerelatedcache :)

@xeruun

xeruun commented Jun 19, 2024

I have the same problem. Would it be possible to share the modified file, @mxfilerelatedcache?
Thanks in advance 🙂

@mxfilerelatedcache
Author

Hey, sorry for the late response, was a bit down under. I'm at work right now but when I get home I'll try to look for my Pi and the respective chrome_wrapper.py!

@valnurat

Looking forward to seeing this. Thank you

@mxfilerelatedcache
Author

It's been a while, so I'm not sure about every place I changed; I'll provide the whole chrome_wrapper.py instead. I do remember that I manually added the driver path. Here is the file:

"""Chrome needs some special handling to work out where the correct
binary is, to attach the correct selenium chromedriver, and to set
the correct version number"""
import re
import subprocess
from typing import List
from sys import platform
import undetected_chromedriver as uc
from selenium.webdriver.chrome.service import Service

from flathunter.logging import logger
from flathunter.exceptions import ChromeNotFound

CHROME_VERSION_REGEXP = re.compile(r'.* (\d+\.\d+\.\d+\.\d+)( .*)?')
WINDOWS_CHROME_REG_PATH = r'HKEY_CURRENT_USER\Software\Google\Chrome\BLBeacon'
WINDOWS_CHROME_REG_REGEXP = re.compile(r'\s*version\s*REG_SZ\s*(\d+)\..*')
CHROME_BINARY_NAMES = ['google-chrome', 'chromium', 'chrome', 'chromium-browser',
                       '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome']

def get_command_output(args) -> List[str]:
    """Run a command and return stdout"""
    try:
        with subprocess.Popen(args,
                    stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                    universal_newlines=True) as process:
            if process.stdout is None:
                return []
            return process.stdout.readlines()
    except FileNotFoundError:
        return []

def get_chrome_version() -> int:
    """Determine the major version of the installed Chrome binary"""
    for binary_name in CHROME_BINARY_NAMES:
        try:
            version_output = get_command_output([binary_name, '--version'])
            if not version_output:
                continue
            match = CHROME_VERSION_REGEXP.match(version_output[0])
            if match is None:
                continue
            return int(match.group(1).split('.')[0])
        except FileNotFoundError:
            pass
    try:
        # on Windows, Chrome doesn't respond to --version, but we can find
        # the version in the registry
        output = get_command_output(
            ['reg', 'query', WINDOWS_CHROME_REG_PATH, '/v', 'version']
        )
        version_matches = (WINDOWS_CHROME_REG_REGEXP.match(l) for l in output)
        version_matches = [m for m in version_matches if m is not None]
        if version_matches:
            return int(version_matches[0].group(1))
    except FileNotFoundError:
        pass
    raise ChromeNotFound()

def get_chrome_driver(driver_arguments):
    """Configure Chrome WebDriver"""
    logger.info('Initializing Chrome WebDriver for crawler...')
    chrome_options = uc.ChromeOptions() # pylint: disable=no-member
    
    # manually configure browser (Selenium's Python options use binary_location)
    chrome_options.binary_location = "/usr/bin/chromium-browser"
    driver_path = "/home/simon/chromedriver"

    if platform == "darwin":
        chrome_options.add_argument("--headless")
    if driver_arguments is not None:
        for driver_argument in driver_arguments:
            chrome_options.add_argument(driver_argument)
    chrome_version = get_chrome_version()
    chrome_options.add_argument("--headless=new")
    #driver = uc.Chrome(version_main=chrome_version, options=chrome_options) # pylint: disable=no-member

    # manually configure chromedriver
    driver = uc.Chrome(
        driver_executable_path=driver_path,
        options=chrome_options,
        version_main=chrome_version)

    driver.execute_cdp_cmd(
        "Network.setUserAgentOverride",
        {
            "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                         "AppleWebKit/537.36 (KHTML, like Gecko) "
                         "Chrome/120.0.0.0 Safari/537.36"
        },
    )

    driver.execute_cdp_cmd('Network.setBlockedURLs',
        {"urls": ["https://api.geetest.com/get.*"]})
    driver.execute_cdp_cmd('Network.enable', {})
    return driver

@valnurat

valnurat commented Jun 21, 2024

Hi @mxfilerelatedcache

I'm not using flathunter, but I'm trying to write a scraper of my own. What I have so far works on Windows, but I have issues with UC on my Raspberry Pi, which runs Raspberry Pi OS. Would your solution fix the issue I describe here:
https://github.com/ultrafunkamsterdam/undetected-chromedriver/discussions/1925
If so, do you think you could explain from scratch how you got yours working?
Br

@osharaki

osharaki commented Jun 22, 2024

For anyone still interested, in chrome_wrapper.py, I needed to change this line from

driver = uc.Chrome(version_main=chrome_version, options=chrome_options) # pylint: disable=no-member

to

chrome_options.binary_location = "/usr/bin/chromium-browser"
driver = uc.Chrome(driver_executable_path='/usr/bin/chromedriver', options=chrome_options) # pylint: disable=no-member

Of course, make sure that driver_executable_path points to where your patched chromedriver is located. In my case, it's /usr/bin/.
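
To avoid hard-coding the path, the driver location could be resolved at runtime. This is a minimal sketch, not flathunter's actual behaviour: the CHROMEDRIVER_PATH environment variable and the resolve_driver_path helper are hypothetical names introduced only for illustration.

```python
import os
import shutil
from typing import Optional


def resolve_driver_path(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return the first candidate that is an existing, executable file."""
    candidates = [
        explicit_path,                         # a path given by the caller
        os.environ.get("CHROMEDRIVER_PATH"),   # hypothetical override variable
        shutil.which("chromedriver"),          # whatever is found on PATH
    ]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    print(resolve_driver_path("/usr/bin/chromedriver"))
```

The returned path could then be passed as driver_executable_path to uc.Chrome as in the snippet above. A None result means no usable driver was found and the patched chromedriver still needs to be installed.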


7 participants